THE VNR
CONCISE
ENCYCLOPEDIA OF
MATHEMATICS

W. Gellert · S. Gottwald
M. Hellwich · H. Kästner · H. Küstner
Editors

K. A. Hirsch · H. Reichardt
Scientific Advisors

VAN NOSTRAND REINHOLD
New York


© VEB Bibliographisches Institut Leipzig, 1975 
Softcover reprint of the hardcover 1st edition 1975 


Mathematics at a Glance 

First American Edition 1977 

Second American Edition 1989 

Library of Congress Catalog Card Number 88-26992 


ISBN-13: 978-94-011-6984-4 eISBN-13: 978-94-011-6982-0
DOI: 10.1007/978-94-011-6982-0


All rights reserved. No part of this work covered by the copyright 
hereon may be reproduced or used in any form or by any means — 
graphic, electronic, or mechanical, including photocopying, 
recording, taping, or information storage and retrieval systems — 
without written permission of the publisher. 


Made in the German Democratic Republic. 


Published by Van Nostrand Reinhold 
115 Fifth Avenue 
New York, New York 10003 


Van Nostrand Reinhold International Company Limited 
11 New Fetter Lane 
London EC4P 4EE, England 


Van Nostrand Reinhold 
480 La Trobe Street 
Melbourne, Victoria 3000, Australia 


Macmillan of Canada 

Division of Canada Publishing Corporation 
164 Commander Boulevard 

Agincourt, Ontario M1S 3C7, Canada


16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1


Library of Congress Cataloging-in-Publication Data 


Main entry under title: 
The VNR concise encyclopedia of mathematics. 


First published under title: Mathematics at a glance. 
Includes index. 

1. Mathematics-Handbooks, manuals, etc. I. Gottwald, S.
II. Van Nostrand Reinhold Company.


QA40.V18 1989 510-dc19 88-26992


Contents


Introduction


I. Elementary mathematics

1. Fundamental operations on rational numbers
2. Higher arithmetical operations
3. Development of the number system
4. Algebraic equations
5. Functions
6. Percentages, interest and annuities
7. Plane geometry
8. Solid geometry
9. Descriptive geometry
10. Trigonometry
11. Plane trigonometry
12. Spherical trigonometry
13. Analytic geometry of the plane


II. Steps towards higher mathematics

14. Set theory
15. The elements of mathematical logic
16. Groups and fields
17. Linear algebra
18. Sequences, series, limits
19. Differential calculus
20. Integral calculus
21. Series of functions
22. Ordinary differential equations
23. Complex analysis
24. Analytic geometry of space
25. Projective geometry
26. Differential geometry, convex bodies, integral geometry
27. Probability theory and statistics
28. Calculus of errors, adjustment of data, approximation theory
29. Numerical analysis
30. Mathematical optimization


III. Brief reports on selected topics

31. Number theory
32. Algebraic geometry
33. Further algebraic structures
34. Topology
35. Measure theory
36. Graph theory
37. Potential theory and partial differential equations
38. Calculus of variations
39. Integral equations
40. Functional analysis
41. Foundation of geometry: Euclidean and non-Euclidean geometry
42. Foundations of mathematics
43. Game theory
44. Perturbation theory
45. The pocket calculator
46. Microcomputers


Preface 


It is commonplace that in our time science and technology cannot be mastered without the tools 
of mathematics; but the same applies to an ever growing extent to many domains of everyday life, 
not least owing to the spread of cybernetic methods and arguments. As a consequence, there is a 
wide demand for a survey of the results of mathematics, for an unconventional approach that would 
also make it possible to fill gaps in one’s knowledge. We do not think that a mere juxtaposition of 
theorems or a collection of formulae would be suitable for this purpose, because this would overemphasize
the symbolic language of signs and letters rather than the mathematical idea, the only
thing that really matters. Our task was to describe mathematical interrelations as briefly and precisely 
as possible. In view of the overwhelming amount of material it goes without saying that we did not
just compile details from the numerous textbooks for individual branches: our aim was to ease
access to the specialist literature for as many readers as possible. Since well
over 700 000 copies of the German edition of this book have been sold, we hope to have achieved
our difficult goal.

Colours are used extensively to help the reader. Important definitions and groups of formulae 
are on a yellow background, examples on blue, and theorems on red. The course of more complicated
calculations is indicated by red arrows. Also, in the illustrations in the text colours show up
the essential features. Ample examples help to make general statements understandable. Frequently
the numerical calculations have been arranged separately so that a problem can be read as an
explanatory text, without reference to calculations, while the latter can be regarded as worked examples
with explicit details. Physical units, which occur in some examples, are given in the SI-system,
which is coming more and more into legal and practical use. Everyday examples are given in everyday 
units, both metric and others. 

A systematic subdivision of the material, many brief section headings, and tables are meant to 
provide the reader with quick and reliable orientation. The detailed index to the book gives easy
access to specific questions. 

In the plates at the end numerous photographs and colour plates help to make the material more 
vivid and provide interesting glimpses of the history of mathematics. 

We thank the authors of the various chapters, especially for acceding to our request for generally
understandable diction even at the risk of deviating from the usual terminology. Above all in the
brief reports on special topics many an author has found it difficult to be content with mere indications
about a topic in which he is an expert.

Our particular thanks are due to our advisors, Professor K. A. Hirsch, Queen Mary College, 
University of London, and Professor H. Reichardt, Section for Mathematics, Humboldt University 
of Berlin. They have worked untiringly for the improvement of the book and have helped to create 
a work which is a reliable source of information for every user and should convince everyone that 
mathematics is essentially a simple and learnable discipline. 


The Editors and the Publishers 


Plates


1 Archimedes
    Poster of the town of Syracuse (Italy)

2/3 Mathematics in school I/II
    Introduction of the number seven and revision exercises

4 Mathematics in industrial arts
    Surfaces of revolution in the design of a pottery set

5 Drawing instruments I
    Geometry sets

6 Drawing instruments II
    Slide rules

7 Drawing instruments III
    Rulers, protractors, and French curves

8 Graph papers
    Millimetre paper, doubly logarithmic paper, simply logarithmic paper, polar coordinate paper, triangular net paper, probability paper

9 From the earliest period of mathematics
    Clay vessels of the new stone age
    Early Egyptian surveying

10 Ancient Egyptian mathematics
    Original text of the Hau problem in Demotic writing and transcription of the same text into hieroglyphics
    Calculation of a frustum of a pyramid

11 Babylonian mathematics
    Cuneiform tablet with calculations of areas
    Section of the tablet above

12 Graeco-Roman mathematics
    The Elements of Euclid, first printed edition 1482
    Roman hand abacus

13 Ancient Chinese mathematics
    From a manuscript dated 1303
    Bamboo sticks to represent numbers
    Chinese slide rule (about 1600)

14 Ancient Hindu mathematics
    Mathematical-astronomical buildings of the 17th century
    Mathematical manuscript of the 16th century

15 Arabic mathematics
    Theorem of Pythagoras in an Arabic mathematical manuscript of the 14th century
    Arabic astrolabe

16 Mathematics in Europe, 15th to 17th century
    Triumph of the modern algorithm (digital calculation) over the ancient counter reckoning (abacus)
    The use of Jacob’s staff

17 Mathematics and the visual arts I
    Ancient Egyptian mural: catching fish and hunting birds in a papyrus thicket
    Painting by Melozzo da Forli (1438-1494): Pope Sixtus IV appoints Platina as Prefect of the Vatican

18 Mathematics and the visual arts II
    Proportions of the human body
    Drawings by Leonardo da Vinci and sketch by Albrecht Dürer

19 Mathematics and the visual arts III
    Melancholia, copper engraving by Albrecht Dürer

20 Geometric forms in architecture and technology I
    Egyptian pyramids near Giza
    Tower of city walls
    The old town hall of Leipzig

21 Geometric forms in architecture and technology II
    Modern water tower
    Cooling towers of a generating plant

22 Geometric forms in architecture and technology III
    Obelisk in the great temple of Amun at Karnak
    Wedge as a cleaving tool
    Hyperbolic paraboloid shells as roofs of an exhibition hall

23 Famous mathematicians of the 15th/16th century
    Regiomontanus, Simon Stevin, Albrecht Dürer, Niccolo Tartaglia, Geronimo Cardano, Jost Bürgi, Luca Pacioli

24 Famous mathematicians of the 16th century
    Title page of Robert Recorde’s ‘Algebra’
    Title page of Adam Ries’s ‘Rechnung auff der Linihen und Federn ...’
    A problem out of this book concerning the purchase of livestock

25 From old arithmetic books
    Conclusion of a business deal at a calculating desk
    Calculation of the capacity of a cask

26 Two libraries
    The mathematics room of the National and University Library in Prague
    Entrance to the Science Library of Erfurt (Boyneburg portal)

27 Old mathematical aids I
    Pedometer, 1741
    Slit bamboo as counting stick
    Tally stick

28 Old mathematical aids II
    Counters or markers for arithmetic and an elaborate box, 16th century

29 Old mathematical aids III
    Surveyor’s compass, about 1600

30 Old measures I
    Illustration of a rod, by juxtaposition of 16 feet
    16th century measuring rods with various

31
    Set of weights, Nuremberg 1588
    Hinged sun dial, ivory

32 Famous mathematicians of the 17th century I
    Title page of Descartes’ ‘Discours de la méthode’
    René Descartes

33 Famous mathematicians of the 17th century II
    François Vieta, John Napier, Galileo Galilei, Johannes Kepler, Buonaventura Cavalieri, Pierre de Fermat, James Gregory

34 Famous mathematicians of the 17th/18th century I
    Blaise Pascal
    Gottfried Wilhelm Leibniz
    Isaac Newton

35 Famous mathematicians of the 17th/18th century II
    Extract from a manuscript of Leibniz with the integral sign
    The mechanical calculator constructed by Pascal in 1642

36 Famous mathematicians of the 17th/18th century III
    Jakob Bernoulli
    Johann Bernoulli
    Daniel Bernoulli

37 Famous mathematicians of the 18th century I
    Page from a manuscript by Euler
    Leonhard Euler

38 Famous mathematicians of the 18th century II
    Brook Taylor, Moreau Maupertuis, Johann Heinrich Lambert, Joseph Louis Lagrange, Gaspard Monge, Adrien Marie Legendre, Jean Baptiste Joseph de Fourier

39 Famous mathematicians of the 19th century I
    Drawing by János Bolyai on non-Euclidean geometry
    Nikolai Ivanovich Lobachevskii

40 Famous mathematicians of the 19th century II
    Portrait of the young Gauss
    Gauss in his old age
    Gauss’s signature
    The University in Göttingen

41 Famous mathematicians of the 19th century III
    A page from Gauss’s scientific diary

42 Famous mathematicians of the 19th century IV
    Friedrich Wilhelm Bessel, Augustin Louis Cauchy, Jakob Steiner, Niels Henrik Abel, Peter Gustav Lejeune Dirichlet, Evariste Galois, Pafnuti Lvovich Chebyshev

43 Famous mathematicians of the 19th century V
    Carl Gustav Jacob Jacobi, Bernhard Riemann, Leopold Kronecker, Karl Weierstraß, Arthur Cayley, Sophus Lie, Sonya Kovalevskaya

44 Mathematical instruments I
    Instrument for drawing an integral curve of a given function or differential equation
    Instrument to evaluate the integral of a function whose graph is given

45 Mathematical instruments II
    Compensating polar planimeter with polar arm
    Compensating polar planimeter with polar carriage

46 Mathematical instruments III
    Precision pantograph
    Instrument for the measurement of rectangular coordinates or the drawing of points with given coordinates

47 Mathematical instruments IV
    Harmonic analyser
    Instrument to determine the tangent or normal to a curve whose graph is given

48 Famous mathematicians of the 19th/20th century I
    George Stokes, Richard Dedekind, Georg Frobenius, Georg Cantor, Henri Poincaré, Felix Klein, Emmy Noether

49 Famous mathematicians of the 19th/20th century II
    David Hilbert, Elie Joseph Cartan, Henri Léon Lebesgue, John von Neumann, Hermann Weyl, Jacques Hadamard, Stefan Banach

50 Surveying
    Signals for the observation of trigonometric nets
    Trigonometric point (TP)

51 Mathematical education I
    Work on a wall board
    Determination of an angle with a hand-made apparatus
    Giant slide rule for instructional purposes

52 Mathematical education II
    Computations on part of an exhaust system
    Geometrical constructions on the blackboard
    Application of Pythagoras’ theorem

53 Mathematical education III
    Models for pupils: Cube with surface and space diagonals; Prism decomposable into three pyramids of equal volume; Cylinder with sections; Sphere with plane sections; Sections of a right circular cone

54 Mirror images
    Negative and positive of a photograph
    Reflection in water
    Ship’s Diesel engine in a left- and right-hand version

55 Variational problems
    Formation of a minimal surface in a lobster pot
    Formation of a minimal surface by a soap film
    The path of the light ray is the solution of a minimal problem

56 Mathematical models
    Moebius strip
    A closed surface of genus 1
    Pseudosphere
    Surface representing the modulus of the function w = exp (1/z)


Index of mathematicians


Abel, Niels Henrik, 1802-1829
d'Alembert, Jean le Rond, 1717-1783
Apollonius of Perga, c. 262-190? B. C.
Archimedes, 287?-212 B. C.
Argand, Jean Robert, 1768-1822
Aristotle, 384-322 B. C.
Banach, Stefan, 1892-1945
Beltrami, Eugenio, 1835-1900
Bernoulli, Daniel, 1700-1782
Bernoulli, Jakob, 1654-1705
Bernoulli, Johann, 1667-1748
Bessel, Friedrich Wilhelm, 1784-1846
Bézout, Étienne, 1730-1783
Bhaskara, 1114-1185?
Birkhoff, George David, 1884-1944
Blaschke, Wilhelm, 1885-1962
Bolyai, Farkas, 1775-1856
Bolyai, János, 1802-1860
Bolzano, Bernard, 1781-1848
Bombelli, Rafael, 16th century
Brahmagupta, born 598
Briggs, Henry, 1561-1630
Brouwer, Luitzen Egbertus Jan, 1881-1966
Buffon, Georges Louis de, 1707-1788
Bürgi, Jost, 1552-1632
Burnside, William, 1852-1927
Cantor, Georg, 1845-1918
Carathéodory, Constantin, 1873-1950
Cardano, Geronimo, 1501-1576
Cartan, Elie Joseph, 1869-1951
Cartesius ↑ Descartes
Cauchy, Augustin Louis, 1789-1857


Cavalieri, Bonaventura, c. 1598-1647 

Cayley, Arthur, 1821-1895 

Ceva, Giovanni, 1647-1734 

Chebyshev, Pafnuti Lvovich, 1821-1894 

Clavius, Christoph, 1537-1612 

Cramer, Gabriel, 1704—1752 

Cusanus, Nicolaus, 1401-1464 

Dandelin, Pierre, 1794-1847 

Dedekind, Richard, 1831-1916 

de la Vallée-Poussin, Charles, 1866-1962

Descartes, René, 1596-1650 

Diophantos of Alexandria, c. 250 A. D.

Dirichlet, Peter Gustav Lejeune, 1805-1859

Dürer, Albrecht, 1471-1528

Eisenhart, Luther Pfahler, 1876-1965 

Enriques, Federigo, 1871-1946 

Eratosthenes of Kyrene, c. 276-194 B. C. 

Euclid of Alexandria, c. 450-380 B. C. 

Eudoxus, c. 408-355 B. C. 

Euler, Leonhard, 1707-1783 

Fermat, Pierre de, 1601-1665 

Ferrari, Ludovico, 1522-1565 

Ferro, Scipione del, c. 1465-1526 

Fibonacci ↑ Leonardo of Pisa

Fisher, Ronald Aylmer, 1890-1962 

Fourier, Jean Baptiste Joseph de, 
1768-1830 

Fraenkel, Abraham, 1891-1965 

Fredholm, Erik Ivar, 1866-1927 

Frege, Gottlob, 1848-1925 

Frobenius, Ferdinand Georg, 1849-1917 

Galilei, Galileo, 1564-1642 




Galois, Evariste, 1811-1832 

Gauß, Carl Friedrich, 1777-1855

Girard, Albert, 1595-1632 

Goldbach, Christian, 1690-1764 

Green, George, 1793-1841 

Gregory, James, 1638-1675 

Guldin, Paul, 1577-1643 

Gunter, Edmund, 1561-1626 

Hadamard, Jacques Salomon, 1865-1963

Hamilton, Sir William Rowan, 1805-1865 

Hankel, Hermann, 1839-1874 

Herbrand, Jacques, 1908-1931 

Hermite, Charles, 1822-1901 

Heron of Alexandria, c. 75 A. D. 

Hesse, Ludwig Otto, 1811-1874 

Hilbert, David, 1862-1943 

Hippasos of Metapontum, c. 450 B. C. 

Hippocrates of Chios, c. 440 B. C.

l'Hospital, Guillaume François Antoine Marquis de, 1661-1704

l'Huilier, Simon, 1750-1840

Huygens, Christiaan, 1629-1695 

Jacobi, Carl Gustav Jacob, 1804-1851 

Jordan, Marie Ennemond Camille, 1838-1922 

Kepler, Johannes, 1571-1630 

Klein, Felix, 1849-1925 

Kovalevski, Sonya, 1850-1891 

Kronecker, Leopold, 1823-1891 

Krull, Wolfgang Adolf Ludwig Helmuth, 
1899-1971 

Kummer, Ernst Eduard, 1810-1893 

Lagrange, Joseph Louis, 1736-1813 

Lambert, Johann Heinrich, 1728-1777 

Laplace, Pierre Simon de, 1749-1827 

Lasker, Emmanuel, 1868-1941 

Lebesgue, Henri Léon, 1875-1941 

Legendre, Adrien Marie, 1752-1833 

Leibniz, Gottfried Wilhelm, 1646-1716 

Leonardo da Vinci, 1452-1519 

Leonardo of Pisa, called Fibonacci, 1180?-1250?

Lie, Sophus, 1842-1899 

Lindemann, Ferdinand von, 1852-1939 

Liouville, Joseph, 1809-1882 

Lipschitz, Rudolf, 1832-1903 

Lobachevskii, Nikolai Iwanowich, 1792-1856 

Lullus, Raimundus, Lull, Ramon, c. 1235-1315 

Machin, John, 1685-1751 

MacLaurin, Colin, 1698-1746 

Maupertuis, Pierre Louis Moreau de, 1698-1759

Menelaus of Alexandria, c. 98 A. D. 

Minkowski, Hermann, 1864—1909 

Möbius, August Ferdinand, 1790-1868

Moivre, Abraham de, 1667-1754 

Monge, Gaspard, 1746-1818 

Morgan, Augustus de, 1806-1871 

Napier, Neper, John, 1550-1617 

Neumann, John von, 1903-1957 

Newton, Isaac, 1643-1727 

Noether, Emmy, 1882-1935 

Noether, Max, 1844-1921 

Oresme, Nicole, 1323?-1382 


Ostrogradskii, Michail Wassilyevich, 1801-1862 

Oughtred, William, 1574-1660 

Pacioli, Luca, 1445?-1514

Partridge, Seth, 1603-1686 

Pappus of Alexandria, 4. century 

Pascal, Blaise, 1623-1662 

Peano, Giuseppe, 1858-1932 

Pearson, Karl, 1857-1936 

Pell, John, 1610-1685 

Plato, 427-347? B. C. 

Plücker, Julius, 1801-1868

Poincaré, Henri, 1854-1912 

Poisson, Siméon Denis, 1781-1840 

Poncelet, Jean Victor, 1788-1867 

Poseidonius, c. 135-51 B. C. 

Proclus, c. 410-485 

Pythagoras of Samos, c. 580-496 B. C. 

Quetelet, Lambert Adolphe Jacques, 1796-1874 

Recorde, Robert, 1510?-1558 

Regiomontanus, Johannes, 1436-1476 

Riemann, Bernhard, 1826—1866 

Ries, Adam, 1492-1559 

Rolle, Michel, 1652-1719 

Rudolff, Christoph, c. 1500-1545 

Ruffini, Paolo, 1765-1822 

Russell, Bertrand, 1872-1970 

Rytz, David, 1801-1868 

Saccheri, Girolamo, 1667-1733 

Schmidt, Erhard, 1876-1959 

Schwarz, Hermann Amandus, 1843-1921 

Segre, Corrado, 1863-1924 

Severi, Francesco, 1879-1961 

Simpson, Thomas, 1710-1761 

Staudt, Carl Georg Christian von, 1798-1867 

Steiner, Jakob, 1796-1863 

Stevin, Simon, 1548-1620 

Stifel, Michael, 1487-1567 

Stirling, James, 1696-1770 

Stokes, George Gabriel, 1819-1903 

Tartaglia, Niccolo, originally Fontana Niccolo, 
c. 1500-1557 

Taylor, Brook, 1685-1731 

Thales of Miletus, c. 624—547 B. C. 

Theaitetus, 410?-368 B. C.

Theodoros of Cyrene, c. 390 B. C.

Tschirnhaus, Ehrenfried Walter Graf von, 
1651-1708 

Vieta ↑ Viète

Viète, François, 1540-1603

Vlacq, Adrien, c. 1600-1667 

Wallis, John, 1616-1703 

Waring, Edward, 1734-1798 

Weierstraß, Karl, 1815-1897

Wessel, Caspar, 1745-1818 

Weyl, Hermann, 1885-1955 

Whitehead, Alfred North, 1861-1947 

Widmann, Johann, born 1460 

Wingate, Edmund, 1593-1656 

Wittich, Paul, 1555-1587 

Wronski, Josef Maria, 1775-1853 

Zenodoros, c. 180 B. C. 

Zenon of Elea, 490-430 B. C. 

Zermelo, Ernst, 1871-1953 


Introduction 


The great achievements of technology in all its forms, which deeply influence the life of every 
human being, have led to a widespread recognition of the importance of mathematics: everybody 
knows, or at least believes, that without mathematics these achievements in their entirety could not 
have come about. Interest in mathematics has therefore grown steadily, and with it the need for 
information about this science. 

Now in many respects mathematics is an exceptional science, in particular as regards the presentation
of its problems and results. While in medicine, zoology, botany, geography and geology, or
in languages, history, astronomy, a scholar, fully equipped with the knowledge of his time, can 
explain to a layman the majority of his problems and results, perhaps even his methods or the 
fundamental principles of his special interests, in such a way that he succeeds in conveying an 
impression of the contents of this field, in present-day chemistry and physics this is far more difficult,
and in mathematics well-nigh impossible. Not only has the volume of results grown phenomenally,
but the problems are so difficult to treat and lie so deep that even mathematicians can have
no more than a superficial view of the whole of mathematics.

One tries to counteract the fragmentation of mathematics into many special branches by extracting 
as far as possible from various domains common features, which sometimes do not lie at all close 
to the surface, and by creating from them a new and even more abstract theory: in just this way new 
links are forged between at first sight widely diverging directions. This process can be regarded as a 
repeated abstraction: whereas the basic disciplines such as algebra and geometry have their origin 
in abstractions from everyday experience, one arrives at such a unifying theory by further abstractions,
for example, from algebra and geometry; and under certain circumstances such abstracting
processes can be repeatedly piled on top of one another. Here ‘abstract’ has to be understood in 
the literal meaning of the word as ‘removing’, as leaving aside everything inessential for the 
context in question or for a particular purpose; for example, ignoring colour in geometric figures, 
which may very well play a role in ornaments. 

From all this it follows that it is quite impossible to give a layman even a glimpse of the whole 
of contemporary mathematics. Here a layman is not only one whose knowledge is limited to the
normal contents of a school syllabus. Even a mathematician with a diploma or a B. Sc., even a 
teacher of mathematics, has to be regarded as a layman in many special branches. It is simply 
impossible to acquire specialized knowledge of all branches of mathematics in three or four years 
of study. Therefore this book cannot have the ambition of imparting knowledge in all special fields 
of mathematics; restriction is essential.

In its historical development mathematics first proceeded in quite a naive manner. It started out 
from the numbers 1, 2, 3, ... and from the intuitively obvious figures of geometry such as points, 
segments, lines, planes in space, angles, triangles, circles, etc.; gradually it ascended to more complex 
formations, with the realm of numbers and that of figures not developing as separate entities, but 
connected through the notion of measuring. It was in this development, progressing from the intuitively
simple and obvious to more complicated problems, that mathematics was built up, for
example, in Babylonia and Egypt; astonishing achievements were reached in astronomy, such as the 
prediction of lunar eclipses. But it was the Greeks who lifted mathematics to a completely new level 
of development when they felt compelled not always to forge ahead, but also to reflect: what is it 
that one does in pursuing mathematics? The result was that through them mathematics became a 
science in the present-day sense. On the one hand, they recognized that a proof consists in reducing
a mathematical proposition to other known facts by the simplest logical conclusions, supported 
and made convincing sufficiently often by evidence or experience. On the other hand, they realized 
that such a reduction process cannot go on indefinitely but only as far as certain simplest properties 
of numbers or figures, which appear secure by virtue of intuition or experience. 

In this way they compiled for the first time consciously a system of fundamental facts, for example, 
that there is precisely one straight line passing through two points, and they created the foundation 
of logic. Together these two features lead to a systematic build-up of geometry, rising from the simple 
to the complex. 

For a long time this Euclidean geometry, apart from a few minor supplements, remained the 
model of a science. However, no comparable attempt was made for about two thousand years to
treat algebra and later analysis in the same manner. The basic properties of the natural numbers
were something obvious for the Greeks, but questions of divisibility and problems concerning prime 
numbers were of interest to them. They knew how to manipulate common fractions, but they did 
not pursue the idea of introducing negative numbers. However, in connection with a right-angled 
isosceles triangle they stumbled on the fact that fractions are insufficient to describe the ratios of 
all quantities: they noticed that in such a triangle the ratio of side to base cannot be represented 
by a fraction. But from this they did not by any means draw the conclusion that the domain of 
fractions ought to be extended in such a way that this ratio, and as far as possible all other geometric 
ratios, could be described numerically in terms of the new numbers of the more extensive domain. 
They did precisely the opposite: they geometrized their algebra. True, this led to a theory that is 
equivalent to our theory of real numbers; but the geometrization gave rise to such complications 
that Greek mathematics ground to a halt. 
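
The observation about the right-angled isosceles triangle can be made explicit by the classical indirect argument. The following is only a sketch in modern notation, which is not that of the Greeks; the letters $p$, $q$, $r$ are introduced here purely for illustration.

```latex
Suppose the ratio of base to side were a fraction $p/q$ in lowest terms.
Then the triangle is similar to one with sides $q$ and base $p$, and
Pythagoras' theorem gives
\[
  p^2 = q^2 + q^2 = 2q^2 .
\]
Hence $p^2$ is even, so $p$ is even, say $p = 2r$, and
\[
  4r^2 = 2q^2 \quad\Longrightarrow\quad q^2 = 2r^2 ,
\]
so $q$ is even as well. This contradicts the assumption that $p/q$ is in
lowest terms; therefore no fraction represents the ratio.
```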

Centuries later the practical needs of astronomers and mariners urgently required trigonometric
calculations, which could only be mastered with the aid of tables of certain trigonometric functions.
Since observational values could only be measured with limited accuracy, it was sufficient to
give the quantities to be calculated approximately. This led gradually to the invention of terminating
decimals, which proved much more suitable for practical computations than the common fractions.
Most probably the conviction grew that the results would be the more exact the more decimal
places were used, and even that every preassigned accuracy can be achieved by using sufficiently many
decimal places. In the last analysis this approach grasps the very essence of the real numbers; indeed,
mathematicians no longer shied away from talking of decimal fractions with infinitely many
places. If this theory had been developed consistently, the result could have been an exact theory 
of the real numbers. 
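
The idea that every preassigned accuracy can be reached with sufficiently many decimal places may be illustrated by a small sketch in Python. It assumes, purely as an example, that the quantity to be approximated is the square root of 2; the function name `sqrt2_decimal_bounds` is hypothetical and chosen only for this illustration.

```python
import math


def sqrt2_decimal_bounds(places):
    """Return two terminating decimals (as strings) that enclose sqrt(2).

    The two values differ by exactly one unit in the last decimal place.
    Only exact integer arithmetic is used.
    """
    scale = 10 ** places
    # Largest integer n with n*n <= 2 * scale**2, i.e. n/scale <= sqrt(2).
    n = math.isqrt(2 * scale * scale)

    def fmt(k):
        whole, frac = divmod(k, scale)
        return f"{whole}.{frac:0{places}d}" if places else str(whole)

    return fmt(n), fmt(n + 1)


# Each additional decimal place narrows the enclosing interval tenfold.
for places in range(1, 6):
    lower, upper = sqrt2_decimal_bounds(places)
    print(places, lower, upper)
```

Every pair of bounds consists of terminating decimals, that is, of fractions; the number itself is never reached, only enclosed ever more tightly.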

An interesting example of fundamental significance shows how this notion in a slightly different 
form appears as early as in Archimedes’ work, when he tries to calculate the area of certain parts 
of the plane with curvilinear boundaries. First of all, in his famous exhaustion method he succeeded 
in calculating the area bounded by part of a parabola and one of its chords. It turned out that a 
certain ratio of areas of paramount importance was 1/3. But Archimedes did not succeed in finding 
a correspondingly simple result for the area of a circle. To solve the problem he would have had to 
calculate the number $\pi$. As we now know, he could not succeed, being only in possession of fractions;
he had to be satisfied with proving that the number $\pi$ lies between two fractions, namely
$3\tfrac{1}{7}$ and $3\tfrac{10}{71}$. For this purpose he calculated, by a repeated application of Pythagoras’ theorem,
the areas of the regular convex polygons with 96 sides inscribed in, and circumscribed to, the circle
and gave approximate values for them. Clearly ARCHIMEDES was aware that by taking the number
of sides and vertices sufficiently large he could include $\pi$ within ever narrower limits and even
calculate it with any prescribed accuracy. But this possibility of determining a number approximately 
with a prescribed accuracy by means of fractions is a characteristic feature of the real numbers. 
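
The enclosure of the circle between inscribed and circumscribed polygons can be imitated numerically. The sketch below is a modern reformulation under stated assumptions, not a reconstruction of Archimedes' own computation: it works with the half-perimeters of the polygons on a circle of radius 1, starts from the regular hexagon, and uses floating-point square roots where Archimedes used fractional estimates. The function name `archimedes_bounds` is hypothetical.

```python
import math


def archimedes_bounds(doublings=4):
    """Enclose pi between half-perimeters of regular polygons on a unit circle.

    Starts with the hexagon and doubles the number of sides `doublings`
    times; four doublings lead to the polygons with 96 sides.
    Returns (number of sides, lower bound, upper bound).
    """
    sides = 6
    inner = 3.0                    # half-perimeter of the inscribed hexagon
    outer = 2.0 * math.sqrt(3.0)   # half-perimeter of the circumscribed hexagon
    for _ in range(doublings):
        # Doubling step: harmonic mean for the circumscribed polygon,
        # then geometric mean for the inscribed polygon.
        outer = 2.0 * inner * outer / (inner + outer)
        inner = math.sqrt(inner * outer)
        sides *= 2
    return sides, inner, outer


sides, lower, upper = archimedes_bounds()
print(sides, lower, upper)
```

With the default of four doublings the two values lie inside the interval between the two fractions quoted above; increasing `doublings` narrows the enclosure further, which is exactly the possibility of approximation with any prescribed accuracy.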

This feeling of familiarity with the nature of the real numbers became firmly established in the 
course of time on diverse occasions, for example, — long before the foundation of the differential 
and integral calculus — in the composition of logarithmic tables, in Descartes’ analytic geometry, 
where the points of a plane or of space are specified by coordinates, and then to a large degree in 
the development of the differential and integral calculus, which was started by LEIBNIZ and NEWTON 
and continued, as if intoxicated by the joy of discovery, by the BERNOULLIS, by EULER and FERMAT, 
by CAUCHY, GAUSS and others. No one imagined that the foundation of the theory of the real 
numbers would require a further intensive study. 

However, questions of foundation played their part in two other branches, geometry and algebra. 
As already indicated, Euclid’s geometry takes as its starting point a system of very simple geometric 
propositions from which further theorems of geometry can be derived. These simple propositions, 
called axioms, represented an extract of the geometric knowledge of the time and were intuitively 
so clear that no one felt the need to prove them. An exception was the parallel axiom (or postulate). 
This states that to a given line and a given point not on the line there is one and only one line passing 
through the given point without intersecting the given line. Was it perhaps possible to remove this 
statement from the system of axioms by deriving it from the remaining axioms? — For 2000 years 
mathematicians wrestled in vain with the problem, until GAUSS in Germany, LOBACHEVSKII in Russia, 
and BOLYAI in Hungary succeeded in showing that the parallel axiom is independent of the other 
axioms. The significance of this result only becomes clear in connection with other developments. 

In algebra the formula for the solution of quadratic equations can lead to the expression √−1, 
which at first sight is meaningless. But as long as one calculates with it just as with ordinary roots 
like √2, √3 or even π, the results invariably make sense. This strengthened the belief in the right 
of citizenship of this formation √−1, for which the notation i had meanwhile been accepted. Nearly 
300 years elapsed before Gauss and others showed that what one had done until then can be inter- 
preted in a completely sensible manner as an extension of the domain of real numbers in which 
there exists a new number whose square is equal to −1. 

Even Gauss was so thoroughly familiar with real numbers that he had no scruples in using them 
without justification. Only when certain difficulties emerged in the process of clarifying the concept 
of limit at the hands of CAUCHY and other mathematicians of the time, did the real numbers become 
an object of serious thought. It was recognized that a theory of the real numbers can be founded, 
in fact in different ways, on a reduction to fractions. The latter, in turn, could be reduced to the 
natural numbers, and again it appeared that in the domain of natural numbers all their properties 
could be united in a few perfectly obvious fundamental facts, the Peano axioms. 

With this reduction to the natural numbers a basis was given for the theory of the real and complex 
numbers, and also for the whole of real and complex analysis and beyond, even for geometry; 
for in analytic geometry it is shown how to master the basic objects of geometry, above all the points, 
by means of their coordinates, which are real numbers. 

In this context another development should be mentioned, which started rather tentatively about 
150 years ago. It was common knowledge that some rules for the multiplication of numbers and 
some for the addition show a strong formal similarity. Similarly, quite simple formal laws were 
observed in other mathematical operations, for example, in carrying out several motions in succes- 
sion. But only very slowly did mathematicians proceed to the next logical step of extracting the 
common basic properties and of deriving from them new and ever deeper properties by purely 
logical processes. This field developed gradually to the present-day theory of groups, and again one 
sees, just as in Euclidean geometry, the emergence of an axiom system with all the subsequent 
developments. 

Nowadays large parts of mathematics, above all algebra, but to an ever increasing extent analysis 
and geometry, are built up axiomatically. The procedure is roughly as follows: given is a collection, 
usually called a set, of mathematical objects, the elements of the set, together with some system of 
axioms that describes the basic properties of these objects. Now the following tasks arise: first of 
all, to draw the most far-reaching conclusions from the axioms, in other words, to carry the theory 
of such a structure as far as possible; next, to gain a survey of all specific ways of realizing the axiom 
system in question. It can happen that essentially there is only one possibility of realization, or 
several, or perhaps even infinitely many; it is also possible that no such realization can be found, 
for example, when the given axioms contradict each other. If there are several models, that is, ways 
of realizing the axioms, then one searches for characteristic features by which the various possibilities 
can effectively be distinguished in finitely many steps. For some structures these tasks have been 
solved completely, for others we are still far away from a solution. This indicates, incidentally, 
how closely interwoven axiomatics and mathematical logic are. 

Even more imperative became the demand for an efficient mathematical logic when at the turn 
of the century contradictions arose in one of the new structural theories, the theory of sets. Set 
theory is the simplest structural theory, inasmuch as it is concerned with completely arbitrary col- 
lections whose elements are not subject to any axioms, such as points, numbers, motions, functions, 
figures, but equally well men, stars, chairs or what have you. Since no structural assumptions are 
made, two such sets are to be regarded as equivalent or equipotent if they have equally many elements. 
In the case of finite sets the meaning of this is immediately clear to everyone; but it was a magnificent 
achievement to define even for infinite sets something like the number of their elements, the so-called 
power or cardinality. True, this fails to have some of the properties with which we are familiar when 
the number of elements of a finite set is involved. For example, in this sense there are just as many 
natural numbers as there are fractions, but not as many fractions as real numbers, and the set of 
points on a line has the same cardinality as that of the points in the plane. All these are things 
which in spite of their apparent lack of intuitiveness are entirely unobjectionable from the point of 
view of total mathematical rigour. Contradictions appeared, however, in the unrestrained formation 
of sets; for example, the concept of ‘set of all sets’ is contradictory in itself. Nevertheless, this 
was not a crisis in mathematics, as the phenomenon was sometimes called; on the contrary, mathe- 
maticians took occasion to reflect more thoroughly on what is involved in defining mathematical 
concepts. Indeed, a systematic mathematical logic was developed, and today one knows precisely 
how to avoid such contradictions. 

One might think that this utmost abstraction, in the form of axiomatization of the very general 
structural theories and of mathematical logic, could lead further and further away from down-to- 
earth applied mathematics. This is by no means the case; it was no accident that LEIBNIZ, who 
apart from his immediate creative mathematical work occupied himself with some fundamental 
questions of logic, had already constructed a workable calculating machine. 

The appearance of factory-made calculating machines, operated by hand or by a motor, did not 
give rise to important discussions of principles. But this state of affairs changed radically with the 
creation of electronic computing machines, by which the speed of calculation was increased drastically. 
True, these machines work on a simple black-white principle, because in each of their components 
current does or does not flow. Nevertheless, they can cope with calculations that otherwise would 
be practically impossible: they perform huge numbers of the simplest operations with an unimaginable 
speed and so can go through a complicated and protracted program in an acceptable time. Naturally, 
the duration of such a calculation depends on the skill that goes into the making of the program. 


After some preliminary work that had been done before the invention of electronic computing 
machines it soon turned out that in programming certain regularities are observed, which also play 
a role in mathematical logic, for example, in the theory of algorithms. This once more demonstrated 
the practical advantages of certain purely mathematical investigations which had been carried out 
merely for theoretical needs — a truly classical example of the close natural relationship between 
pure and applied mathematics, in this case computing techniques. 

In this context it seems appropriate to draw attention to the difference between the theoretical 
and the practical solubility of a mathematical problem. Quite frequently in mathematics it is not 
individual, numerically given, problems that are discussed, but general problems depending on 
certain data, whose numerical values can be chosen in many, as a rule infinitely many, ways. A 
simple example: to determine the area of a triangle depending on the lengths of its three sides. 
There is a formula for this area that is valid for all triangles, although there are infinitely many 
possibilities for the length of each side. 
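
The formula alluded to here is commonly known as Heron's formula; the text itself does not name it. A minimal sketch, in which the function name `triangle_area` and the validity check are choices made for illustration:

```python
from math import sqrt

def triangle_area(a, b, c):
    """Area of a triangle with side lengths a, b, c (Heron's formula)."""
    if min(a, b, c) <= 0 or a + b <= c or a + c <= b or b + c <= a:
        raise ValueError("side lengths do not form a triangle")
    s = (a + b + c) / 2          # half the perimeter
    return sqrt(s * (s - a) * (s - b) * (s - c))

print(triangle_area(3, 4, 5))    # right triangle with legs 3 and 4
```

One and the same procedure covers every admissible choice of the three lengths and reaches the result in a fixed, finite number of steps.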

Such a problem is regarded as solved when a formula, an algorithm, can be given by means of 
which the solution can be calculated in each individual case. Here one postulates that the formula 
or the procedure leads to the numerical result in finitely many steps. When this is the case, a pure 
mathematician considers the problem as solved. Nevertheless, in practice the problem can still be 
insoluble if the number of necessary steps is finite, but for reasons of time or economy is too large. 
This can lead to new and interesting problems of pure mathematics: to find more effective proce- 
dures — unless one is satisfied with approximate solutions or one builds faster computers. 

An enormous step forward in this respect was the invention of the electronic computing machines. 
It had the consequence that new branches arose, above all in applied mathematics, branches which 
had not been developed previously, because it was clear from the outset that their main problems 
could not possibly be attacked and solved within a practically acceptable period. Two examples of 
problems soluble in principle are the games of ‘nine men’s morris’ and chess. They are soluble in 
principle, because by the rules there are only finitely many possible games. Nine men’s morris is 
also solved in practice, in that one can give to the first player exact instructions how to react to 
all possible moves of the opponent so as to win in every case. The same question, whether in chess 
the white player can always win, is still unsolved in spite of the finiteness of the problem; even if all 
electronic computers at present available in the whole world were used solely to solve the chess 
problem, a solution could not be reached: this would require computers working unimaginably 
faster than the present ones. 

The development of mathematics, which has been roughly sketched here, led from the simplest 
fundamental concepts of number, operation, figure, and measure to its present-day thoroughly 
axiomatized form of a wealth of highly abstract structures and to the modern computing automata 
whose possibilities are far from being exhausted. A comparison of this development with the table 
of contents of this book indicates many direct and indirect relationships. 

Thus, the material of the first part ‘Elementary mathematics’ agrees to a large extent with mathe- 
matics as it was developed from antiquity through the Middle Ages and before the foundation of 
the differential and integral calculus. Only here arithmetic, the theory of numbers, and geometry 
are not set forth side by side, but one after the other. We begin with the natural numbers, together 
with the rules for the elementary operations, just as they present themselves as perfectly obvious to 
a naive person. But the axiomatic build-up follows immediately, starting from the natural numbers 
and leading up to the complex numbers. 

Even for these simple concepts a notation is used that was unknown to the Greeks and whose 
absence was one reason for an extremely cumbersome and unwieldy presentation: the use of letters 
for numbers. Today it is taken for granted in schools. Here the notation is admirably suited to the 
basic mathematical concepts, but it is so easy to handle that sometimes there is the danger of thought- 
less and mechanical manipulation of letters. This suggestive effect must be strongly opposed, espe- 
cially in schools: the primary thing is the mathematical idea, and the computational working details 
are secondary — not the other way round. On this theme Gauss wrote to SCHUMACHER in a letter 
of 1 September 1850: ‘It is a characteristic of modern mathematics ... that in our language of 
symbols and names we possess a lever by which the most complicated arguments are reduced to a 
certain mechanism ... How often is this lever handled just mechanically, although in most cases the 
authority to do so implies certain tacit assumptions. I postulate that in every application of the 
calculus, in every use of concepts, one should always remain conscious of the original conditions 
and should never regard results produced by the mechanism as mathematical property beyond the 
clearly permitted limits.’ 

Many tasks require unknown quantities to be determined from given quantities. As a rule, the 
use of letters enables us to state such tasks simply and lucidly. It then happens frequently that 
problems which at first sight appear totally distinct have one and the same form in the resulting 
equations or systems of equations. This points again to the parallelism between the mathematical 
formulation of problems and the abstraction that consists in disregarding the meaning of the given 
and the required quantities and leaving only the mathematical nucleus. 


A characteristic feature of modern mathematics is functional thinking. This means that one is 
concerned with functional relationships, such as the dependence of certain quantities on certain 
others, for example, the area or the angles of a triangle on the lengths of its sides. We shall become 
acquainted with other examples of this kind of thinking in the analysis of the notion of a function. 

Elementary geometry deals with points, segments, angles, straight lines, triangles, quadrangles, 
circles, tetrahedra etc. in a plane or in space. An essential role is played here by the concept of number 
as developed previously, owing to the need for measuring the objects. Naturally, this must not lead 
to a neglect of pure geometrical thinking, especially in the solution of problems. One tries to solve 
geometric problems by purely geometric means, that is, by constructive drawings. How to treat 
problems in space by drawings in the plane is the topic of descriptive geometry. The most intimate 
fusion between geometry and calculation occurs in analytic geometry: by means of the concepts of 
coordinates geometric problems can be transformed into numerical problems: in this way geometry 
becomes accessible to the far-reaching methods of analysis. 

The rudiments of analysis itself are treated in the second main part ‘Steps towards higher mathe- 
matics’. Although the concept of limit is already used in elementary mathematics in an intuitive 
fashion, higher mathematics begins just with a rigorous theory of limits. This in turn is the basis, 
on the one hand, for the theory of infinite series of numbers and functions, on the other hand, for 
the notion of continuity of functions as well as for the differential and integral calculus, whose 
significance is fundamental not only for the entire framework of mathematics, but also for the 
applications in physics, technology etc. Many problems of geometry and physics present themselves 
in the form of differential equations, that is, in relations between a function and its derivatives. The 
theory, which has grown by now to a very large volume, can only be sketched here in its simplest 
parts. An attractive branch is differential geometry, an application of the differential and integral 
calculus to the theory of curves in a plane and in space and to surfaces in space. 

As we remarked above, the theoretical solution of a problem is frequently far removed from an 
immediate application to specific cases, because the necessary numerical calculations become too 
extensive. It is the task of graphical representations and of numerical methods to transform theoretical 
solutions into directly applicable ones. Probability theory and statistics also play an important part 
in applications. 

In the last main part ‘Brief reports on selected topics’ an attempt is made to give an insight 
into a number of research fields of contemporary mathematics. For the reasons stated at the begin- 
ning, a more detailed account of the individual problems is impossible, and domains that at present 
are still in a nascent stage or in the process of deep reorganization could not be included. The reader 
who wishes to acquaint himself more thoroughly with one branch or another would do well to 
refer to the specialized literature — and this applies equally well to the first two main parts. 


Hans Reichardt 


Authors and translators 


The authors of the ‘Kleine Enzyklopädie der Mathematik’ are: 


G. Berthold                Dr. S. Oberlander 
Prof. O. Beyer             Prof. M. Peschel 
Prof. L. Bittner           Dr. G. Pietzsch 
Prof. H. Boseck            Dr. B. Renschuch 
Dr. H. G. Bothe            Prof. H. Sachs 
Dr. G. Czichowski          Prof. H. Salié 
J. Dahnn                   H. Schlosser 
Dr. C. Frischmuth          Dr. E. Schröder 
Dr. D. Gohde               Dr. L. Stammler 
W. Gohler                  A. Steger 
Prof. L. Gorke             Prof. R. Sulanke 
Dr. M. Hellwich            Prof. H. Thiele 
Dr. H. Herre               Dr. H. Thiele 
Prof. M. Herrmann          Prof. W. Tutschke 
H. Kastner                 Dr. H. Vahle 
G. Lisske                  Dr. L. W 
Dr. G. Lorenz              T. L. Wagner 
Dr. G. Maess               Prof. W. Walsch 
Dr. W. D. Müller           Dr. V. Winsch 
Dr. F. Neigenfind          Dr. G. Wussing 
Prof. F. Nožička           Prof. H. Wussing 


The present English version of the ‘Kleine Enzyklopädie der Mathematik’ was prepared under 
the editorship of Professor K. A. Hirsch and with the collaboration of 


Dr. O. Pretzel 
Dr. E. J. F. Primrose 
Professor G. E. H. Reuter 
Dr. A. Stefan 
Dr. A. M. Tropper 
Dr. A. Walker 


I. Elementary Mathematics 


1. Fundamental operations on rational numbers 


1.1. The natural numbers N 
     Numbers and digits 
     Calculations with natural numbers N 
     Elementary number theory 
1.2. The integers Z 
     Calculations with the integers Z 
1.3. The rational numbers Q 
     Foundations 
     Calculations with common or vulgar fractions 
     Decimal fractions 
     Computations with decimal fractions 
1.4. Proportionality and proportions 
1.5. Working with numerical variables 
     Working with algebraic sums 
     Fractions with variables 


1.1. The natural numbers N 


Numbers and digits 


What are natural numbers? Two kinds of activity made our ancestors face the necessity of oc- 
cupying themselves with numbers; this led to the development of cardinal and ordinal numbers. 

Cardinal numbers. Man had to compare various sets of things, for example, flints, dogs, hunting 
companions, in order to ascertain which set contains more elements (constituents, members). 
Today one does this, as a rule, by counting and comparing the quantities so obtained; this presumes 
an ability to count, that is, a knowledge of the numbers. But there is an easier way: if one wishes 
to find out, for example, whether men and horses are present in equal numbers, one simply places 
a rider on every horse. In other words: one sets up a matching, a correspondence, between men and 
horses. This matching may tally — then there are just as many horses as there are men, and one 
says: the sets are equipotent, — or some of one kind are left over; then there are more of this kind 
(Fig.). In laying a table one arranges a correspondence among sets of cups, saucers, spoons, etc. 
All sets between which such a matching of pairs can be established therefore have the corresponding 
number as a common property (Fig.). This is the way in which even today our children gain their 
knowledge of the cardinal numbers. 
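
Comparing two sets by matching, without counting either of them, can be sketched as follows. The function name `compare_by_matching` and the sample names are invented for illustration; the code pairs off one member of each collection at a time until at least one collection is exhausted.

```python
def compare_by_matching(men, horses):
    """Pair each man with a horse and report which side has members left over."""
    men_left = iter(men)
    horses_left = iter(horses)
    unmatched = object()  # marker meaning "nothing left to pair"
    while True:
        man = next(men_left, unmatched)
        horse = next(horses_left, unmatched)
        if man is unmatched and horse is unmatched:
            return "equipotent"      # the matching tallies
        if horse is unmatched:
            return "more men"        # a man is left without a horse
        if man is unmatched:
            return "more horses"     # a horse is left without a rider

print(compare_by_matching(["Al", "Bo", "Cy"], ["grey", "bay"]))
```

No numbers appear anywhere in the procedure: the answer comes from the matching alone.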


1.1-1 Men and horses, without matching 
1.1-2 Men and horses, with matching. One man is left over 


1.1-3 Common number: three 


Abstraction has not progressed this far in all stages of civilization. There are primitive tribes who 
use distinct numerals when they refer to distinct objects. Two women is then something other than 
two arrows; here the abstraction of number from the other properties of the sets has not yet been 
achieved. 

Ordinal numbers. The second need consisted in creating order within one and the same set. For 
example, it had to be laid down according to some point of view — say, the height, the age or the 
bravery of the rider — who would ride first, second, ... at the hunt (Fig.). Something quite similar 
occurs when one counts through the elements of a set; only, the order so obtained is, as a rule, without 
significance. In this way, there arise the ordinal numbers. 


1.1-4 Set of four hunters, 
unordered, and ordered by height 


Cardinal and ordinal numbers have developed in close interconnection and form the two aspects 
of the natural numbers; frequently the zero (or null) is, by convention, reckoned to belong to them. 


Numerals and number symbols. For the purpose of oral and written communication and of 
memorizing cardinal and ordinal numbers number words (numerals) and number symbols are re- 
quired, the latter particularly for abbreviation and ease of calculation (Fig.). The strong similarity 
between the words for corresponding cardinal and ordinal numbers in all languages or writings 
is a sign of their close connection. In English most ordinal numbers have the ending -th (four—the 
fourth; a hundred—the hundredth), in writing a stop is added (for example, NEWTON was born on 
25. 12. 1642). In the United States the month is placed before the day: 12/25, 1642. 

Because of the great similarity, in what follows it is sufficient to confine our attention to cardinal 
numbers; corresponding arguments apply to ordinal numbers. 


1.1-5 Three symbols for the number word ‘nine’: tally strokes, IX and 9 
1.1-6 Tally sticks 
1.1-7 Roman number symbols: I = 1, V = 5, X = 10, L = 50, C = 100, D = 500, M = 1000 


Representations of numbers. The simplest representations of numbers occur in tally sticks (Fig.), 
pieces of wood scored across with notches to record the items. Frequently they were split into halves 
of which each party kept one. The method of strokes, by which Robinson CRUSOE counted days, 
is still in use, particularly for tedious countings. But very soon, when the numbers become larger, 
this representation loses its perspicuity: it can be restored by appropriate groupings. Something 
quite similar occurs when new words for numbers are formed or new symbols are invented: it would 
be most uneconomical to introduce a completely new word and a new symbol for every number. 
Instead one composes words and symbols for larger numbers from those of smaller ones, and these 
building bricks themselves have arisen by the combination of units or smaller groups. According 
to the method of this grouping and the arrangement of the symbols one distinguishes between 
addition systems and position systems. 

Addition systems. The best known example of an addition system is the Roman method of writing 
numbers. Of the basic symbols ten each were combined to the next higher group; in between there 
are auxiliary symbols (Fig.). By the way, the origin of these symbols is not completely clear. Some 
of them, for example M (mille) for 1000, have been in use in this form only since the Middle Ages. 
The Romans wrote CIƆ for 1000. The essence of an addition system is that all number symbols are 
formed by juxtaposition of as few of these symbols as possible (in our case seven symbols, see 
Fig. 1.1-7). A rule prescribes that the symbol for the larger number always stands to the left of that 
for the smaller number. An exception to this rule is motivated by the endeavour to use as few symbols 
as possible. The number nine can be represented as VIIII (5 + 4) or IX (10 − 1). The latter 
writing is preferred. Therefore, if the symbol of a smaller number stands at the left, then the 
corresponding number has to be subtracted, not added. However, it is not permitted to place 
several basic symbols or an auxiliary symbol in front: MCMLIX for 1959; CML (not LM) for 950. 
An addition system has disadvantages: in general, the number symbols are very long and therefore 
lack clarity; when the numbers grow (in the present case, beyond 10 000), one has to keep inventing 
new symbols to avoid representations of excessive length; written calculations in an addition 
system are exceedingly troublesome. 
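
The rules just described can be sketched as a small conversion routine. This is an illustration under stated assumptions: the table of permitted subtractive pairs (IV, IX, XL, XC, CD, CM) follows the usual modern convention, the range is limited to 1 to 3999 so that the seven symbols suffice, and the names `ROMAN_SYMBOLS` and `to_roman` are invented for this sketch.

```python
# Symbols in descending order of value. The two-letter entries encode the
# exception in which a smaller symbol placed at the left is subtracted.
ROMAN_SYMBOLS = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]

def to_roman(n):
    """Write n (1 to 3999) as a Roman number symbol."""
    if not 0 < n < 4000:
        raise ValueError("this sketch handles 1 to 3999 only")
    parts = []
    for value, symbol in ROMAN_SYMBOLS:
        count, n = divmod(n, value)   # how often the symbol fits
        parts.append(symbol * count)
    return "".join(parts)

print(to_roman(1959), to_roman(950), to_roman(9))
```

The call prints the forms used in the text, MCMLIX, CML and IX. The upper limit in the sketch reflects the disadvantage noted above: larger numbers would call for further symbols.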

Position systems. Our present-day position system goes back to the Hindu from whom it came to 
us by way of the Near East (Arabic digits). In this perfection it is a fairly late achievement in the 
historical development of representations of numbers. In the system ten individuals (Units U) are 
combined to a new group, a Ten T and again ten of these to a Hundred H etc. However, no new 
symbols are introduced for these groups of higher rank (as in the Roman system), but they are 
distinguished by their position within the entire numerical symbol. In the Roman symbol XXX for 
thirty each of the three letters has the same numerical value 10, and since it is an addition system, 
the total number is obtained by adding the three individual values. In the symbol 444 for four- 
hundred-and-forty-four the three digits also have the same numerical value four; but within the 
total symbol they stand in different places and therefore have different positional values; the right- 
most position indicates the units: 


321 means: 3 H + 2 T + 1 U, 
CCCXXI means: 100 + 100 + 100 + 10 + 10 + 1. 


Since the gathering occurs in groups of ten each, one talks of a decimal system (Latin, decem 10) 
or a decadic positional system (Greek, deka 10). Accordingly, the Roman number system is a decimal 
addition system. The number ten is called the base of the system. The positional values are the 
powers of ten, some with their own names such as 1 million for 10^6 = 1000000, 1 milliard for 10^9, 
1 billion for 10^12, 1 trillion for 10^18. There follow 1 quadrillion, 1 quintillion, etc., each time with 
six more zeros. Formations such as 1 billiard for 10^15 are rarely used; in the U.S.A. and 
U.S.S.R., 10^9 is called 1 billion, 10^12 a trillion, and 10^15 a quadrillion. It is probable, but not 
certain, that the choice of ten as a base is connected with the number ten of our fingers. In old 
measuring units (one dozen, one gross) one finds traces of a vanished duodecimal system with the 
base 12; the French word quatre-vingt for eighty points to a (non-positional) system with base 20, 
and the word score for a group of 20 objects is still in frequent use. Our time measures (1 h = 60 min, 
1 min = 60 s), as well as the division of the full angle into 360° recall the sexagesimal system (base 60) 
of the Babylonians. This system already showed clearly some features of a positional system. But 
the complete development of such a system was hampered by lack of the consistent use of a symbol 
for empty places, a zero. The introduction of zero is one of the greatest achievements of the Hindu 
(around 800 A. D.). 

Not only 10, 12, 20, or 60 are suitable as bases of a positional system. Every natural number 
b > 1 can serve as base, because then every natural number a has exactly one b-adic representation 
a = a_n b^n + a_{n-1} b^{n-1} + ... + a_1 b + a_0, in which the natural numbers a_i, i = 0, ..., n, satisfy 
0 ≤ a_i < b. The a_i are called the digits of a. Every positional system requires exactly b distinct 
digits. 
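
The b-adic representation can be computed by repeated division with remainder. A minimal sketch; the function names `digits` and `value` are chosen here for illustration and do not come from the text.

```python
def digits(a, b):
    """Digits a_n, ..., a_1, a_0 of the natural number a in base b."""
    if b <= 1 or a < 0:
        raise ValueError("need a base b > 1 and a natural number a")
    result = [a % b]          # a_0 is the remainder on division by b
    a //= b
    while a > 0:
        result.append(a % b)  # next higher digit
        a //= b
    return result[::-1]       # highest position first

def value(digit_list, b):
    """Recover the number from its digits (highest position first)."""
    total = 0
    for d in digit_list:
        total = total * b + d
    return total

print(digits(321, 10), digits(22, 2), digits(80, 20))
```

Every digit produced is a remainder on division by b and therefore lies between 0 and b − 1, which is why exactly b distinct digits are needed.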

The binary system. Of particular technical importance is the binary system, which is also called 
dyadic or dual system. In it the position values are the powers of the base 2, that is, 1, 2, 4, 8, 16, 32, 
64, 128, ... These position values are considerably closer to each other than those of the decimal 
system; therefore the number symbols become comparatively long. On the other hand, one only 
needs two digits: 0 and 1. For the binary unit the notation L is in frequent use: 


7 = 1·4 + 1·2 + 1·1 = 1·2^2 + 1·2^1 + 1·2^0 = LLL, 
9 = 1·8 + 0·4 + 0·2 + 1·1 = 1·2^3 + 0·2^2 + 0·2^1 + 1·2^0 = L00L, 
22 = 1·16 + 0·8 + 1·4 + 1·2 + 0·1 = 1·2^4 + 0·2^3 + 1·2^2 + 1·2^1 + 0·2^0 = L0LL0. 
This binary system is often used in digital computers. 
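
As a small illustration of how a computer language handles the binary system, the sketch below writes numbers with L for the binary unit, as in the text. It relies on Python's built-in `format` and `int`; the function names `to_dual` and `from_dual` are invented for this sketch.

```python
def to_dual(n):
    """Write a natural number in the binary system, with L for the digit 1."""
    return format(n, "b").replace("1", "L")

def from_dual(symbol):
    """Read a binary number symbol written with L for the digit 1."""
    return int(symbol.replace("L", "1"), 2)

for n in (7, 9, 22):
    print(n, "=", to_dual(n))
```

The loop prints the symbols LLL, L00L and L0LL0 for 7, 9 and 22.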


Order of the natural numbers N. Every natural number has exactly one immediate successor; 
for example, 96 is successor of 95. This means that the sequence of natural numbers has no last 
member, it never breaks off. The number 0 is not a successor; every natural number other than 0 
has exactly one immediate predecessor; this means that the sequence of natural numbers has a 
beginning in its first member 0. 

For any two natural numbers n_1 and n_2 exactly one of the following relations holds: n_1 < n_2, 
that is, n_1 is smaller than n_2, for example, 3 < 7, or n_1 = n_2, that is, n_1 is equal to n_2, for example, 
5 = 5, or n_1 > n_2, that is, n_1 is greater than n_2, for example, 8 > 6. 

If one wishes to express that a number n_1 is at most as large as n_2, one writes n_1 ≤ n_2, n_1 is less 
than or equal to n_2. Accordingly, n_1 ≥ n_2, that is, n_1 is greater than or equal to n_2, means that n_1 
is at least as large as n_2; therefore both 4 ≤ 19 and 11 ≤ 11 are correct statements. 

These relations have a property called transitivity; for the relationship ‘smaller’ it takes the 
form: from n_1 < n_2 and n_2 < n_3 it follows that n_1 < n_3. The relationship ‘larger’, respectively, 
‘smaller’, orders the natural numbers ‘linearly’. An illustration of this linear order is the number 
ray (Fig.). In it the natural numbers are represented by a set of isolated (discrete) points. The fact 
that n_1 is less than n_2 then means that the point on the number ray belonging to n_1 lies to the left 
of the point n_2. 


1.1-8 Number ray 


1.1-9 Union of two sets ‘5 + 3 = 8’ 


Calculations with natural numbers N 


Addition and subtraction. Addition is the simplest operation on natural numbers, and subtraction 
is its inverse. They are the arithmetical operations of the first kind. 

Addition. Addition reflects the joining, the union of two sets (Fig.). The operation symbol is + 
(read plus). The addition can also be regarded as an abbreviated counting forward: 5 + 3 as 
5 (+1) → 6 (+1) → 7 (+1) → 8. The two numbers to be added are called summands, the result is 
their sum. 


The name sum is used in two meanings: 8 is the sum of 5 and 3; the expression 5 + 3 is a sum. 
The addition of two natural numbers can always be carried out, that is, two natural numbers always 
determine a third, their sum. Several laws hold for the addition of natural numbers. 

Commutative law. The order of the summands has no influence on the result; for example, 
5 + 3 = 3 + 5 = 8. Since this commutativity of the summands holds for all natural numbers, one 
writes briefly: a + b = b + a. 

Here and in what follows, a and b are symbols for arbitrary natural numbers. 

Associative law. In the first instance, addition is defined for only two summands. If three numbers 
are to be added, then two of them have to be added first, and a new addition of two summands can 
be formed from this sum and the third number. Here the order of combining the numbers has no 
influence on the result. 

This law of the (sequence of) combination also holds for all natural numbers. It means that brackets
may be omitted: 5 + 3 + 4 = 12. Similarly, additions of more than three summands may be written
without brackets.


Examples: 1. (5 + 3) + 4 = 8 + 4 = 12
          2. 5 + (3 + 4) = 5 + 7 = 12


Monotonic law. The relationship ‘smaller’ between two natural numbers is preserved when
the same number is added to the two numbers; for example, from 3 < 4 it follows that
3 + 7 < 4 + 7. This law, too, is valid for all natural numbers.




Subtraction. The process opposite to adding, namely taking away or deducting, leads to this
arithmetical operation. Its symbol is − (read minus). Subtraction can also be interpreted as an
abbreviated counting backwards, for example, 7 − 3 as 7 − 1 → 6 − 1 → 5 − 1 → 4 (Fig.).
From addition one comes to subtraction if for a given sum one asks for one of the summands,
for example, 4 + x = 7; x = 7 − 4. Accordingly subtraction is the inverse of addition. The
number from which the other number is to be subtracted is called the minuend; the number to be
subtracted is called the subtrahend and the result is the difference.


1.1-10 7 − 3 = 4

1.1-11 Operations on the number ray


Like the word ‘sum’, so also ‘difference’ is used in two meanings: the difference of 7 and 3 is 4; 
the expression 7 — 3 is a difference. 

In contrast to addition, the subtraction of two natural numbers cannot always be carried out; 
for example, the problem 2 — 9 does not have a natural number as a solution. The condition for 
solubility is: the minuend must not be smaller than the subtrahend. 

Operations of the first kind on the number ray. Addition and subtraction of natural numbers can 
be illustrated on the number ray as addition and subtraction of segments (Fig.). 

Written addition. The summands are written one under the other, so that equal position values 
stand in the same column. The addition begins with the units, in any order on account of the com- 
mutative law, and then successively from right to left towards higher position values. If in one 
column the sum exceeds the next position unit, the corresponding amount is carried: 


Example: 362 + 684 = 1046, written in columns:

      Th  H  T  U
          3  6  2
    +     6  8  4
    -------------
      1   0  4  6

Addition of more than two summands follows the same pattern.

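The column procedure described above can be sketched in a few lines of Python. This is only an illustration under simple assumptions (two natural numbers written in base 10); the function name `add_in_columns` is hypothetical and does not come from the text.

```python
def add_in_columns(a: int, b: int) -> int:
    """Add two natural numbers column by column, carrying into the next position."""
    digits_a = [int(ch) for ch in reversed(str(a))]  # units first
    digits_b = [int(ch) for ch in reversed(str(b))]
    result_digits = []
    carry = 0
    for i in range(max(len(digits_a), len(digits_b))):
        da = digits_a[i] if i < len(digits_a) else 0
        db = digits_b[i] if i < len(digits_b) else 0
        column_sum = da + db + carry
        result_digits.append(column_sum % 10)  # digit written in this column
        carry = column_sum // 10               # amount carried to the next column
    if carry:
        result_digits.append(carry)
    return int("".join(str(d) for d in reversed(result_digits)))


print(add_in_columns(362, 684))  # 1046
```

The loop starts with the units and moves towards higher position values, exactly as in the written scheme.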
Written subtraction. Subtractions can be performed in two slightly different
ways: (t) taking away: 7 take 3 gives 4; (s) supplementing: from 3 to 7 is 4.
Accordingly two methods of written subtraction are in common use, (t) and
(s). In both methods the subtrahend is written below the minuend (units under
units etc.) and a start is made with the units. The following example illustrates
the difference in the case of the Tens.
Method (t): Take 9 from 2 does not go; one H is dissolved into ten T, and
now: take 9 from 12 gives 3. Next, instead of 6 H there are only 5 H left and
take 1 H from 5 H gives 4 H.
Method (s): From 9 to 2 does not go. From 9 to 12 is 3. The dissolved H
is not taken away from the minuend but is added to the subtrahend.
This gives the same result: from 2 H to 6 H is 4 H.

Examples:    6311         70003
           −  768       − 11628
           −  229       -------
           − 1046         58375
           ------
             4268

The (s)-method is more lucid, for example, if the problem requires several dissolutions of
position values because the minuend contains several consecutive zeros. It also allows the
subtraction of several subtrahends to be carried out in a single step. In the example above the
mental process at the units is: 6 + 9 + 8 = 23; from 23 to 31 is 8. The three dissolved Tens are
added to the subtrahend; for the Tens the calculation is 3 + 4 + 2 + 6 = 15, from 15 to 21 is 6, etc.
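The supplementing method (s) with several subtrahends can be sketched as follows. This is a hedged illustration rather than a definitive implementation; the function name `subtract_by_supplementing` is hypothetical, and natural numbers in base 10 are assumed.

```python
def subtract_by_supplementing(minuend: int, subtrahends: list[int]) -> int:
    """Subtract several natural numbers in one pass using the supplementing method."""
    m_digits = [int(ch) for ch in reversed(str(minuend))]  # units first
    width = len(m_digits)
    result_digits = []
    carry = 0  # dissolved position units, added to the subtrahend side
    for i in range(width):
        # Sum of the subtrahend digits in this column plus the carried amount.
        column = carry + sum((s // 10**i) % 10 for s in subtrahends)
        # Supplement "from column up to the next number ending in the minuend digit".
        digit = (m_digits[i] - column) % 10
        reached = column + digit
        carry = reached // 10
        result_digits.append(digit)
    if carry or any(s >= 10**width for s in subtrahends):
        raise ValueError("the minuend must not be smaller than the sum of the subtrahends")
    return int("".join(str(d) for d in reversed(result_digits)))


print(subtract_by_supplementing(6311, [768, 229, 1046]))  # 4268
print(subtract_by_supplementing(70003, [11628]))          # 58375
```

At the units of the first call the column sum is 23, the supplement to 31 is 8, and 3 is carried to the Tens, which matches the mental process described in the text.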


Multiplication and division. Multiplication and division are the arithmetical operations of the
second kind.




Multiplication. Multiplication can be arrived at in various ways, for
example, by addition of several equal summands, 12 + 12 + 12 = 3 · 12
= 36 (Fig.). The operational symbol is · (read times) or a lying cross ×.


1.1-12 12 + 12 + 12 = 3 · 12 = 36


Since multiplicand and multiplicator can be interchanged, they are both called the factors. Again, the
word product is used in two meanings: 36 is the product of 3 and 12; the expression 3 times 12 is a
product. The multiplication of two numbers can always be carried out, that is, two natural numbers
always determine a third, their product. For all natural numbers a: a · 0 = 0 · a = 0 and a · 1 = 1 · a = a.

Commutative law. 3 · 4 = 4 + 4 + 4 = 12 and 4 · 3 = 3 + 3 + 3 + 3 = 12, hence 4 · 3 = 3 · 4.
The factors of a product can be interchanged without altering the result. This is true for all natural
numbers as factors: a · b = b · a.


Associative law. If three numbers are to be multiplied, first two of them are multiplied and then
the product is multiplied by the third. Here the order of combining the factors has no influence on
the result. For example, 3 · 4 · 7 = (3 · 4) · 7 = 12 · 7 = 84; 3 · 4 · 7 = 3 · (4 · 7) = 3 · 28 = 84.
This law also holds for all natural numbers. Therefore brackets may be omitted: 3 · 4 · 7 = 84. For
more than three factors one proceeds similarly.

Monotonic law. 3 < 4 leads to 3 · 8 < 4 · 8 but 3 · 0 = 4 · 0. This is true for all natural
numbers a, b, c: from a < b it follows that a · c < b · c if c > 0, and a · c = b · c if c = 0.


Division. Two distinct everyday problems lead to the other basic arithmetical operation of the 
second kind, division: 
(i) Sharing: twelve pears are to be shared equally among four persons; each receives three pears
(Fig.);
(ii) Being contained: how many times is 4 cm contained in 12 cm? 3 times (Fig.).


1.1-13 Sharing: 12 pears are divided into 4 equal parts 


1.1-14 Being contained (4 cm, 4 cm, 4 cm)


Mathematically, division is arrived at as the inverse operation to multiplication: a product and
one factor are given and the other factor is required; both 3 · x = 15 and x · 3 = 15 lead to
x = 15 : 3.

Since the factors can be interchanged, the two questions corresponding to sharing and
to being contained lead to the same division problem. The operational symbol is the colon
(read divided by). Again, the word quotient is used in two meanings: the quotient of 15 by 3 is 5;
the expression 15 : 3 is a quotient.

Feasibility of division. The division of two natural numbers cannot always be performed within
the domain of the natural numbers; for example, there is no natural number n for which 3 · n = 17,
for 3 · 5 = 15 and 3 · 6 = 18, hence 17 is not divisible by 3. The use of the equality sign in writing
17 : 3 = 5 remainder 2 is inexact; the expression 17 = 3 · 5 + 2 (dividend equals divisor times
quotient plus remainder) is unobjectionable. Division by zero is impossible; for 5 : 0 = n would mean
that n · 0 = 5, but for every n the product is 0, never 5. Even for the dividend 0 the division by zero
is impossible, because the problem does not have a unique result. One could claim that 0 : 0 = 17
because 0 · 17 = 0, but also 0 : 0 = 193 because 0 · 193 = 0.


Division by zero cannot be performed.
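The relation "dividend equals divisor times quotient plus remainder" can be checked with a short Python sketch. The helper name `divide_with_remainder` is hypothetical; it relies on Python's built-in `divmod` and assumes natural numbers as input.

```python
def divide_with_remainder(dividend: int, divisor: int) -> tuple[int, int]:
    """Return (quotient, remainder) with dividend == divisor * quotient + remainder."""
    if divisor == 0:
        # No unique result exists, so the division is refused.
        raise ZeroDivisionError("division by zero cannot be performed")
    quotient, remainder = divmod(dividend, divisor)
    assert dividend == divisor * quotient + remainder
    return quotient, remainder


q, r = divide_with_remainder(17, 3)
print(f"17 = 3 * {q} + {r}")  # 17 = 3 * 5 + 2
```

Writing the result as an equation of this form avoids the inexact notation 17 : 3 = 5 remainder 2.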


Sequence of arithmetical operations. If in a problem operations of different kind occur, the sequence
of performing them can influence the result: 7 · 5 + 3 leads to 7 · 8 = 56 if addition is first carried
out, but to 35 + 3 = 38 if multiplication is first carried out. The situation is similar for subtractions
or divisions. Therefore the sequence of the operations must be agreed upon:


The operation of higher kind is performed first. 


If in a given case the operations are to be performed in a different sequence, then brackets have to 
be introduced. The contents of each bracket are treated first: 


(12 + 96) : 3 − 8 · (5 − 2) = 108 : 3 − 8 · 3 = 36 − 24 = 12.


Distributive law. This law of distribution expresses a connection between arithmetical operations
of different kinds; for example, 5 · (4 + 3) = 5 · 7 = 35, but also 5 · 4 + 5 · 3 = 20 + 15 = 35;
therefore 5 · (4 + 3) = 5 · 4 + 5 · 3. This way in which for a multiplication of a sum the other
factor is distributed over the summands is the same for all natural numbers a, b, c:

a · (b + c) = a · b + a · c.

From the distributive law one derives the relations

(a − b) · c = a · c − b · c; (a + b) : c = a : c + b : c for c ≠ 0;
(a − b) : c = a : c − b : c for c ≠ 0. For natural numbers a, b, c
these equations only have a meaning if the subtraction a − b
and the divisions a : c and b : c can be performed.

Written multiplication. Written multiplication utilizes the distributive law, so that a knowledge
of the multiplication table up to 9 · 9 is sufficient. In principle the process is as follows:


2356 · 473 = 2356 · (400 + 70 + 3)
           = 2356 · 4(00) + 2356 · 7(0) + 2356 · 3
           = 9424(00) + 16492(0) + 7068 = 1114388.

    2356 · 473            2356 · 473
    ----------            ----------
     9424                       7068
     16492         or         16492
       7068                  9424
    ----------            ----------
    1114388               1114388


In the second line the multiplication of the first factor 2356 is carried out by splitting the second
factor 473 into its Units, Tens etc. The final addition of the partial products is done in writing.
Instead of attaching zeros to the partial products belonging to higher position values of the
multiplicand, one shifts the number appropriately.
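The splitting into partial products can be sketched in Python. This is an illustrative sketch only; the function name `partial_products` is hypothetical, and the zeros are attached explicitly instead of shifting the written numbers.

```python
def partial_products(first: int, second: int) -> list[int]:
    """Split the second factor into Units, Tens, ... and multiply each part by the first factor."""
    products = []
    for position, ch in enumerate(reversed(str(second))):
        # One digit of the second factor times the first factor, with its position value.
        products.append(first * int(ch) * 10**position)
    return products


parts = partial_products(2356, 473)
print(parts)       # [7068, 164920, 942400]
print(sum(parts))  # 1114388
```

Each entry needs only the multiplication table up to 9 · 9 plus carrying; the distributive law guarantees that the sum of the entries is the full product.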

Written division. In the written division process the dividend is split into Units, Tens, Hundreds;
for example: 86 : 2 = (80 + 6) : 2 = 80 : 2 + 6 : 2 = 40 + 3 = 43. Since division is the inverse
process to multiplication, the product of quotient and divisor must yield the dividend, and this
must hold equally for the partial quotients. This leads to the following division scheme:


       487
23 ) 11208
      92
      200
      184
       168
       161
         7

Or more briefly, by performing the individual subtractions mentally:

       487
23 ) 11208
      200
       168
         7

The quotient is 487 and the remainder is 7.

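The division scheme can be imitated digit by digit. The following Python sketch is one possible rendering under the assumption of natural numbers in base 10; the name `long_division` is hypothetical.

```python
def long_division(dividend: int, divisor: int) -> tuple[int, int]:
    """Divide digit by digit, as in the written scheme; return (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero cannot be performed")
    quotient = 0
    remainder = 0
    for ch in str(dividend):
        remainder = remainder * 10 + int(ch)  # bring down the next digit
        digit = remainder // divisor          # partial quotient, always between 0 and 9
        remainder -= digit * divisor          # subtract the partial product
        quotient = quotient * 10 + digit
    return quotient, remainder


print(long_division(11208, 23))  # (487, 7)
```

The intermediate remainders for 11208 : 23 are 20, 16 and 7, which correspond to the lines 200, 168 and 7 of the shortened scheme.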
Elementary number theory 


Divisibility. 12 : 4 = 3, that is, the number 12 is divisible by 4; but 15 is not divisible by 4. Hence 4
is called a divisor of 12 or 4 divides 12 (in symbols 4 | 12); 4 is not a divisor of 15 (4 ∤ 15). In general,
a natural number a is said to be divisible by another b if there exists a natural number n such that
a = n · b; b as well as n are then called divisors of a. On the other hand, a is called a multiple of b
and of n. The number 0 is divisible by all numbers a ≠ 0 and is a multiple of every number. Every
number a ≠ 0 is divisible by 1 and by itself; these divisors are called improper.




Prime numbers. Prime numbers are numbers that have only improper divisors; for example, 5 is 
only divisible by 1 and 5, 13 only by 1 and 13; hence 5 and 13 are prime numbers. The number 1 
itself is not counted among the prime numbers so that the sequence begins with 2. 


Factorization. Every natural number greater than 1 is either itself a prime number or can be
written as a product of prime numbers, that is, can be split into prime factors:

120 = 4 · 30
4 = 2 · 2,  30 = 2 · 15
15 = 3 · 5
120 = 2 · 2 · 2 · 3 · 5

The same decomposition is obtained by starting out from, say, 120 = 10 · 12. By means of Euclid’s
algorithm it can be proved that the factorization into primes is unique apart from the order, in
other words, apart from the order there is only one way of splitting a natural number into prime
factors (see the derivation at the end of this chapter). The statement of the theorem would be
incorrect if 1 were counted among the primes. By the use of powers the prime factorization of natural
numbers can be written down more conveniently, for example, 1008 = 2 · 2 · 2 · 2 · 3 · 3 · 7
= 2⁴ · 3² · 7.
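Splitting a number into prime factors can be sketched by repeated trial division. This is a minimal Python sketch, not an efficient method for large numbers; the function name `prime_factors` is hypothetical.

```python
def prime_factors(n: int) -> list[int]:
    """Return the prime factors of n (n >= 2) in ascending order, with repetition."""
    if n < 2:
        raise ValueError("only natural numbers greater than 1 have a prime factorization")
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:  # split off the factor p as often as possible
            factors.append(p)
            n //= p
        p += 1
    if n > 1:              # what remains is itself a prime
        factors.append(n)
    return factors


print(prime_factors(120))   # [2, 2, 2, 3, 5]
print(prime_factors(1008))  # [2, 2, 2, 2, 3, 3, 7]
```

Because the result is sorted, two different starting decompositions such as 4 · 30 and 10 · 12 lead to the same list, in line with the uniqueness statement.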

Sieve of Eratosthenes. ERATOSTHENES of Kyrene (approx. 276-194 B.C.) indicated the following
method of obtaining all the primes in a segment of the natural numbers: delete after 2 every second
number (every number divisible by 2), then after 3 every third number (every number divisible by 3),
then after 5 every fifth number (every number divisible by 5) etc. The remaining numbers of this
segment are primes. As one can see in the following table, up to 100 only 4 deletions are required,
the last one for all numbers divisible by 7. The reason is that 7 · 7 = 49 is less than 100, but already
11 · 11 = 121 > 100; and 11 is the first non-deleted number after 7, hence the next prime.


(Table: the natural numbers from 1 to 100, with the numbers removed in the 1st, 2nd, 3rd and 4th
deletion marked.)
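The deletion procedure can be sketched in Python as follows. This is an illustrative sketch; the function name `sieve_of_eratosthenes` and the boolean-list representation are assumptions, not part of the original table.

```python
def sieve_of_eratosthenes(limit: int) -> list[int]:
    """Return all primes up to and including limit by successive deletions."""
    if limit < 2:
        return []
    is_prime = [False, False] + [True] * (limit - 1)  # indices 0..limit
    p = 2
    while p * p <= limit:  # deletions are only needed while p * p <= limit
        if is_prime[p]:
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False  # delete every p-th number after p
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]


primes = sieve_of_eratosthenes(100)
print(len(primes))  # 25
print(primes[-1])   # 97
```

For the limit 100 the loop performs deletions only for p = 2, 3, 5 and 7, the four deletions mentioned in the text.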


If one wishes to find out whether or not a given number, say 1303, is a prime, one need not carry
the sieve method right up to 1303. It is sufficient to check whether 1303 is divisible by prime numbers p
for which p² ≤ 1303. The reason is that if 1303 can be factored at all, 1303 = m · n, then the square
of one factor is at most 1303, that of the other at least 1303. For 1303 the division has to be tried
only for the primes p = 2, 3, 5, ..., 31 because 37² = 1369 > 1303. In fact, it turns out that 1303 is a
prime.
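The test by trial division can be sketched as follows. This is a simplified Python sketch: for brevity it tries every number p with p · p ≤ n, not only the primes, which is redundant but gives the same answer; the name `is_prime` is hypothetical.

```python
def is_prime(n: int) -> bool:
    """Check primality by trial division with all p such that p * p <= n."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False  # a proper divisor was found
        p += 1
    return True


print(is_prime(1303))  # True
# The primes that actually have to be tried for 1303:
print([p for p in range(2, 37) if is_prime(p)])  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
```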

Endlessness of the sequence of primes. EUCLID (approx. 300 B.C.) already raised the question
whether the sequence of prime numbers breaks off or whether there are infinitely many prime
numbers. He proved indirectly that there cannot be a largest prime. Suppose that there is a largest prime P;
then one forms the natural number N = 2 · 3 · 5 · 7 · 11 ··· P + 1, the product of all primes up to
and including P increased by 1. This number N is not divisible by any of the prime numbers up
to P, because upon each division it leaves the remainder 1. Hence it is either itself a prime or it
has prime divisors that do not occur in the sequence 2, 3, ..., P. Both contradict the assumption
that P is the largest prime; hence the sequence of primes is infinite. The Appendix contains a table
of the primes between 1 and 1000 and of their natural logarithms. The largest prime number known
at present is 2¹⁹⁹³⁷ − 1; it has 6002 digits. An unsolved problem is whether there exist infinitely
many prime twins, that is, whether or not the sequence of pairs of consecutive odd numbers that
are both primes, like [5; 7], [59; 61], [641; 643] or [1451; 1453], breaks off.




Common divisors and multiples. Greatest common divisor. If t is a divisor of a, then the factorization
of t can only contain primes occurring in the factorization of a, and at most to the exponent in the
factorization of a; for example, 12 | 60; 12 = 2² · 3; 60 = 2² · 3 · 5. If t is a common divisor of a
and b, then t can only contain prime factors occurring in a and b, and at most to the smaller of the
powers in a or b; for example, 12 is a common divisor of 48 and 360; from the prime factorizations
12 = 2² · 3, 48 = 2⁴ · 3, and 360 = 2³ · 3² · 5 one sees that 48 and 360 have several common
divisors: 1, 2, 3, 4, 6, 8, 12, 24. Of these 24 = 2³ · 3 is the greatest. One says, 24 is the greatest
common divisor (gcd) of 48 and 360; gcd (a, b) is the greatest among all numbers dividing both a and
b. Every common divisor of a and b divides the greatest common divisor of a and b; for this is the
product of all prime factors occurring both in a and b, and exactly to the smaller of the relevant
powers. This is the basis for a method of finding the gcd, which is equally applicable for several
numbers, as the example shows.

Example: 1260 = 2² · 3² · 5 · 7
         3024 = 2⁴ · 3³ · 7
         5544 = 2³ · 3² · 7 · 11
         gcd  = 2² · 3² · 7 = 252

If two numbers a and b have no common divisor (except 1), so that gcd (a, b) = 1, then a and b are
called coprime or relatively prime.

Euclid’s algorithm. For larger numbers the decomposition into prime factors is frequently very 
tedious, because it has to be done by trial and error; for example, 23 613 864 709 is the product of 
the two primes 112 843 and 209 263. If one wishes to determine the greatest common divisor of such 
numbers, it is appropriate to use a method that avoids the prime factorization — Euclid’s algorithm. 
Without proof it is illustrated in the example of the numbers 53 667 and 25 527: 


53 667 = 25 527 · 2 + 2613,
25 527 = 2613 · 9 + 2010,
2613 = 2010 · 1 + 603,
2010 = 603 · 3 + 201,
603 = 201 · 3 + 0.
Hence gcd (53 667, 25 527) = 201.

For the case of coprime numbers one obtains
87 = 41 · 2 + 5,
41 = 5 · 8 + 1,
5 = 1 · 5 + 0.
Hence gcd (87, 41) = 1, the numbers 87 and 41 are coprime.
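The chain of divisions with remainder can be reproduced with a short Python sketch. The function name `euclid_steps` and the returned list of steps are assumptions made for illustration only.

```python
def euclid_steps(a: int, b: int) -> tuple[int, list[tuple[int, int, int, int]]]:
    """Run Euclid's algorithm; return the gcd and the steps (a, b, q, r) with a = b * q + r."""
    steps = []
    while b != 0:
        q, r = divmod(a, b)
        steps.append((a, b, q, r))
        a, b = b, r  # the divisor becomes the new dividend, the remainder the new divisor
    return a, steps


g, steps = euclid_steps(53667, 25527)
for a, b, q, r in steps:
    print(f"{a} = {b} * {q} + {r}")
print("gcd =", g)  # gcd = 201

print(euclid_steps(87, 41)[0])  # 1
```

The last non-zero remainder is the greatest common divisor; no prime factorization is needed at any point.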


If one wishes to determine the gcd of more than 2 numbers by Euclid’s algorithm, say of a, b 
and c, one can proceed step by step: one determines first gcd (a, b) = d, and then gcd (a, b, c) 
= gcd (d, c). 

Least common multiple. 60 is a common multiple of 6 and of 15, because 60 is a multiple of 6 as
well as of 15. There are other, in fact infinitely many, common multiples of 6 and 15. For if a
number m is a multiple of a and b, then all multiples of m are also common multiples of a and b. The
common multiples of 6 and 15 are 30, 60, 90, 120, ..., and among them 30 is the smallest; one says:
30 is the least common multiple (lcm) of 6 and 15; it divides every other common multiple. If
m = lcm (a, b), then m must contain all prime factors occurring in the decomposition of a or of b,
and each to the highest occurring power. This fact makes it possible to determine the lcm according
to the following scheme for three numbers:


Example: 40  = 2³ · 5
         36  = 2² · 3²
         126 = 2 · 3² · 7
         lcm = 2³ · 3² · 5 · 7 = 2520

For larger numbers this method is again unsuitable on account of the prime factorization. An
expedient is first to determine the gcd by Euclid’s algorithm and then to utilize the
relation lcm (a, b) · gcd (a, b) = a · b. But for more than two
numbers there is no such simple relationship.
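The relation between lcm and gcd for two numbers can be sketched in Python with `math.gcd` from the standard library. For more than two numbers the sketch simply applies the two-number relation step by step, which is an assumption of this example rather than a formula given in the text; the names `lcm` and `lcm_of_many` are hypothetical, and positive inputs are assumed.

```python
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive numbers via lcm(a, b) * gcd(a, b) = a * b."""
    return a * b // gcd(a, b)


def lcm_of_many(numbers: list[int]) -> int:
    """Apply the two-number relation step by step to a list of positive numbers."""
    result = 1
    for n in numbers:
        result = lcm(result, n)
    return result


print(lcm(6, 15))                  # 30
print(lcm_of_many([40, 36, 126]))  # 2520
```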


Rules of divisibility. The number 84 is divisible by 4 and by 3,
therefore also divisible by 4 · 3 = 12. This conclusion is only permitted if the two divisors are
relatively prime. In general:


If a is divisible by m and n and if gcd (m, n) = 1, then a is also divisible by m · n.


The determination of divisors and, if possible, the immediate recognition of divisibility by certain 
numbers is advantageous not only for the prime factorization, but above all also for cancellation 
in fractions. The relevant rules utilize simple laws of decimal writing; for example, for divisibility 
by 2 or by 5, multiples of 10 need not be taken into account because 10 is divisible by 2 and by 5. 
Similarly for multiples of 100 in respect of divisibility by 4 and 25, finally for multiples of 1000 in 
respect of divisibility by 8 and 125. All the powers of 10, that is, 10, 100, 1000 etc., leave the remainder 
1 on division by 3 or by 9. From the rules for the calculation with remainders it follows that, for
example, 600 = 6 · 100 then leaves the remainder 6 · 1 = 6 and for 230 = 2 · 100 + 3 · 10 the
remainder is 2 · 1 + 3 · 1 = 5. With respect to the divisors 3 or 9 the cross sum of every number
has the same remainder as the number itself. Here the cross sum is defined as the sum of all the 
digits; 7309 has the cross sum 7 + 3 + 0 + 9 = 19 and is not divisible by 3 or 9. 

All the even powers of 10, that is, 100, 10000, 1000000, etc., leave the remainder 1 on division
by 11, and all the odd powers (10, 1000, 100000 etc.) leave the remainder 10 or 10 − 11 = −1.
Here the alternating cross sum has the same remainder as the number itself. How to form the
alternating cross sum is illustrated by an example.

Example: 85 976:  8 + 9 + 6 = 23
                  5 + 7     = 12
                  alternating cross sum 23 − 12 = 11

Therefore 85 976 is divisible by 11.


A number is divisible by 

2 if the last digit is divisible by 2; 
4 if the last two digits represent a number divisible by 4; 
8 if the last three digits represent a number divisible by 8; 
5 if the last digit is divisible by 5, that is, 5 or 0; 

25 if the last two digits represent a number divisible by 25; 
3 if its cross sum is divisible by 3; 
9 if its cross sum is divisible by 9; 

11 if its alternating cross sum is divisible by 11. 
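The cross sum and the alternating cross sum used in these rules can be sketched in Python. The function names are hypothetical; the alternating cross sum is formed here starting with a plus sign at the units, which is one common convention and is assumed for this sketch.

```python
def cross_sum(n: int) -> int:
    """Sum of all decimal digits of n."""
    return sum(int(ch) for ch in str(n))


def alternating_cross_sum(n: int) -> int:
    """Digits added with alternating signs, starting with + at the units."""
    total = 0
    for position, ch in enumerate(reversed(str(n))):
        total += int(ch) if position % 2 == 0 else -int(ch)
    return total


print(cross_sum(7309))               # 19, so 7309 is not divisible by 3 or 9
print(alternating_cross_sum(85976))  # 11, so 85976 is divisible by 11
```

Since the cross sum has the same remainder as the number on division by 3 or 9, and the alternating cross sum the same remainder on division by 11, both functions can also be used to compute these remainders.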


Tests for accuracy. Calculations with remainders. To express the fact that a and b leave the same
remainder r on division by d one writes a ≡ b (mod d) (read a congruent to b modulo d), for example,
17 ≡ 42 (mod 5). For the remainder r one then has a ≡ r (mod d) and b ≡ r (mod d). Thus,
17 ≡ 2 (mod 5) and 42 ≡ 2 (mod 5). The fact that a is divisible by d can also be written as
a ≡ 0 (mod d). The following rules hold:


Suppose that a₁ ≡ b₁ (mod d)             Example:  22 ≡ 4 (mod 6)
and          a₂ ≡ b₂ (mod d).                      15 ≡ 3 (mod 6)
Then:   a₁ + a₂ ≡ b₁ + b₂ (mod d)                  37 ≡ 7 (mod 6) ≡ 1 (mod 6)
        a₁ − a₂ ≡ b₁ − b₂ (mod d)                   7 ≡ 1 (mod 6)
        a₁ · a₂ ≡ b₁ · b₂ (mod d)                 330 ≡ 12 (mod 6) ≡ 0 (mod 6)


The examples also show how to reduce numbers to the simple remainder system 0, 1, 2, ..., d − 1
(here 0, 1, ..., 5) by adding or subtracting d suitably often. The rules for the calculation with
remainders are applied in checking calculations by replacing numbers by their remainders modulo d
rather than repeating the calculations with the numbers themselves; since the remainders are so
easily calculated, one usually chooses d = 9, frequently also d = 11. If an inconsistency shows up, one
is certain that there has been a calculating error. But when the test gives a consistent result,
there is no guarantee that the calculations are correct: the error may be a multiple of d; therefore
a test with d = 2 is of little value.
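The three rules for calculating with remainders can be checked numerically. The following Python sketch uses the example values 22, 15 and the modulus 6 from the text; the helper name `congruent` is hypothetical.

```python
def congruent(a: int, b: int, d: int) -> bool:
    """True if a and b leave the same remainder on division by d."""
    return (a - b) % d == 0


a1, b1 = 22, 4
a2, b2 = 15, 3
d = 6

assert congruent(a1, b1, d) and congruent(a2, b2, d)

print(congruent(a1 + a2, b1 + b2, d))  # True: 37 and 7 both leave remainder 1
print(congruent(a1 - a2, b1 - b2, d))  # True: 7 and 1 both leave remainder 1
print(congruent(a1 * a2, b1 * b2, d))  # True: 330 and 12 both leave remainder 0
print((a1 * a2) % d)                   # 0, the reduction to the system 0, 1, ..., 5
```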

The nine test. Every number on division by 9 leaves the same remainder as its cross sum. This is 
a generalization of the rule for divisibility by 9. This makes it easy to carry out the nine test for the 
basic arithmetical operations: 


The remainder on division by nine of a sum (a difference, a product) is equal to the sum (the difference, 
the product) of the individual remainders. 


Example 1 (problem → cross sum → remainder). Addition: Reduce to the simple remainder system.
The result can only be wrong by a multiple of 9, for example, owing to the interchange of
two digits.

Example 2 (problem → cross sum → remainder). Subtraction: Here the reduction to the simple
remainder system is important if the subtraction of the residues leads to a negative result.

Example 3 (problem → cross sum → remainder). Multiplication: Here the remainder of the product
does not agree with the product of the remainders, hence the product cannot be correct;
correct is 235 694.


The eleven test. Just as the cross sum yields the remainder of a number on division by 9, so the
alternating cross sum leads to the remainder of division by 11; attention has to be paid to what
places contribute to the minuend and what to the subtrahend of the alternating cross sum.

Example 4:   Problem    alternating cross sum    remainder (mod 11)
               2 468    12 − 8 = 4               4
             + 4 293    5 − 13 = −8 ≡ 3          3
               6 761    8 − 12 = −4 ≡ 7          7

In the example the sum can only be wrong by a multiple of 11. If one uses the nine and the eleven
test for the same problem, one obtains information on the correctness of the calculation to within
multiples of 99.
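Both tests can be sketched for an addition in Python. This is an illustrative sketch that computes the remainders directly with the `%` operator instead of via cross sums; the function name `passes_remainder_test` and the deliberately wrong result 6671 (two digits of 6761 interchanged) are assumptions chosen for the example.

```python
def passes_remainder_test(a: int, b: int, claimed_sum: int, d: int) -> bool:
    """Compare the remainder of the claimed sum with the sum of the remainders modulo d."""
    return (a % d + b % d) % d == claimed_sum % d


# Correct result: both tests are consistent.
print(passes_remainder_test(2468, 4293, 6761, 9))   # True
print(passes_remainder_test(2468, 4293, 6761, 11))  # True

# Two digits interchanged: the error is 90, a multiple of 9 but not of 11.
print(passes_remainder_test(2468, 4293, 6671, 9))   # True (error not detected)
print(passes_remainder_test(2468, 4293, 6671, 11))  # False (error detected)
```

A consistent test therefore never proves correctness; using both moduli only narrows possible undetected errors to multiples of 99.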


1.2. The integers Z 


Foundations 


Why integers? — There are situations in everyday life in which the natural numbers are insufficient 
to characterize certain quantities, because two opposing tendencies, two opposite directions, are 
possible for them; for example, the statement that a temperature is 23°C is incomplete; it has to 
be added whether it is measured above or below freezing point (Fig.). An amount of £ 100 is always 
the same. But if this sum is mentioned in connection with the property of a person, then it is important 
to know whether it is a credit in the savings account or a loan from the bank. For the elevation of 
a place it is essential whether it lies on a hill 895’ above or in a deep depression 895’ below sea level. 
To characterize these opposing tendencies the relevant numbers are provided with a sign, for example, 
+23°C and —23°C or +895’ or —895’, in chronology also —300 and +300 for events before and 
after the beginning of our era. A point of reference must exist here from which the measurements 
are taken. As a rule, it is laid down from a practical point of view, but in principle it is arbitrary; 
for example, there are scales of temperature (Fahrenheit) with a different zero-point. In cases when 
the variation of the quantities is in one direction only, then for a suitable point of reference the sign 
can be omitted (absolute temperature scale of KELVIN). Theoretically, elevations on the earth can 
be measured from the centre of the earth. The positive numbers +1, +2, +3,... and the negative 
numbers —1, —2, —3,..., which are obtained when direction is taken into account, together with 
zero (strictly speaking +0), are called the integers. 

Feasibility of subtraction. In mathematics the introduction of the integers is necessary so that 
subtraction, the inverse operation to addition, can always be carried out; for example, the subtrac- 
tion 7 — 11 has no solution in natural numbers. One says: the equation x + 11 = 7 is not soluble 
in the domain of the natural numbers. 


The integers form a number domain in which every subtraction problem has a solution.


Temperature 23°C

1.2-2 Number line and the numbers (+4) and (−4) opposite to one another


Opposite numbers. Just as the number ray illustrates the natural numbers, so the number line 
serves to illustrate the integers (Fig.). The non-negative integers correspond to the natural numbers. 
On the number line there is, for every integer other than 0, exactly one having the same distance 
from zero, but lying on the other side. Two such numbers which differ only by their sign are called 
opposite, for example, −4 and +4. The formation of the opposite number is expressed by a minus
sign as prefix, so that −(−4) = +4 and −(+4) = −4. For the zero point one sets −0 = 0.

Absolute value. Two opposite numbers on the number line having the same distance from zero 
are said to have the same absolute value. The absolute value is defined as the non-negative one 
of the two numbers: 


|a| = a for a ≥ 0 and |a| = −a for a < 0.

Order. Of two different integers the smaller one is that which lies further to the left on the number
line. For any two integers n₁ and n₂ there always holds exactly one of the three relations n₁ < n₂,
or n₁ = n₂, or n₁ > n₂, for example, −3 < +2, +5 < +7, +8 = +8, −1 > −7, +3 > −5.
Every integer has exactly one immediate predecessor and exactly one immediate successor, that is,
the sequence of integers contains neither a smallest nor a largest, neither a first nor a last, number.




Calculations with the integers Z 


Arithmetical operations of the first kind. In order to distinguish between the computational or 
operational symbols + and — and the signs, which have the same outward appearance, one encloses 
the complete number symbols (with sign) in brackets (Fig.). The minus sign — has the additional 
function of indicating opposite numbers. 

1.2-4 Sign and operational symbol

To define the addition of integers one is guided by the addition of natural numbers, that is,
(+3) + (+4) = +7 because 3 + 4 = 7.

If the two summands have the same sign, then the sum of the natural numbers corresponding to
their absolute values is provided with the sign of the summands.

In (−3) + (−2) = −5 the sum 3 + 2 = 5 receives the negative sign.

If the two summands have different signs, then the difference of the natural numbers
corresponding to their absolute values is provided with the sign of the summand with the larger
absolute value.

In (+6) + (−2) = +4 the positive summand (+6) is larger than the absolute value +2 of the
negative summand (−2); in (−7) + (+3) = −4 the absolute value +7 of the negative summand
is larger than the positive summand, and in (+4) + (−6) = −2 the absolute value +6 of the
negative summand is greater than the positive summand (Fig.).

1.2-3 Addition on the number line
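The two rules for adding integers can be written out explicitly. The following Python sketch works only with absolute values and signs, as in the definition above, and compares the result with Python's built-in addition; the name `add_by_sign_rule` is hypothetical, and zero is treated as having a positive sign.

```python
def add_by_sign_rule(a: int, b: int) -> int:
    """Add two integers using only their absolute values and signs."""
    sign_a = -1 if a < 0 else 1
    sign_b = -1 if b < 0 else 1
    if sign_a == sign_b:
        # Same sign: add the absolute values and keep the common sign.
        return sign_a * (abs(a) + abs(b))
    # Different signs: subtract the smaller absolute value from the larger one
    # and take the sign of the summand with the larger absolute value.
    if abs(a) >= abs(b):
        return sign_a * (abs(a) - abs(b))
    return sign_b * (abs(b) - abs(a))


print(add_by_sign_rule(+3, +4))  # 7
print(add_by_sign_rule(-3, -2))  # -5
print(add_by_sign_rule(+6, -2))  # 4
print(add_by_sign_rule(-7, +3))  # -4
print(add_by_sign_rule(+4, -6))  # -2
```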

Examples such as (+4) + (−6) = −2 and (−6) + (+4) = −2 point to the validity of the
commutative law. The other laws of addition also hold and are stated for arbitrary integers.


The subtraction of integers must be defined as the converse to addition; (−7) − (+3) = x must
have the same meaning as x + (+3) = −7. But in accordance with the definition of addition
(−10) + (+3) = −7, consequently x = −10, that is, (−7) − (+3) = −10.

On the other hand, (−7) + (−3) = −10. Hence for every subtraction:


An integer is subtracted by adding the number opposite to it. 


Therefore subtraction can be carried out without restriction in the domain of the integers, because
this is so for addition, and every integer has its opposite number; for example:
(+28) − (−16) = (+28) + (+16) = +44.

Algebraic sums. Since among integers every subtraction can be replaced by a corresponding 
addition, one defines as algebraic sums expressions in which the terms are combined only by opera- 
tions of the first kind. If they contain more than two summands, then the computation is performed 
conveniently as follows: 

(+15) − (+27) + (−11) − (−9) + (+31)
= (+15) + (−27) + (−11) + (+9) + (+31)
= (+15) + (+9) + (+31) + (−27) + (−11)
= (+55) + (−38)
= +17.




Arithmetical operations of the second kind. To define the multiplication of integers one is also 
guided by the multiplication of natural numbers and then proceeds step by step. 


If both factors are positive, their product is the positive number corresponding to the product 
of the corresponding natural numbers. 


The product (+4) · (−7) is interpreted by analogy to 4 · 7 = 7 + 7 + 7 + 7 as repeated addition
of equal summands and is determined by (+4) · (−7) = (−7) + (−7) + (−7) + (−7) = −28.
If the multiplicator is negative, one agrees that the commutative law remains valid: (−7) · (+4)
= (+4) · (−7).


The product of two factors of opposite sign is negative, and its absolute value is the product of 
the absolute values of the factors. 


Examples: (+3) · (−7) = −21 and (−5) · (+8) = −40; the absolute values are 3 · 7 = 21 and
5 · 8 = 40.

A comparison of the stipulations made so far shows that the sign of a product changes when
that of one of the factors changes. Therefore one agrees to set: (−4) · (−7) = +28.

The product of two negative factors is positive, and its absolute value is equal to the product of
the absolute values of the factors.


Summary of multiplication: 
If two integers have equal signs, their product is positive, otherwise negative; the absolute value of 
the product is equal to the product of the absolute values. 


Examples:
1. (−13) · (+5) = −65.  2. (−8) · (−12) = +96.
3. (+3) · (−4) · (−9) = (+3) · (+36) = +108.
4. (−3)⁴ = (−3) · (−3) · (−3) · (−3) = +81.


Laws of multiplication. As is evident from the introduction of multiplication and the example
of three factors, for the multiplication of integers a, b, c the commutative and associative law hold.
For natural numbers (c > 0) the monotonic law holds: from a < b it follows that a · c < b · c.
This law does not hold for integers; for negative c it follows from a < b that a · c > b · c, as is shown
by the following example:


(+5) < (+7), but (+5) · (−3) > (+7) · (−3).

Division. Division is the inverse to multiplication. Therefore (+12) : (−4) = x is equivalent to
(−4) · x = +12. But this holds for x = −3 only. Similarly one can derive the rules of division
for all combinations of signs. In general:

If dividend and divisor have the same sign, the quotient is positive, otherwise negative; its absolute 
value is equal to the quotient of the absolute values. 


Examples: (+72): (+6)= +12, (+119):(—17) = — 7, 
(—75):(+25) = — 3, (—91): (—7) = +13. 


Since the arithmetical operations for the non-negative integers, that is, the positive integers and
zero, have been determined just as for natural numbers, the sign + of positive integers can be
omitted. They are replaced by the corresponding natural numbers and lead to a simpler representation;
for example, (+9) + (−17) − (+6) + (+21) − (−2) = 9 − 17 − 6 + 21 + 2 = 9 or
7 · (−9) = −63 or (−56) : (−7) = 8. Whether they are then natural numbers of counting character
or positive integers will be clear from the appropriate context.

On the history of the integers. The negative numbers are not known in Greek mathematics, but 
first traces can be found in the writings of DIOPHANTOS (around 250 A. D.). In India (around
700 A. D.) the calculation with negative numbers was already completely developed by the Hindus.
It is interesting that their names for positive and negative are derived from their words for credit and 
debit. The negative integers play an important role in the Hindu theory of equations. 

In Europe the negative numbers gained a foothold comparatively late; the reason is probably 
that the Arabs, who formed the mathematical bridge between India and Europe, refused to accept 
the negative numbers. The break-through was made by Michael STIFEL in his Arithmetica integra 
(1544). The ultimate foundation of the integers within mathematics was not made until 1867, by 
Hermann HANKEL. 


30 1. Fundamental operations on rational numbers 


1.3. The rational numbers Q 


Foundations 


What are fractions? — If 6 apples have to be shared equally among 3 children, one calculates 6:3 = 2 
and then knows that each child receives 2 apples. But if only 2 apples are available for sharing out, 
one has to solve the division problem 2: 3. Within the natural numbers this problem cannot be 
solved. Nevertheless one accomplishes division by resorting to the knife (Fig.). In this case the share
of each child is indicated by the fraction 2/3. All similar cases of sharing lead to fractions. 


Explanations. Every fraction is of the form p/q. The numerator p indicates the number of the
entities divided, the denominator q the number of parts. The fraction line runs horizontally. If no
confusion is to be feared, a slanting fraction line (solidus) is allowed, for example 3/4, particularly
in a running text. Fractions whose numerator is 1 are called unit fractions, for example 1/3; 1/8;
1/12. Fractions whose numerator is smaller than the denominator are called proper; for example,
2/3, 1/7, 5/9, 10/11. Fractions in which the numerator is greater than or equal to the denominator
are called improper, for example 3/2, 16/3, 9/8, 5/5.


Fractions arise when one 
or several whole entities are 
divided up. 


The denominator 0 is always 
excluded. 


1.3-1 2 apples for 3 children
1.3-2 One third is the same as two sixths


If the numerator of one fraction is equal to the denominator of another and vice versa, the two 
fractions are called reciprocal; for example 3/5 and 5/3, 17/6 and 6/17. 

Numerator or denominator of a fraction may be negative, for example, —3/5, —2/—9, 7/—4. 
By the sign rules for the calculations with integers —3/5 = 3/—5 = — 3/5, —3/—5 = 3/5. Usually 
the signs are written before the fraction line or solidus rather than in numerator or denominator. 
With this convention the definitions above of proper and improper fractions also hold for negative 
fractions. In what follows positive fractions are predominantly used in examples; this serves
only to simplify the presentation, and with the appropriate modifications everything holds
equally well for negative fractions.

Equivalent fractions. If one divides an apple into three parts (Fig.) and takes one part, one has the 
same quantity as when one takes two parts of a division into six parts, 1/3 = 2/6. Similarly, for 
example, 2/5 = 4/10, 5/3 = 20/12, 2/3 = 4/6 = 6/9 = ... = 24/36 = ...

Extension. If two fractions are such that numerator and denominator of one fraction are equal 
multiples of numerator and denominator of the other fraction, for example, 8/9 = 40/45, then the 
second fraction is said to arise from the first by extension: 


To extend a fraction means to multiply numerator and denominator
by the same number c ≠ 0.


Cancellation. The inverse process is called cancellation of a
fraction.


To cancel a fraction means to divide numerator and denominator by the same number c ≠ 0.


Every fraction in which numerator and denominator have common factors can be cancelled. 
In 2/7 = (2 · 3)/(7 · 3) = 6/21 the transition from left to right is extension, from right to left is
cancellation. Since a cancellation diminishes numerator and denominator, it is usually advantageous 
to cancel fractions as far as possible. The fraction is then called reduced. 
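
Extension and cancellation can be illustrated by a short Python sketch. It assumes that a fraction is given as a pair of integers (numerator, denominator); the function names extend and reduce_fraction are hypothetical. Cancelling "as far as possible" is done here by dividing by the greatest common divisor, which is one common way to obtain the reduced form.

```python
from math import gcd


def extend(p: int, q: int, c: int) -> tuple:
    """Extend p/q by the number c, which must not be 0."""
    if c == 0:
        raise ValueError("extension by 0 is excluded")
    return p * c, q * c


def reduce_fraction(p: int, q: int) -> tuple:
    """Cancel p/q as far as possible by dividing by the greatest common divisor."""
    if q == 0:
        raise ValueError("the denominator 0 is excluded")
    g = gcd(p, q)
    return p // g, q // g


print(extend(2, 7, 3))          # (6, 21)
print(reduce_fraction(6, 21))   # (2, 7)
```
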


Rational numbers. All fractions that represent the same quantity, that is, can be carried into one 
another by extension or cancellation or both, for example, (3/4, 6/8, ..., 27/36, ...) are combined 
into a single number, a so-called rational number. The fractions 3/4, 6/8, 27/36 are merely different
expressions for one and the same rational number. It is customary to write this number in the reduced
form, in this case 3/4. Consequently, 3/4, like all reduced fractions, has a two-fold meaning: firstly 
it is a fraction, secondly it represents a rational number and stands for the totality of all fractions 
arising from extension, which are different expressions for one and the same number. In computa- 
tions every expression of a rational number can be replaced by any other expression for the same 
number according to circumstances. Fractions with the denominator 1 and those that arise by
extension from them, like 3/1 = 6/2 = ... = 18/6 = ... are subsumed among the rational numbers
and are equivalent to integers, for example, 15/3 = 5/1 = 5, 8/8 = 1/1 = 1. Here, too, one expression
can be replaced, if necessary, by another. The integer 0 is represented by all fractions having
the numerator 0.


Order of the rational numbers. Just as for natural numbers and integers, so for any two rational
numbers r1 and r2 one and only one of the relations r1 < r2, or r1 = r2, or r1 > r2 holds. For
positive a, b, c and a < b one always has a/c < b/c and c/a > c/b. For equal denominators the
fraction with the larger numerator represents the larger number, for example, 2/7 < 6/7. For equal
numerators the fraction with the smaller denominator represents the larger number, for example
5/9 < 5/6. If for two positive rational numbers one wishes to determine which one is the larger,
one finds representations for them with equal denominators and compares the numerators; from
7/12 = 35/60 and 11/20 = 33/60 it follows that 11/20 < 7/12. Two such expressions with equal
denominators can always be found for a/b and c/d; in any case the product b · d of the two denominators
is a common denominator. The appropriate numerators are then a · d and b · c.

Fig. Number line
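
The comparison over the common denominator b · d can be sketched as follows. This is a minimal Python illustration, assuming positive denominators; the function name compare is hypothetical.

```python
def compare(a: int, b: int, c: int, d: int) -> int:
    """Return -1, 0 or 1 according to a/b < c/d, a/b = c/d or a/b > c/d.

    Assumes b > 0 and d > 0, so both fractions can be written over b*d.
    """
    left, right = a * d, b * c   # numerators over the common denominator b*d
    return (left > right) - (left < right)


print(compare(11, 20, 7, 12))   # -1, that is 11/20 < 7/12
```
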




For the rational numbers the successor relation also fails to hold: no rational number has an immediate
predecessor or an immediate successor.


Between any two distinct rational numbers r1 and r2, with r1 < r2, say, there always lie further,
in fact infinitely many, rational numbers r satisfying r1 < r < r2.
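
One way to see this is to take arithmetic means repeatedly: the mean of r1 and r2 lies between them, the mean of r1 and that number lies between again, and so on. The following Python sketch uses the standard fractions module; the function name between is hypothetical, and the choice of the arithmetic mean is an assumption made for illustration, since the text does not name a construction.

```python
from fractions import Fraction


def between(r1: Fraction, r2: Fraction, count: int) -> list:
    """Return `count` distinct rationals strictly between r1 and r2 (r1 < r2)."""
    found = []
    upper = r2
    for _ in range(count):
        upper = (r1 + upper) / 2   # the mean lies strictly between r1 and upper
        found.append(upper)
    return found


print(between(Fraction(1, 3), Fraction(1, 2), 3))
# [Fraction(5, 12), Fraction(3, 8), Fraction(17, 48)]
```
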

On account of their order the rational numbers can be represented by points or arrows on the 
number line (Fig.). Here every point is given by one expression for the rational number. In illustrating 
all fractions the totality of expressions for a rational number would occupy the same place. On the 
number line the smaller rational number always stands to the left of the larger. 


Calculations with common or vulgar fractions 


Calculations with rational numbers are explained in the first instance by their expression as
common or vulgar fractions. For example, one talks of the addition of fractions with different denominators;
it would be more accurate, though more cumbrous, to talk of the addition of rational numbers
given by fractions with different denominators. Subsequent sections then treat the decimal fractions as
a further expression of rational numbers for which the computations are somewhat different. The
name common or vulgar fraction, which originally emphasized the distinction from the sexagesimal
fractions, nowadays emphasizes the distinction from the decimal fraction and has nothing to do with
vulgarity.


Addition and subtraction. The fractions concerned can have equal or unequal denominators. 


Fractions with equal denominators are added or subtracted by adding or subtracting their numerators; 
the denominator remains unchanged. 


Examples: 1. 3/7 + 5/7 = 8/7. 2. 4/11 − 7/11 = −3/11.
3. 5/17 + 9/17 − 18/17 + 13/17 − 2/17 = 7/17.


Consequently, every improper fraction can be split into two summands of which the first is an 
integer and the second a proper fraction: 8/7 = 7/7 + 1/7, 22/5 = 20/5 + 2/5. Improper fractions 
are therefore frequently written as mixed numbers, for example, 8/7 = 1 1/7; 22/5 = 4 2/5; an addition
sign between the integer and the proper fraction has to be imagined. 




Fractions with unequal denominators are added or
subtracted by first writing them in forms with equal
denominators and then adding or subtracting the numerators.


In the simplest case one denominator is a common multiple of all the others; for example, for
2/3 − 7/12 + 5/4 the number 12 is the least common denominator. Therefore one extends 2/3
and 5/4 by 4 and 3 respectively so that they have the denominator 12; 2/3 − 7/12 + 5/4 = 8/12 − 7/12 + 15/12
= 16/12 = 4/3 = 1 1/3. As a rule, one simplifies by writing the sum at once in the form of a fraction
whose numerator is the sum of the individual numerators: 2/3 − 7/12 + 5/4 = (8 − 7 + 15)/12
= 4/3 = 1 1/3.

For 1/6 + 3/10 − 11/15 a common multiple of the individual denominators has to be found
first. This can be done by guessing, say 60, or by taking the product. With the denominator 60 the
fractions then have to be extended in turn by 10, 6 and 4. In order to keep the numerators as small
as possible, one chooses for the least common denominator lcd the least common multiple lcm of the
individual denominators. The lcm is found in the usual way, for example by decomposition into prime
factors. The product representation of the lcd then also indicates the appropriate extension factor e.f. if the
prime factors of the relevant denominator are omitted; for example, if the lcd is 2 · 3 · 5, and the
last denominator is 15 = 3 · 5, then its extension factor is 2.


1/6 + 3/10 − 11/15 = (5 + 9 − 22)/30 = −8/30 = −4/15.


Example: 3 17/21 − 3/8 − 11/12 + 2 = 80/21 − 3/8 − 11/12 + 2

denominator    prime factors    e.f.
21             3 · 7            2^3 = 8
8              2^3              3 · 7 = 21
12             2^2 · 3          2 · 7 = 14
lcd            2^3 · 3 · 7 = 168


In determining the lcd by prime factorization of the individual denominators, integers can be
ignored, because they have the denominator 1. For example, having found the lcd 168 one obtains
80/21 − 3/8 − 11/12 + 2 = (640 − 63 − 154 + 336)/168 = 759/168 = 253/56 = 4 29/56.


For 3/5 + 1/4 − 2/9 the denominators have no common factors, they are coprime in pairs. The
lcd is therefore the product of the individual denominators; the result is 113/180.
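
The procedure with lcd and extension factors can be sketched in Python. This is an illustration only: fractions are assumed to be given as (numerator, denominator) pairs with positive denominators, the names lcm and add_fractions are hypothetical, and the lcm is computed from the greatest common divisor rather than by prime factorization.

```python
from functools import reduce
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def add_fractions(terms: list) -> tuple:
    """Add fractions given as (numerator, denominator) pairs.

    Returns (numerator, denominator, lcd), the first two in reduced form.
    """
    lcd = reduce(lcm, (q for _, q in terms))
    # lcd // q is the extension factor (e.f.) of the fraction with denominator q
    numerator = sum(p * (lcd // q) for p, q in terms)
    g = gcd(numerator, lcd)
    return numerator // g, lcd // g, lcd


print(add_fractions([(80, 21), (-3, 8), (-11, 12), (2, 1)]))   # (253, 56, 168)
print(add_fractions([(1, 6), (3, 10), (-11, 15)]))             # (-4, 15, 30)
```
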


Multiplication and division. The arithmetical operations of the second kind on common fractions
are easier to carry out than those of the first kind, because the determination of the lcd is not
necessary. One need not distinguish between operations on fractions with equal or unequal denominators.


The product of fractions is a fraction; its numerator is the product of the 
numerators, its denominator the product of the denominators. 


Integers are again interpreted as fractions with the denominator 1. In order to avoid large numbers, 
which can lead to errors, one cancels as far as possible before multiplying out. 


Examples:
4. 11/17 · 17/11 = (11 · 17)/(17 · 11) = (1 · 1)/(1 · 1) = 1.
5. If a tug pulling barges has the velocity of 4 1/2 m.p.h., then in 2 3/4 hours it travels 9/2 · 11/4 miles
= 99/8 miles = 12 3/8 miles.


Example 4 shows that the product of two reciprocal fractions is 1. This property can be utilized 
in a simple definition of reciprocity of rational numbers: 


Two rational numbers are reciprocal to each other if and only if their product is 1. 


Accordingly, for every rational number other than 0 there is a reciprocal, for example −3 and
−1/3 are reciprocal, because (−3) · (−1/3) = 1.




The division by a fraction is carried out as multiplication by its reciprocal. 


The validity of this method of division is shown by the following argument: 2/3 : 3/4 = 2/3 · 4/3 = 8/9;
since division is the inverse to multiplication, the product of quotient and divisor must give the
dividend, and in fact, 8/9 · 3/4 = 2/3, or generally (ad)/(bc) · c/d = a/b.
Examples:
2. 3/5 : 6 = (3 · 1)/(5 · 6) = 1/10.
3. 5 : 6/7 = (5 · 7)/(1 · 6) = 35/6 = 5 5/6.
4. 30/13 : 16/39 = (30 · 39)/(13 · 16) = (15 · 3)/(1 · 8) = 45/8 = 5 5/8.
5. 11/12 : 11/12 = (11 · 12)/(12 · 11) = 1.


6. A motor scooter covering 58 1/2 miles in two and a quarter hours has an average speed
of 117/2 miles : 9/4 h = 117/2 · 4/9 m.p.h. = 2 · 13 m.p.h. = 26 miles per hour.


Division of rational numbers has been reduced to multiplication; therefore: 


Among rational numbers division — except by zero — can always be carried out. 
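
The two rules (product of numerators over product of denominators, and division as multiplication by the reciprocal) can be sketched in Python. Fractions are assumed to be (numerator, denominator) pairs; the names normalize, multiply and divide are hypothetical. For brevity the sketch cancels after multiplying, whereas the text recommends cancelling beforehand when working by hand.

```python
from math import gcd


def normalize(p: int, q: int) -> tuple:
    """Reduce p/q and move any minus sign into the numerator."""
    if q == 0:
        raise ZeroDivisionError("the denominator 0 is excluded")
    g = gcd(p, q)
    if q < 0:
        g = -g
    return p // g, q // g


def multiply(x: tuple, y: tuple) -> tuple:
    """Product of numerators over product of denominators."""
    return normalize(x[0] * y[0], x[1] * y[1])


def divide(x: tuple, y: tuple) -> tuple:
    """Division by y is multiplication by the reciprocal of y."""
    if y[0] == 0:
        raise ZeroDivisionError("division by zero is excluded")
    return multiply(x, (y[1], y[0]))


print(divide((2, 3), (3, 4)))      # (8, 9)
print(divide((117, 2), (9, 4)))    # (26, 1)
```
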


Double fractions. The division sign and the fraction line or solidus are generally interchangeable, 
for example, 2 : 3 = 2/1 : 3/1 = 2/1 · 1/3 = 2/3. Consequently every division of common fractions
can be represented as a double fraction whose numerator and denominator are not integers but 
fractions. Conversely, every double fraction can be simplified by division. But an expression is 
uniquely determined only when the principal fraction line is clearly indicated, for example, by its 
length or by the position of a following equality sign. 


| RE Se ee i 
Examples: Lage 333 . ae a we at Mae oe eke 
= Pe ee os a eee 
3 2 
3 15 7 3 RM ee 2 8 16 I 
—e SS a —= « So —— | — = _ aoe = | 
<4 eee A Ae ee ee [5 
eH 8 


The commutative and associative law of addition and of multiplication as well as the distributive 
law, which were previously stated for integers, also hold for computations with rational numbers. 


Decimal fractions 


Foundations. In a positional system the digits within a number symbol have a positional value 
apart from their numerical value; for example, in 3752 the 5 indicates by its position 5 Tens (T). 
In a decimal positional system every positional value is 1/10 of that to its left. As long as only natural 
numbers or integers are represented in this system, the units (U) must occupy the last position.

The system can be continued beyond the units to represent the rational numbers. After the units 
one has from left to right the positional values one tenth t, one hundredth h, one thousandth th etc. 
If previously the number of positions was bounded to the right by the units, it is now unbounded 
in both directions. Here a particular position must be distinguished, as point of reference, as it were. 
For this purpose a stop is placed between the units and the tenths. 

This agrees with the treatment of decimal measures: 7.5cm means 7 cm and 5 mm because a 
millimetre is a tenth of a centimetre and 3.75 m means 3 m and 75 cm because 1 m and 100 cm are 
the same length. 

The places after the decimal point are read individually (for 2.31 read two-point-three-one) because 
otherwise ambiguities could occur; for example: which number is larger: three-point-eleven or 
three-point-nine? — The places after the point are called the decimals. The first decimal therefore 
represents tenths, the second hundredths etc. The number 4.81 has three places but only two decimals. 
Every number other than an integer, written down in the decimal system, for example, 0.375 or 
17.8, is called a decimal fraction. This has to be distinguished from a vulgar fraction whose denomina- 
tor is a power of 10, that is, 10, 100, 1000 etc., for example, 3/10, 17/100 000. 




Transformations. From vulgar fractions to decimal fractions. Every fraction whose denominator 
is a power of 10 can immediately be written as a decimal fraction by putting its numerator into 
the position in the decimal system indicated by the denominator. 


Examples: 1. 3/10 = 0.3. 2. 23/100 = (20 + 3)/100 = 2/10 + 3/100 = 0.23.
3. 70 105/1000 = 70.105.


Since 10 = 2 · 5, all powers of 10 contain only the prime factors 2 and 5. Therefore all fractions
whose denominator contains no other prime factors can be extended so that the denominator is
a power of 10 and can then be written as a decimal fraction: 7/20 = 35/100 = 0.35; 1 3/8 = 1375/1000
= 1.375. Such a transformation is impossible if the denominator of the common fraction contains
in the reduced form prime factors other than 2 and 5. Such vulgar fractions cannot be written as 
decimal fractions in the previous form. Here the following arguments are helpful. Within the domain 
of natural numbers the division 2 : 7 cannot be performed. In the domain of rational numbers there 
are two possibilities: 


a) 2 : 7 = 2/1 : 7/1 = 2/1 · 1/7 = 2/7;
b) 2 : 7 = 0.285 714 28 ...
   20
    60
     40
      50
       10
        30
         20

Mentally:
2 divided by 7, quotient 0, remainder 2 = 20/10
20/10 divided by 7, quotient 2/10, remainder 6/10 = 60/100
60/100 divided by 7, quotient 8/100, remainder 4/100 = 40/1000
etc.

The rearrangement of the remainder to the next power of 10 corresponds to advancing by one 
decimal. The method goes like written division of natural numbers; on transition from units to 
tenths in the dividend the same has to be done in the quotient, that is, a point has to be placed. 

The two results of the division 2: 7 are equated, 2/7 = 0.285 714 28 ... On division by 7 the only 
possible remainders are 1, 2, 3, 4, 5, 6. The remainder 0 is excluded because none of these numbers 
multiplied by 10 is divisible by 7. This means that the decimal fraction is infinite, that the sequence 
of its digits never breaks off. These digits must repeat as soon as a remainder occurs for the second 
time. The decimal fraction is periodic, and on division by 7 the period can have at most 6 digits. 


If the denominator q of a reduced fraction p/q contains prime factors other than 2 and 5, then the appropriate
decimal fraction is periodic, and its period has at most q − 1 digits.
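
The written division and the remainder argument can be followed in a short Python sketch: the digits are produced one at a time, and the period is detected as soon as a remainder occurs for the second time. The function name decimal_expansion and the return format are hypothetical choices made for this illustration; non-negative p and positive q are assumed.

```python
def decimal_expansion(p: int, q: int) -> tuple:
    """Return (integer part, digits before the period, period) of p/q.

    The two digit groups are strings; the period is "" for a finite decimal.
    """
    integer_part, remainder = divmod(p, q)
    digits = []
    seen = {}   # remainder -> position of the digit it produced
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, q)   # advance by one decimal
        digits.append(str(digit))
    if remainder == 0:
        return integer_part, "".join(digits), ""
    start = seen[remainder]   # the period begins where this remainder first occurred
    return integer_part, "".join(digits[:start]), "".join(digits[start:])


print(decimal_expansion(2, 7))     # (0, '', '285714')
print(decimal_expansion(17, 12))   # (1, '41', '6')
```
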


Periodicity is indicated by writing down the period once only and placing a bar on top:
1/3 = 0.33... = 0.3̅; 34/99 = 0.3̅4̅; 17/12 = 1.416̅; 11/26 = 0.42̅3̅0̅7̅6̅9̅


(read nought-point-three-four-period-three-four or one-point-four-one-six-period-six).

In the first two examples the decimal fractions are purely periodic: the period begins immediately 
after the decimal point. The last two examples have digits between the decimal point and the begin- 
ning of the period; such decimal fractions are called mixed; they arise always if the denominator 
contains among others the factors 2 or 5. 


From a decimal fraction to a common fraction. The transformation of a finite decimal fraction 
into a common fraction follows from its definition, for example, 0.17 = 1/10 + 7/100 
= (10 + 7)/100 = 17/100; 6.05 = 605/100 = 121/20. One places the digits of the decimal fraction, 
omitting the point and its “‘initial zeros” as the numerator and as the denominator the power of 
10 that corresponds to the number of decimals. Also, every periodic decimal fraction can be transformed
into a common fraction. For a purely periodic decimal fraction one places the digits of the
period into the numerator, and for the denominator one takes the power of 10 corresponding to the
length of the period, diminished by one; for example 0.3̅ = 3/9 = 1/3; 0.2̅7̅ = 27/99 = 3/11;
0.2̅5̅3̅ = 253/999.


Examples: 1. p/q = 0.3̅6̅9̅                2. p/q = 0.35̅8̅
             1000 p/q = 369.3̅6̅9̅            100 p/q = 35.85̅8̅
              999 p/q = 369                   99 p/q = 35.5 = 355/10
                  p/q = 369/999                  p/q = 355/990
                  p/q = 41/111                   p/q = 71/198
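
The transformation can also be sketched in Python with the standard fractions module. The decimal is assumed to be given in three parts (integer part, digits before the period, digits of the period); the function name periodic_to_fraction and this input format are hypothetical. A period of n digits standing after k non-periodic decimals contributes period/((10^n − 1) · 10^k).

```python
from fractions import Fraction


def periodic_to_fraction(integer_part: int, prefix: str, period: str) -> Fraction:
    """Convert integer_part.prefix followed by the repeating period into a fraction."""
    k, n = len(prefix), len(period)
    value = Fraction(integer_part)
    if prefix:
        value += Fraction(int(prefix), 10 ** k)
    if period:
        value += Fraction(int(period), (10 ** n - 1) * 10 ** k)
    return value


print(periodic_to_fraction(0, "", "369"))   # 41/111
print(periodic_to_fraction(0, "3", "58"))   # 71/198
```
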




This method of transformation is based on the fact, so far unproved, that calculations with
infinite periodic decimal fractions can be performed just as with finite ones. The application of the
method, which is illustrated in the examples above, also assumes some knowledge of the working
with equations. The application of the rule stated leads to the same result if one splits and transforms:

0.35̅8̅ = 0.3 + 0.05̅8̅ = 3/10 + 58/990 = (297 + 58)/990 = 355/990 = 71/198.

Every common fraction can be written as a finite or a periodic decimal fraction. Every finite and 
every periodic decimal fraction can be transformed into a common fraction. Common fractions on 
the one hand, and finite or periodic decimal fractions on the other hand, are two distinct ways of 
writing the same kind of number: the rational numbers. 


In order to avoid the distinction of the two possibilities (finite or periodic decimal fraction) one
can argue as follows: by the method indicated it can be shown that 0.9̅ = 1. Therefore every finite
decimal fraction and accordingly every integer can be transformed into a periodic decimal by
diminishing the last non-zero digit by 1 and attaching the period 9:


0.84 = 0.839̅; 3.156 = 3.1559̅; 17 = 16.9̅.


Computations with decimal fractions 


Only finite decimals are treated here; periodic decimal fractions have to be rounded off suitably 
before the computation; or one has to calculate with common fractions. 


Addition and subtraction. In written addition and subtraction of decimal fractions one proceeds 
just as for natural numbers or integers: equal places are written one under the other, decimal point 
under decimal point, one proceeds column by column from right to left, taking account of the 
appropriate transfer, and on transition from tenths to units the decimal point is placed in the result. 
The simplicity of the arithmetical operations of the first kind is an essential advantage of the decimal 
fractions compared with common fractions. 


Examples: 1.  713.25       2.  38.023         Example: 0.175 · 3.5
            +   1.085          −  9.13                    525
            +  22.9            −  0.0258                   875
              737.235            28.8672                0.6125


Multiplication. Every finite decimal fraction can be transformed to a common fraction whose 
denominator is a power of 10. The multiplication of such fractions can be carried out in the usual 
way, for example, 0.175 - 3.5 = 175/1000 - 35/10 = (175 - 35)/(1000 - 10) = 6125/10000, without 
any cancellation. The denominator of the result is again a power of 10; the numerator is the product 
of the individual numerators, and when the result is changed into a decimal fraction, there are as 
many decimals as in the factors taken together. The calculation on the right gives the same result. 


Two decimal fractions are multiplied by multiplying irrespective of the decimal point as for natural 
numbers and attaching to the result as many decimals as the factors have, taken together. 
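
The boxed rule can be sketched in Python on decimal numerals given as strings. This is only an illustration; the function name multiply_decimals is hypothetical, and non-negative numerals without signs are assumed.

```python
def multiply_decimals(x: str, y: str) -> str:
    """Multiply two non-negative decimal numerals given as strings."""

    def split(numeral: str) -> tuple:
        whole, _, frac = numeral.partition(".")
        return int(whole + frac), len(frac)   # digits without the point, number of decimals

    a, decimals_a = split(x)
    b, decimals_b = split(y)
    product = str(a * b)                      # multiply as for natural numbers
    decimals = decimals_a + decimals_b        # decimals of the factors taken together
    if decimals == 0:
        return product
    product = product.rjust(decimals + 1, "0")   # pad so that a units digit exists
    return product[:-decimals] + "." + product[-decimals:]


print(multiply_decimals("0.175", "3.5"))   # 0.6125
```
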


Multiplication by powers of ten is performed simply by shifting the point by as many places
to the right as the power has zeros: 7.136 · 100 = 713.6.


Division. A quotient remains unchanged when dividend and divisor are multiplied by the same 
number (see Extension of fractions), for example, 12:4 = 48:16 = 120: 40 = 1.2:0.4=6:2=3. 
In order to imitate the division of natural numbers, one rearranges dividend and divisor, making 
use of this fact, so that the divisor becomes an integer, preferably by multiplication by a power of 10, 
for example, 33: 6.5 = 330:65; 6.729: 13.58 = 672.9: 1358. 

Now the division can be performed just as with natural numbers, and on transition from units to 
tenths in the dividend the same transition is made in the quotient by placing a decimal point. 


2. 714.5 : 100 = 7.145
3. 1.92 : 1000 = 0.00192

If a number a is to be multiplied (divided) by a power b of
10, say b = 10^k, then the decimal
point of a is shifted to the right
(left) by as many places as b has
zeros, namely k.




Abbreviated methods of calculation. Multiplication and division of decimal fractions, in general, 
yield results having more places than the initial numbers. If the initial numbers are not absolutely 
exact, but are approximations, rounded off or carrying measuring errors, then these places are in- 
admissible or, more precisely, meaningless, because they give a false impression of a non-existing 
calculating or measuring accuracy (see Chapter 28.). 


In calculations of the first kind with approximate numbers the result must not show more reliable 
decimal places than the smallest number of decimal places among the original numbers. On multi- 
plication and division the result has only as many valid digits (not decimals!) as the original number 
with the smallest number of valid digits. 


Valid digits of a number are all its digits except the zeros before the first non-zero digit: for 
example 307.6 as well as 0.0002643 have four valid digits. To save the calculation of places going 
beyond the reliable number of places one uses abbreviated methods in which the result straightaway 
has only the required or the admissible number of places. 
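
The rule for multiplication with approximate numbers can be sketched in Python using the standard decimal module. The function names valid_digits and rounded_product are hypothetical, and the sketch computes the full product first and rounds afterwards, so it illustrates the admissible result rather than the abbreviated method itself.

```python
from decimal import Decimal, ROUND_HALF_UP


def valid_digits(numeral: str) -> int:
    """Count all digits except the zeros before the first non-zero digit."""
    digits = numeral.lstrip("+-").replace(".", "")
    return len(digits.lstrip("0"))


def rounded_product(x: str, y: str) -> Decimal:
    """Product rounded to the smaller number of valid digits of the two factors."""
    k = min(valid_digits(x), valid_digits(y))
    exact = Decimal(x) * Decimal(y)
    # exact.adjusted() is the power of ten of the leading digit of the product
    unit = Decimal(1).scaleb(exact.adjusted() - k + 1)
    return exact.quantize(unit, rounding=ROUND_HALF_UP)


print(valid_digits("307.6"), valid_digits("0.0002643"))   # 4 4
print(rounded_product("27.8673", "49.23"))                # 1372
```
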


Abbreviated addition and subtraction. If the summands have the same number of reliable decimals,
then addition and subtraction proceed in the usual way. If the numbers of decimals are unequal, then
one proceeds as follows: let k be the smallest occurring number of decimals, then all values given
with greater accuracy are rounded off to k + 1 decimals and are then added or subtracted, where
the last place is only taken into account at the transfer, so that the final result has k decimals. If
for the sum or difference initially an accuracy up to the kth decimal is postulated, then one chooses
the summands also, as far as possible, accurate up to the (k + 1)th decimal.


Example:                        Example: 27.8673 · 49.23      278.67 · 4.923
   2.7362       2.736                    1114692               111468
 + 0.8749       0.875                     2508057               25080
 +17.53        17.53                       557346                 557
 + 8.665        8.665                       836019                 83
               29.81                     1371.907179           1371.9 ≈ 1372


Abbreviated multiplication. What matters here is not the number of decimal places but of valid
digits. If one factor has k valid digits and the other more, the latter is rounded off to k + 1 digits
(1 extra place). It is advantageous to write the factor with k + 1 digits as multiplicator and, particularly
for large k, to see to it by suitable transformations that the multiplicand only has units before
the decimal point, for example, 27.8673 · 49.23 = 278.673 · 4.923. The subsequent procedure is
best illustrated by placing next to one another the normal and the abbreviated method of
multiplication.

While in the first multiplication the entire multiplicator occurs, in the subsequent partial products
one place after another is omitted. To avoid errors one marks each time the digit that is only used for
carrying to the next higher place, by setting a dot over it. For example, for the third partial product
the digit 6 is marked, and one calculates: 2 · 6 = 12 (carry 1), 2 · 8 + 1 = 17, 2 · 7 + 1 = 15,
2 · 2 + 1 = 5. In the final addition of the partial products the last place is only taken into account
for carrying. If then the final result, as in the example above, still shows one place beyond the maximal
number of valid digits, a further rounding off has to be done. If in the calculation of a product
accuracy of k places is postulated, then one performs the calculation, as far as possible, with factors
having each k + 1 valid digits, and then rounds off.


Abbreviated division. The number of valid digits to be shown in the quotient is also equal to the
smallest number of valid places in the original numbers (dividend or divisor), so that one can round
off from the outset leaving one extra place. If the quotient is required to k places, one chooses, if
possible, k + 1 places in dividend and divisor; in the kth place of the quotient rounding off has to
be observed. To illustrate the method the abbreviated division is again placed next to the normal
division. The quotient 674.283 divided by 439.17 has to be calculated to three places.

Example:
        1.535                    1.54
43917 ) 67428.3           4392 ) 6743
        43917                    4392
        235113                   2351
        219585                   2196
         155280                   155
         131751                   176
          235290                  −21

Instead of adding zero each time to the remainder, in the abbreviated division the divisor is
shortened each time by one place. However, this place has to be taken into account for carrying
in the formation of the intermediate product: 2351 : 439 = 5; 5 · 2 = 10 (carry 1);
5 · 9 + 1 = 46; 5 · 3 + 4 = 19; 5 · 4 + 1 = 21. At the next step: 155 : 44 = 4, because
44 · 4 = 176 is nearer to 155 than 44 · 3 = 132.




Historical remarks. The theory of the common or vulgar fractions and the calculations with
them, as they are conducted nowadays, is the achievement of the Hindus (BRAHMAGUPTA).
From there fractions came to us by way of the Arabs and the Italian merchants. However, the
arithmetic book of AHMES (Papyrus Rhind, about 1700 B. C.) already exhibits a remarkably well-developed
calculation with fractions. Apart from 2/3, only unit fractions are used, and all other fractions are
transformed to them, for example, 5/6 = 1/2 + 1/3. The transformations themselves are made less 
on the basis of definite rules than by compiled tables; therefore calculations with fractions were 
comparatively tedious. The Babylonians used sexagesimal fractions, derived from the division of 
time and angles. In a certain sense these fractions are predecessors of the decimal fractions, because 
they are built on the positional system with the basis 60, which however was not fully developed. 
Owing to the fact that no denominators were written down, the calculations became comparatively 
simple. The Greeks did not develop a system of fractions. That of the Romans is meagre; strictly
speaking, it only knows fractions with a denominator 12, derived from the measure of weight 1 as
= 12 ounces; other fractions were approximated by fractions with a denominator 12. In Germany
vulgar fractions did not come into common use until the Middle Ages; but it was about 1700 when
calculations with fractions were introduced into the school syllabus. Even then at first only the most
necessary parts were offered, as a rule without foundation and in the form of mnemonics. Decimal
fractions appeared comparatively late. The founder of the theory of decimal fractions was the 
merchant and engineer Simon STEVIN (1548-1620). In his book, which marked the breakthrough 
of decimal fractions appropriate to the decimal positional system, he postulated among other things 
the introduction of decimal monetary systems and weights and measures in all countries. But 
STEVIN had precursors; among them above all Johannes REGIOMONTANUS (1436-1476), VIETA 
(1540-1603) and Christoff RUDOLFF (born around 1500). 


1.4. Proportionality and proportions 


Direct proportionality. The heavier the suspended body, the greater is the extension of a helical 
spring (Fig.). In a certain spring a load of x units of weight caused an extension of y units of length: 


x 50 100 125 175 240 300 


y 10 20 25 35 48 60 


From the numbers x for the load the numbers y for the
appropriate extensions arise each time by multiplication by 
0.2; that is, y = 0.2x or y/x = 0.2. 

In general, two quantities x and y are said to be directly 
proportional if 1. to every value of one quantity there corre- 
sponds exactly one value of the second quantity and if 2. 
from every measure of x the appropriate measure of y arises 
by multiplication by one and the same real number c. 


If this connection is represented in a rectangular coordi- 
nate system, the points (x, y) lie on a line through the origin. 
The number c is called proportionality factor. It character- 
izes the prevalent practical situation. In the example the 
spring constant c = 0.2 is characteristic for the spring used. 
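
The defining condition can be checked mechanically. The following Python sketch is an illustration only: the function name `proportionality_factor` is an assumption of this sketch and does not occur in the text. Exact fractions are used so that the quotients y/x of the spring table can be compared without rounding.

```python
from fractions import Fraction

def proportionality_factor(xs, ys):
    """Return c if y = c*x holds for all pairs, otherwise None (all x must be nonzero)."""
    quotients = {Fraction(y) / Fraction(x) for x, y in zip(xs, ys)}
    return quotients.pop() if len(quotients) == 1 else None

loads = [50, 100, 125, 175, 240, 300]      # x, units of weight
extensions = [10, 20, 25, 35, 48, 60]      # y, units of length

c = proportionality_factor(loads, extensions)
print(c)   # the common quotient y/x as an exact fraction
```

If the quotients differ, the function returns `None`, which signals that the two quantities are not directly proportional.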


Indirect or inverse proportionality. If in a transmission (Fig. 1.4-2) one of the two wheels has a 
diameter of 20 ins. and rotates once, the rotation number y of the other wheel is the larger, the 
smaller its diameter x in inches: 

x    4    5    10    15     20    30
y    5    4     2    4/3     1    2/3

Fig. 1.4-1 Spring balance 
Fig. 1.4-2 Transmission 

For corresponding values of x and y the relation y · x = 20 or y = 20/x always holds. The same 
relationship holds for the force between the two wheels transmitted by friction or by cogged teeth. 
In general, two quantities x and y are said to be inversely proportional if 1. to every value of one 
quantity there corresponds exactly one value of the second and if 2. from every measure of x the 
corresponding measure of y is obtained when one and the same real number c is divided by the 
measure of x. 




If this relationship is represented in a rectangular coordinate system, the points lie on an 
equilateral hyperbola. Here the number c, on account of y = c/x = c · 1/x, is also called 
proportionality factor. 
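
For inverse proportionality the corresponding test is that the product x · y is constant. The sketch below applies it to the transmission table; it is only an illustration, and the variable names are chosen here for the example.

```python
from fractions import Fraction

diameters = [4, 5, 10, 15, 20, 30]                          # x in inches
rotations = [5, 4, 2, Fraction(4, 3), 1, Fraction(2, 3)]    # y, rotations of the second wheel

# In an inverse proportionality all products x*y coincide with the factor c.
products = {Fraction(x) * Fraction(y) for x, y in zip(diameters, rotations)}
print(products)   # a set with a single element means that x*y is constant
```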


Ratio. A train travels in one hour 80 miles, a plane 400 miles, that is, 320 miles more. The com- 
parison becomes clearer if one says that the plane covers five times the distance of the train. In 
this form it is independent of the time interval. One obtains the number 5 as quotient 
400 : 80 = 5: 1 = 5, and one says: the distances covered in equal times are in the ratio 5: 1. 


The ratio of two quantities of the same kind is the quotient of their measures. 


For numbers instead of quantities the ratio is defined accordingly; in both cases the ratio is a 
number. 

One can also form the ratio of quantities of different kinds. If a man walking takes 4 hours to 
cover 11 miles one forms the ratio 11 m : 4 h = 11/4 m.p.h. (read eleven over four miles per hour). 
In this case the formation of the ratio leads to the new concept of velocity with the measuring unit 
m.p.h. or $\mathrm{m\,h^{-1}}$. 

In direct proportionality the associated values always have the same ratio, in an inverse propor- 
tionality they have the same product. 

Since a quotient does not change when dividend and divisor are multiplied or divided by the same 
number c ≠ 0, one and the same ratio can be given in various ways, for example, 5 : 1 = 10 : 2 
= 30/6 = 1 : 0.2 = 650 : 130. One also speaks of extending or cancelling a ratio. As a rule, one 
chooses the expression with the smallest natural numbers, for example 5 : 1. 
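
Cancelling a ratio to its smallest natural numbers amounts to reducing a fraction. A minimal Python sketch, assuming positive rational terms (the helper name `reduce_ratio` is hypothetical):

```python
from fractions import Fraction

def reduce_ratio(a, b):
    """Express a : b with the smallest natural numbers (a, b positive rationals)."""
    q = Fraction(a) / Fraction(b)      # Fraction reduces to lowest terms automatically
    return q.numerator, q.denominator

# The decimal 0.2 is passed as a string so that it is read exactly as 1/5.
for a, b in [(10, 2), (30, 6), (1, Fraction("0.2")), (650, 130)]:
    print(a, ":", b, "=", "%d : %d" % reduce_ratio(a, b))
```

All four ratios from the text reduce to the same pair of numbers, which is what is meant by saying that they express one and the same ratio.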


Equality of ratios or proportion. An equality of two ratios is called proportion, for example, 
2:3 = 1:1.5 or 4/5 = 8/10; this is read: four is to five as eight is to ten, or briefly, four to five as 
eight to ten. A proportion is true or valid if the same ratio, but differently expressed, does, in fact, 
stand on the two sides; 4: 5 = 5: 4 is a false proportion. 

If a valid proportion has the same inner terms, this quantity or number is called the middle 
proportional of the outer terms; for example, since 12 : 6 = 6 : 3 the number 6 is the middle 
proportional to 12 and 3. By the product equation (see Theorems on proportions) the geometric mean 
$m_g = \sqrt{a \cdot b}$ of two positive numbers a and b is their middle proportional. 


In a proportion a : b = c : d the terms a and d are called the outer terms, b and c the inner terms. 


If the hind terms of a proportion are equal to the fore terms of another, one frequently writes a 
continuous proportion, for example, for 2 : 5 = 4 : 10 and 5 : 8 = 10 : 16 one writes 
2 : 5 : 8 = 4 : 10 : 16. However, this is merely a symbolic notation, for if one were to interpret 
the two sides as quotients, one would obtain the wrong statement 1/20 = 1/40. In general, 
a : b : c = d : e : f is an abbreviation for the three proportions a : b = d : e, b : c = e : f and 
a : c = d : f, of which each is a consequence of the other two. By interchanging the inner terms of 
the first two, say, one obtains a : d = b : e and b : e = c : f, consequently a : d = c : f or 
a : c = d : f, that is, the third proportion. 

By multiplying the first two proportions by b/d or c/e one obtains the chain of equations 
a/d = b/e = c/f from which, conversely, the proportions may be obtained: for example, the sine 
theorem of plane trigonometry can be put in the form $a/\sin\alpha = b/\sin\beta = c/\sin\gamma$ or 
$a : b : c = \sin\alpha : \sin\beta : \sin\gamma$. 


Theorems on proportions. Every proportion may be transformed like an equation, for example, 
by interchanging the two sides. But there are also special rules leading from one valid proportion 
a : b = c : d to another valid statement. 

If one multiplies a/b = c/d on both sides by bd, one obtains the product equation a · d = b · c. 


Product equation. In every valid proportion the product of the inner terms is equal to the product 
of the outer terms. 


Conversely, from the equality a · d = b · c (≠ 0) one can obtain proportions. Division by b · d 
yields a : b = c : d, division by a · b yields d : b = c : a etc. This leads to the exchange theorems. 


Exchange theorems. In every valid proportion the interchange of the two outer terms, the two 
inner terms or the inner terms with the outer terms leads to another valid proportion. 




If in a : b = c : d or in b : a = d : c one adds or subtracts 1 on both sides, then by corresponding 
addition one obtains (a + b) : b = (c + d) : d and (a + b) : a = (c + d) : c, and by corresponding 
subtraction (a − b) : b = (c − d) : d. 

Division of corresponding proportions for a ≠ b, hence c ≠ d, leads to (a + b) : (a − b) 
= (c + d) : (c − d). These formulae are only the most important particular cases of the general 
law of corresponding addition and subtraction. 


The validity of this statement can be seen by substituting in it the values a = bk and c = dk 
obtained from a : b = c : d = k and then cancelling by b or d. 
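
The rules of corresponding addition and subtraction can be tried out on a concrete proportion. The following Python sketch uses the sample values 10 : 4 = 15 : 6, which are chosen here only for illustration; a numerical check of this kind does not replace the argument with a = bk and c = dk.

```python
from fractions import Fraction

def ratio(p, q):
    """The ratio p : q as an exact fraction."""
    return Fraction(p, q)

a, b, c, d = 10, 4, 15, 6          # sample values with a : b = c : d = 5 : 2
assert ratio(a, b) == ratio(c, d)

checks = {
    "(a+b):b = (c+d):d": ratio(a + b, b) == ratio(c + d, d),
    "(a+b):a = (c+d):c": ratio(a + b, a) == ratio(c + d, c),
    "(a-b):b = (c-d):d": ratio(a - b, b) == ratio(c - d, d),
    "(a+b):(a-b) = (c+d):(c-d)": ratio(a + b, a - b) == ratio(c + d, c - d),
}
for name, holds in checks.items():
    print(name, holds)
```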


Proportionality and proportions. If direct proportionality holds between two quantities x and y, 
then $y_1/x_1 = y_2/x_2 = \cdots = y_n/x_n = c$ for any n associated values $x_i$, $y_i$. Repeated 
interchange of the inner terms leads to $y_1 : y_2 : y_3 : \cdots : y_n = x_1 : x_2 : x_3 : \cdots : x_n$. 
This means that in a direct proportionality the associated values $y_i$ and $x_i$ always have the 
same ratio and that any two values $x_i$ and $x_k$ have the same ratio as the associated values 
$y_i$ and $y_k$. 

In an indirect proportionality between the quantities x and y associated values $x_i$ and $y_i$ 
satisfy $x_1 y_1 = x_2 y_2 = \cdots = x_n y_n = c$. From all these product equations one can derive 
proportions such as $y_1 : y_2 = x_2 : x_1$ and from them finally 
$y_1 : y_2 : y_3 : \cdots : y_n = 1/x_1 : 1/x_2 : 1/x_3 : \cdots : 1/x_n$. This means that in an 
indirect proportionality any two values $x_i$ and $x_k$ have the inverse ratio of the associated 
values $y_i$ and $y_k$. 


Solutions of proportions. The proportions 50: 140 = 10:x and x:2 = 50: 80 contain one 
variable each. To solve them means to find numbers giving equal ratios on substitution for x. One 
also talks of the task of determining the fourth proportional. In the first case one sees immediately 
that x = 28, in the second x = 5/4. The correctness can be checked by means of the product 
equation. In difficult cases one uses the product equation to find a solution. 


Example 1: $(8x - 7) : (4x - 1) = (6x - 5) : 3x$ 

By means of the product equation one obtains 
\[
\begin{aligned}
3x(8x - 7) &= (4x - 1)(6x - 5) \\
24x^2 - 21x &= 24x^2 - 26x + 5 \\
5x &= 5 \\
x &= 1
\end{aligned}
\]

Check: left-hand side $(8 \cdot 1 - 7) : (4 \cdot 1 - 1) = 1 : 3$; right-hand side 
$(6 \cdot 1 - 5) : 3 = 1 : 3$; comparison $1 : 3 = 1 : 3$. 


Example 2: Of two bodies of equal volume the first has the density $\varrho_1 = 7.3$ oz./cub. in., 
the second $\varrho_2 = 2.7$ oz./cub. in. What is the mass of the second body if that of the first 
is 4.8 lbs.? 


4.8 : x = 7.3 : 2.7 
x ≈ 1.775. The mass of the second body is 1.775 lbs. 


Example 3: A wire of length $l_1 = 400$ m and diameter $d_1 = 4$ mm has the mass $m_1 = 36.7$ kg. 
How many meters $x = l_2$ of wire of the same material, but of diameter $d_2 = 6$ mm, have the mass 
$m_2 = 90$ kg? Since the wires consist of the same material, their masses are in the ratio of their 
volumes. Hence 
\[
\begin{aligned}
m_1 : m_2 &= (d_1^2/4)\,\pi l_1 : (d_2^2/4)\,\pi l_2, \\
m_1 : m_2 &= d_1^2 l_1 : d_2^2 l_2.
\end{aligned}
\]
By the product equation the solution is 
\[
x = l_2 = \frac{m_2 d_1^2 l_1}{m_1 d_2^2},
\]
in which corresponding quantities ($l_1$ and $l_2$; $d_1$ and $d_2$; $m_1$ and $m_2$) must be 
measured in the same units. In this numerical example one calculates the length of 436 m. 


Example 4: A service station has enough fuel for 24 days if the daily sale is 1000 gallons. For how 
long does the fuel last if the daily sale is 1200 gallons? 

24 : x = 1200 : 1000 
x = 24 · 1000/1200 
x = 20. 

The fuel lasts for 20 days. 
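
Finding the fourth proportional by the product equation is a one-line computation. The sketch below is only an illustration: the function name `fourth_proportional` is an assumption, and each proportion is first brought into the form a : b = c : x by the exchange theorems.

```python
from fractions import Fraction

def fourth_proportional(a, b, c):
    """Solve a : b = c : x for x by means of the product equation a*x = b*c."""
    return Fraction(b) * Fraction(c) / Fraction(a)

print(fourth_proportional(50, 140, 10))       # 50 : 140 = 10 : x
print(fourth_proportional(80, 50, 2))         # x : 2 = 50 : 80, rewritten as 80 : 50 = 2 : x
print(fourth_proportional(1200, 1000, 24))    # Example 4, rewritten as 1200 : 1000 = 24 : x

# Example 2 (4.8 : x = 7.3 : 2.7), decimals given as strings to keep them exact
x = fourth_proportional(Fraction("7.3"), Fraction("2.7"), Fraction("4.8"))
print(round(float(x), 3))
```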


Historical remarks. The theory of proportions occupied a central position in ancient mathematics, 
because the most diverse problems lead to proportions. 

In Greek mathematics the fourth proportional was constructed geometrically, by the method 
of geometrical algebra. But the computational treatment of proportions and the calculus of the 
‘rule of three’ were not developed in Europe until the 15th-17th century, in particular, in 
connection with commercial calculations. Such problems form one of the main constituents of the 
widely used arithmetic books and the principal teaching topic of the arithmeticians and ‘cossists’. 
The best known among them is Adam Ries (1492-1559). 

Proportions also played an important role in the visual arts of the Renaissance. To be esteemed 
as beautiful, buildings and representations of human beings (in paintings and sculptures) had to 
be built according to a specific ‘canon’, that is, parts of the total had to stand in definite ratios. For 
example, head to body length = 1:8; head: face = 5:4; trunk: thigh = thigh: leg; height of 
a building: width = 3:7 etc. An important role was also played by the Golden Section (see 
Chapter 7.). Even today the word ‘well-proportioned’ is used in the sense of ‘satisfying the 
aesthetic sense’. During the Renaissance, above all Leonardo DA VINCI (1452-1519) and Albrecht 
DÜRER (1471-1528) occupied themselves with this branch of the visual arts (see Table 18). 


1.5. Working with numerical variables 


The fact known as the distributive law for rational numbers is expressed briefly in the form 
a · (b + c) = a · b + a · c. Here a, b and c stand for arbitrary rational numbers, they are variables 
for rational numbers. In general, variables, which are usually represented by letters, represent an 
empty space into which an arbitrary element (or its symbol) from a fixed set can be substituted. 
The letters a, b and c above are numerical variables (occasionally called general numerical symbols) 
for which the set of rational numbers is their special domain of variability. 

Variables are useful in two ways: they make it easy to state laws, and the solution of a problem 
expressed in terms of variables yields the result for arbitrarily many individual cases without new 
calculations, by mere substitution. 

An expression is a combination of numerical symbols, numerical variables, and (meaningful) 
juxtapositions of such symbols with operational symbols and brackets, for example, 1/4; 12 − 5; 
3 · a; 5 · (17 + 60); (2z − 13) : (5z + 10). However, the combinations 5 : 0 or 7 + · a( are not 
expressions; and 7 + 8 = 15 or a − 3 < 3a are not expressions but propositions. In the expression 
(2z − 13)/(5z + 10) the set of rational numbers may be the domain of variability of z. But the domain 
of definition of the expression differs from it, because the expression is not defined for z = −2 
(see Chapter 4.). 

In a specification of a variable in an expression a particular element (or its symbol) of the domain 
of variability is substituted at every place where the variable occurs. Distinct variables can be 
specified by means of distinct elements or by one and the same element; for example, for 
a = b = −3/7 the expression 2a + 5b yields the value −3. 
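
Specification of variables and the difference between domain of variability and domain of definition can be illustrated in a few lines. The Python sketch below is an assumption-laden illustration (the function name `expr` is hypothetical); rational numbers are represented exactly by `Fraction`.

```python
from fractions import Fraction

def expr(z):
    """The expression (2z - 13)/(5z + 10) for a rational number z."""
    return (2 * z - 13) / (5 * z + 10)

a = b = Fraction(-3, 7)
print(2 * a + 5 * b)            # specification a = b = -3/7 in 2a + 5b

try:
    expr(Fraction(-2))          # z = -2 lies outside the domain of definition
except ZeroDivisionError:
    print("not defined for z = -2")
```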

Two equivalent expressions contain the same variables, and for every specification of these variables 
by equal elements the two expressions take the same value; for example, 3a + 7a and 10a are 
equivalent, and so are a · (b + c) and a · b + a · c. 

Simple transformations. A proper calculation with variables of the kind carried out with integers, 
for example, a calculation of the sum 2a + 3b, is impossible. Expressions with variables can only be 
transformed into equivalent expressions, for example, 3a + 7a into 10a; here the same laws hold as 
for the calculation with numbers in the appropriate domain of variability. 


Although for expressions only such transformations are possible, names such as sum or product 
are also used for them. 

Under addition and subtraction only terms of the same kind with the same variables can be 
contracted, in accordance with the distributive law. For example, 5a − 2a = (5 − 2)a = 3a, or by 
an additional exploitation of the commutative and associative law of addition 5a + 7c − 3b + 6c 
− 2a − 7b − 5c = 3a − 10b + 8c. Also, under multiplication and division expressions with 
distinct variables, for example, m · n or s : t, are not calculated or contracted; for equal factors the 
notation of powers is used. In products it is customary to omit the multiplication sign between the 
variables and between numerical symbols and variables, for example, to write ab instead of a · b 
or 6(p + q) instead of 6 · (p + q). 

Examples: 1. $4m \cdot 3n \cdot 15k = 180kmn$. 2. $(-320pq) : (-80q) = 4p$. 

3. $125c^2 \cdot (-\tfrac{3}{7}d) \cdot \tfrac{14}{75}cd = -10c^3d^2$. 4. $93s^2t^4 : 31st^2 = 3st^2$. 


Algebraic sums. The notion of opposite numbers can also be transferred to variables and expressions. 
And then again every subtraction can be represented as an addition. Algebraic sums with variables 
are frequently called polynomials (Greek poly, many), but this name is also used in a different 
meaning. A monomial (Greek mono, single) is then an expression with one term, a binomial (Latin 
bi, double) contains two terms, a trinomial (Latin tri, triple) three terms. 




Lexicographic order. Owing to the validity of the commutative laws the order of summands or 
factors is arbitrary, but it is customary for the sake of clarity to order the variables as far as possible 
according to the order in the alphabet or lexicographically, as has been done in all the preceding 
examples; instead of $28b^2af^2d$ it is better to write $28ab^2df^2$, instead of 
$36vw + 2.5uv - 3.2uw$ better $2.5uv - 3.2uw + 36vw$. If the same variable occurs several times 
with distinct exponents, one orders as a rule by falling or occasionally by rising powers; for example, 
instead of $2s^2 - 3s^4 + s^5 - 8s$ better $s^5 - 3s^4 + 2s^2 - 8s$. 


Working with algebraic sums 
Addition and subtraction. Brackets can occur in the addition or subtraction of algebraic sums, 
for example, (7a − 3b) + (5c − 3b − 6a) − (7b − 8a + 2c). Contractions and simplifications are 
not possible until the brackets are dissolved. 
Dissolution of brackets. A numerical example for the written way of solving mentally an addition 
and subtraction problem illustrates the method: 
227 + 36 − 213 − 198 + 29 
= 227 + (30 + 6) − (200 + 13) − (200 − 2) + (30 − 1) 
= 227 + 30 + 6 − 200 − 13 − 200 + 2 + 30 − 1. 
If a plus sign stands before a bracket, the bracket can be omitted. If a minus sign stands before it, 
on omission of the bracket all signs and operational symbols occurring in it have to be reversed. 


For example, if 6a − (4a − b) is to be calculated, then the number to be subtracted from 6a 
is smaller by b than 4a. Therefore, if one subtracts 4a, one has subtracted b too much, so that b 
must be added, and one obtains 6a − (4a − b) = 6a − 4a + b = 2a + b. Similar arguments hold for 
the other cases. 

Example: 
8p − (15r − 7q + 6p) + (8q − p + 7r) 
= 8p − 15r + 7q − 6p + 8q − p + 7r 
= p + 15q − 8r. 

Multiple brackets. If in an expression algebraic sums are again contracted, then one distinguishes 
the brackets suitably by different forms. In such cases it is frequently advantageous to begin with 
a dissolution of the inner brackets: 
\[
\begin{aligned}
& 17m + [6n - (3m + 4n)] - \{(8m - n) - [5m + (3n - 6m)]\} \\
&= 17m + [6n - 3m - 4n] - \{8m - n - [5m + 3n - 6m]\} \\
&= 17m + [-3m + 2n] - \{8m - n - [-m + 3n]\} \\
&= 17m - 3m + 2n - \{8m - n + m - 3n\} \\
&= 14m + 2n - \{9m - 4n\} = 14m + 2n - 9m + 4n \\
&= 5m + 6n.
\end{aligned}
\]


If the outer brackets are dissolved first, one obtains the same result: 
\[
\begin{aligned}
& 17m + [6n - (3m + 4n)] - \{(8m - n) - [5m + (3n - 6m)]\} \\
&= 17m + 6n - (3m + 4n) - (8m - n) + 5m + (3n - 6m) \\
&= 22m + 6n - 3m - 4n - 8m + n + 3n - 6m \\
&= 5m + 6n.
\end{aligned}
\]
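
A dissolved bracket expression can be checked by substituting numbers for the variables. The sketch below, in Python, compares the nested expression 17m + [6n − (3m + 4n)] − {(8m − n) − [5m + (3n − 6m)]} with the result 5m + 6n; the function names are chosen here for illustration. Since both sides are linear in m and n, agreement at a few independent points is already conclusive, but in general such a substitution is only a plausibility check.

```python
from fractions import Fraction
from itertools import product

def nested(m, n):
    # 17m + [6n - (3m + 4n)] - {(8m - n) - [5m + (3n - 6m)]}
    return 17*m + (6*n - (3*m + 4*n)) - ((8*m - n) - (5*m + (3*n - 6*m)))

def simplified(m, n):
    return 5*m + 6*n

samples = [Fraction(-7, 3), 0, 1, 4]
assert all(nested(m, n) == simplified(m, n) for m, n in product(samples, repeat=2))
print("both forms agree on all sample values")
```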


Multiplication. Algebraic sums can be multiplied by a number, a monomial, or again by an al- 
gebraic sum. 

Multiplication by a monomial. Here one is concerned with an application of the distributive law. 
As to the operational symbols, one only has to keep clearly in mind that every subtraction can be 
changed into an addition of the appropriate opposite number and vice versa. 

Because a(b + c) = ab + ac one has a(b − c) = a[b + (−c)] = ab + a(−c) = ab + (−ac), 
therefore a(b − c) = ab − ac. 


The sign rules known from calculation with integers have to be observed. 


Example: 
\[
\begin{aligned}
& 6x + 7(3x - 2y) - 5x(3 - 6y) - 3y(10x + 9) \\
&= 6x + (21x - 14y) - (15x - 30xy) - (30xy + 27y) \\
&= 6x + 21x - 14y - 15x + 30xy - 30xy - 27y \\
&= 12x - 41y.
\end{aligned}
\]




If one is sufficiently skilled in the use of the sign and operational rules, one can go from the first 
line straight to the third. 


Multiplication of algebraic sums. The rule for the multiplication of several algebraic 
sums is obtained by repeated application of the distributive law, taking account of the sign rules 
(Fig.). 


Algebraic sums are multiplied by multiplying every term of one sum into every term of the other 
and adding these products. 


Fig. 1.5-1 Illustration of the multiplication of two binomials if a, b, c, d are positive: 
(a + b)(c + d) = ac + ad + bc + bd 

In the subsequent examples sums of several terms occur; for more than two factors one proceeds 
step by step. 


Examples: 

1. $(7u - 3v)(4u + 5v) = 28u^2 + 35uv - 12uv - 15v^2 = 28u^2 + 23uv - 15v^2$. 

2. $(2s - 3t)(5r - 7s + 2t) = 10rs - 14s^2 + 4st - 15rt + 21st - 6t^2 
= 10rs - 15rt - 14s^2 + 25st - 6t^2$. 

3. $(u + 7v)(3u + v)(9u - 6v)(2u - v) = (3u^2 + 22uv + 7v^2)(18u^2 - 21uv + 6v^2) 
= 54u^4 + 333u^3v - 318u^2v^2 - 15uv^3 + 42v^4$. 
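
The rule "every term of one sum into every term of the other" translates directly into two nested loops. The following Python sketch is an illustration under an assumed representation: an algebraic sum in the variables u and v is stored as a dictionary mapping the pair (power of u, power of v) to its coefficient. It multiplies out the four factors (u + 7v)(3u + v)(9u − 6v)(2u − v) step by step.

```python
from collections import defaultdict

def multiply(p, q):
    """Multiply two algebraic sums given as {(exp_u, exp_v): coefficient}."""
    result = defaultdict(int)
    for (i1, j1), c1 in p.items():          # every term of the first sum ...
        for (i2, j2), c2 in q.items():      # ... into every term of the second
            result[(i1 + i2, j1 + j2)] += c1 * c2
    return {e: c for e, c in result.items() if c != 0}

factors = [
    {(1, 0): 1, (0, 1): 7},     # u + 7v
    {(1, 0): 3, (0, 1): 1},     # 3u + v
    {(1, 0): 9, (0, 1): -6},    # 9u - 6v
    {(1, 0): 2, (0, 1): -1},    # 2u - v
]
product = {(0, 0): 1}           # the constant 1
for f in factors:               # more than two factors: proceed step by step
    product = multiply(product, f)
print(product)
```

Terms of the same kind are contracted automatically because equal exponent pairs share one dictionary entry.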


Factorizations. The distributive law a(b + c) = ab + ac can be used not only from left to right, 
but also in the reverse direction from the sum to the product. The procedure is called factorization. 
It is always possible when several summands have equal factors. This factor can be emphasized in 
an intermediate step. The transformation into a product of two algebraic sums usually takes place 
in several steps. 


Examples: 

1. $44p - 77q + 99r = 11 \cdot 4p - 11 \cdot 7q + 11 \cdot 9r = 11(4p - 7q + 9r)$. 

2. $54a^3b^2c^3 + 18a^2b^3c^2 - 36a^2b^2c^2 = 18a^2b^2c^2(3ac + b - 2)$. 

3. $18am - 24bm + 15an - 20bn = 6m(3a - 4b) + 5n(3a - 4b) = (3a - 4b)(6m + 5n)$. 


Binomial formulae. A particularly important special case of the multiplication of algebraic sums 
is expressed by means of the binomial formulae. For example, 
$(a + b)(a + b) = a^2 + ab + ab + b^2 = a^2 + 2ab + b^2$. 


The formula for $(a - b)^2$ is superfluous, strictly speaking, 
because it is sufficient to observe the sign rules in applying 
$(a + b)^2 = a^2 + 2ab + b^2$. It also results from substituting 
$-b$ for $b$ in this formula (Fig.). 

By applying these formulae the square of an algebraic sum 
can be written down and a sum can be factorized. In both 
directions one obtains aids for mental calculations. 
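
For reference, the three binomial formulae in their usual form are collected here. This is the standard statement, supplied as an aid to the reader; the third formula is the one applied in Examples 2, 4, 5 and 7 below.

```latex
\begin{aligned}
(a + b)^2 &= a^2 + 2ab + b^2, \\
(a - b)^2 &= a^2 - 2ab + b^2, \\
(a + b)(a - b) &= a^2 - b^2.
\end{aligned}
```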


Fig. 1.5-2 Illustration of $(a - b)^2$ 


Examples: 
1. $(7uv - 5vw)^2 = 49u^2v^2 - 70uv^2w + 25v^2w^2$. 
2. $(5m + \tfrac{1}{2}n)(5m - \tfrac{1}{2}n) = 25m^2 - \tfrac{1}{4}n^2$. 


3. $1.96r^2 + 1.4rs + 0.25s^2 = (1.4r + 0.5s)^2$. 

4. $16a^2 - 56ab + 49b^2 - 64c^2 = (4a - 7b)^2 - 64c^2 = (4a - 7b + 8c)(4a - 7b - 8c)$. 
5. $394 \cdot 406 = (400 - 6)(400 + 6) = 160000 - 36 = 159964$. 

6. $204^2 = (200 + 4)^2 = 40000 + 1600 + 16 = 41616$. 

7. $47^2 - 43^2 = (47 + 43)(47 - 43) = 90 \cdot 4 = 360$. 
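
The mental-calculation aids in Examples 5 and 6 can be written out as two small functions. The Python sketch below is only an illustration; the function names are chosen here and do not occur in the text.

```python
def square_via_binomial(a, b):
    """(a + b)^2 computed as a^2 + 2ab + b^2."""
    return a * a + 2 * a * b + b * b

def product_via_difference_of_squares(mid, d):
    """(mid - d)(mid + d) computed as mid^2 - d^2."""
    return mid * mid - d * d

print(square_via_binomial(200, 4))                  # 204^2 split as (200 + 4)^2
print(product_via_difference_of_squares(400, 6))    # 394 * 406 split as (400 - 6)(400 + 6)
```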




Higher powers. Just as for $(a + b)^2$, so also corresponding binomial formulae for higher exponents 
than 2 exist: 
\[
\begin{aligned}
(a + b)^2 &= a^2 + 2ab + b^2, \\
(a + b)^3 &= a^3 + 3a^2b + 3ab^2 + b^3, \\
(a + b)^4 &= a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4, \\
(a + b)^5 &= a^5 + 5a^4b + 10a^3b^2 + 10a^2b^3 + 5ab^4 + b^5,
\end{aligned}
\]
etc. 
If a term of a binomial is replaced by its opposite, then all odd powers of this term have the 
negative sign, for example, $(a - b)^3 = a^3 - 3a^2b + 3ab^2 - b^3$. Clearly the exponent of a 
decreases from term to term, while that of b increases, and in $(a + b)^n$ the sum of the two 
exponents in every term is n. The preceding factors are called the binomial coefficients 
$\binom{n}{k}$ (read n over k). They stand for: 
\[
\binom{n}{k} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{1 \cdot 2 \cdot 3 \cdots k} = \frac{n!}{k!\,(n-k)!}.
\]
If one puts $\binom{n}{0} = 1 = \binom{n}{n}$, then the binomial theorem can also be simplified by 
means of the summation symbol (see Chapter 18.): 
\[
(a + b)^n = \sum_{k=0}^{n} \binom{n}{k}\, a^{n-k} b^k.
\]
Accordingly the value of the fifth term, that is, the fourth mixed term, of $(a + b)^6$ with n = 6, 
k = 4 is: 
\[
\binom{6}{4} a^2 b^4 = \frac{6 \cdot 5 \cdot 4 \cdot 3}{1 \cdot 2 \cdot 3 \cdot 4}\, a^2 b^4 = 15a^2b^4.
\]
Pascal’s triangle. This triangle (Fig.) makes it possible to determine the binomial coefficients 
even for someone who is not familiar with the formation of $\binom{n}{k}$. It is obtained by writing 
the coefficients in triangular form under each other, beginning with $(a + b)^0 = 1$ and 
$(a + b)^1 = a + b$, or by using the equation $(a + b)^{n+1} = (a + b)^n (a + b)$. Here the numbers 
of each row arise by adding the two adjacent numbers in the row above, for example, 
$\binom{6}{4} = \binom{5}{3} + \binom{5}{4} = 10 + 5 = 15$. 

In general, this relation between the binomial coefficients is 
$\binom{n}{k} + \binom{n}{k+1} = \binom{n+1}{k+1}$, on account of 
\[
\binom{n}{k} + \binom{n}{k+1}
= \frac{n!}{(n-k)!\,k!} + \frac{n!}{(n-k-1)!\,(k+1)!}
= \frac{n!\,[(k+1) + (n-k)]}{(n-k)!\,(k+1)!}
= \frac{(n+1)!}{(n-k)!\,(k+1)!}
= \binom{n+1}{k+1}.
\]
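
The construction of Pascal's triangle by adding adjacent numbers can be sketched in Python as follows. The function name `pascal_rows` is an assumption of this sketch; the comparison with `math.comb` (available from Python 3.8) merely confirms that the rows built by addition agree with the binomial coefficients defined by factorials.

```python
from math import comb

def pascal_rows(n_max):
    """Rows 0..n_max of Pascal's triangle, built only by adding adjacent numbers."""
    rows = [[1]]                                   # row 0: (a + b)^0 = 1
    for _ in range(n_max):
        prev = rows[-1]
        inner = [prev[k] + prev[k + 1] for k in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows

rows = pascal_rows(6)
for row in rows:
    print(row)

# Each entry coincides with the binomial coefficient "n over k".
assert all(rows[n][k] == comb(n, k) for n in range(7) for k in range(n + 1))
```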


Division. Also in division one has to distinguish between division by a number, by a monomial, 
that is, a one-term expression, and by an algebraic sum. 




Division by a monomial. Substituting d = 1/c in the distributive law (a + b)d = ad + bd and 
taking into account that the fraction line can also be regarded as a symbol of division, one obtains 
the rule for the division of a sum by a number, (a + b)/c = a/c + b/c. Algebraic sums are divided 
term by term. Here the sign rules have to be taken into account. 


Example: $(28m^2n - 63m^2n^2 + 84mn^2) : 7mn = 4m - 9mn + 12n$. 
Division by an algebraic sum. Frequently in problems requiring the division by an algebraic sum 
a knowledge of factorizations or of the binomial formulae is sufficient. One splits the dividend 
suitably into factors, having rearranged it if necessary. 


Example: 
\[
\begin{aligned}
& (0.54fg - 0.3eh - 0.45fh + 0.36eg) : (0.2e + 0.3f) \\
&= (0.36eg - 0.3eh + 0.54fg - 0.45fh) : (0.2e + 0.3f) \\
&= [0.2e(1.8g - 1.5h) + 0.3f(1.8g - 1.5h)] : (0.2e + 0.3f) \\
&= [(1.8g - 1.5h)(0.2e + 0.3f)] : (0.2e + 0.3f) = 1.8g - 1.5h.
\end{aligned}
\]


If such a transformation of dividend cannot be achieved, which is the case, in particular, if the 
required division leads to a remainder, then one has to adopt the method of stepwise division. 


Stepwise division. This method is merely a generalization of the ordinary written division, in 
which one proceeds in principle in exactly the same way, even if not every step is written down, as 
the example 286 : 22 = 13 illustrates. 
\[
\begin{array}{l}
286 : 22 = (200 + 80 + 6) : (20 + 2) = 10 + 3 \\
\underline{-(200 + 20)} \\
\qquad 60 + 6 \\
\qquad \underline{-(60 + 6)} \\
\qquad\qquad 0
\end{array}
\]
In the division of algebraic sums one proceeds similarly, taking into account that dividend and 
divisor are ordered in the same way. Just like writing down zeros, so taking down all remaining 
terms of the dividend after every step can be omitted to save written work. But one has to pay 
attention that no term is forgotten. 

Example: 
\[
\begin{array}{l}
(10a^3 + 13a^2x - ax^2 + 3x^3) : (2a + 3x) = 5a^2 - ax + x^2 \\
\underline{-(10a^3 + 15a^2x)} \\
\qquad -2a^2x - ax^2 + 3x^3 \\
\qquad \underline{-(-2a^2x - 3ax^2)} \\
\qquad\qquad 2ax^2 + 3x^3 \\
\qquad\qquad \underline{-(2ax^2 + 3x^3)} \\
\qquad\qquad\qquad 0
\end{array}
\]
Example: 
\[
\begin{array}{l}
(a^3 - b^3) : (a - b) = a^2 + ab + b^2 \\
\underline{-(a^3 - a^2b)} \\
\qquad a^2b \\
\qquad \underline{-(a^2b - ab^2)} \\
\qquad\qquad ab^2 - b^3 \\
\qquad\qquad \underline{-(ab^2 - b^3)} \\
\qquad\qquad\qquad 0
\end{array}
\]
Incidentally, not only the division $(a^3 - b^3) : (a - b)$ has no remainder; $(a^n - b^n)$ for every 
natural number n is divisible without remainder by (a − b), while the division 
$(a^n - b^n) : (a + b)$ leaves no remainder only for even n. 

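Division with remainder. Even when a division leaves a remainder, the method is unchanged; 
in writing one proceeds by analogy to 47 : 5 = 9 + 2/5. 

Example: 
\[
\begin{array}{l}
(x^4 - x^3 - 5x^2 - 40x + 7) : (x^2 + 3x + 9) = x^2 - 4x - 2 + \dfrac{2x + 25}{x^2 + 3x + 9} \\
\underline{-(x^4 + 3x^3 + 9x^2)} \\
\qquad -4x^3 - 14x^2 - 40x \\
\qquad \underline{-(-4x^3 - 12x^2 - 36x)} \\
\qquad\qquad -2x^2 - 4x + 7 \\
\qquad\qquad \underline{-(-2x^2 - 6x - 18)} \\
\qquad\qquad\qquad 2x + 25
\end{array}
\]
Quotient $x^2 - 4x - 2$, remainder $2x + 25$. 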
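
Stepwise division of algebraic sums in one variable can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: polynomials are represented as coefficient lists ordered by falling powers, the leading coefficient of the divisor is nonzero, and the function name `poly_divmod` is hypothetical. The example divides $x^4 - x^3 - 5x^2 - 40x + 7$ by $x^2 + 3x + 9$.

```python
from fractions import Fraction

def poly_divmod(dividend, divisor):
    """Divide polynomials given as coefficient lists, highest power first.

    Returns (quotient, remainder) with dividend = quotient * divisor + remainder.
    """
    rem = [Fraction(c) for c in dividend]
    quotient = []
    while len(rem) >= len(divisor):
        factor = rem[0] / Fraction(divisor[0])    # next term of the quotient
        quotient.append(factor)
        for i, d in enumerate(divisor):           # subtract factor * divisor
            rem[i] -= factor * d
        rem.pop(0)                                # the leading term is now zero
    return quotient, rem

# (x^4 - x^3 - 5x^2 - 40x + 7) : (x^2 + 3x + 9)
q, r = poly_divmod([1, -1, -5, -40, 7], [1, 3, 9])
print(q, r)
```

The loop mirrors the written scheme: each pass determines one term of the quotient and subtracts the corresponding multiple of the divisor.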


Fractions with variables 


Extension and cancellation. Extensions and cancellations are only transformations of the rational 
numbers represented by fractions. These transformations are also possible in fractions with vari- 
ables. 




Extensions. Numerator and denominator are multiplied by the same factor: a/b = (a · k)/(b · k); 
similarly 5m/9n = (3 · 5m)/(3 · 9n) = 15m/27n and $7/(3a + 3b) = (7a - 7b)/(3a^2 - 3b^2)$ 
(extended by a − b). 

Cancellations. Numerator and denominator of a fraction are divided by the same expression, 
for example, 6cd/22de = 3c/11e. Here a sufficient skill in factorization and in the application 
of binomial formulae is particularly important, because inside algebraic sums cancellation is not 
permitted: 
\[
\frac{15u^2 - 24uv}{12u^2} = \frac{3u(5u - 8v)}{12u^2} = \frac{5u - 8v}{4u}.
\]
Of course, mental factorization of numerator and denominator and cancellation of common factors 
is permitted; for example, $(p^4 - 1)/(3p^2 + 3) = (p^2 - 1)/3$. 


Addition and subtraction. Addition and subtraction of fractions with equal denominators present 
no problem in the calculation with variables; a/c + b/c = (a + b)/c. 
\[
\frac{7i + 5k}{3k^2} - \frac{5i - 4k}{3k^2}
= \frac{7i + 5k - (5i - 4k)}{3k^2}
= \frac{7i + 5k - 5i + 4k}{3k^2}
= \frac{2i + 9k}{3k^2}
\quad \text{(observe brackets!)}
\]
Here, too, intermediate results can be omitted when sufficient skill has been acquired. One has to 
pay particular attention to the signs or operational symbols, because the second fraction is 
subtracted. 


Fractions with unequal denominators. Such fractions have to be extended first so that the 
denominators become equal. One usually chooses the least common denominator, that is, the simplest 
denominator containing all the factors of the individual denominators. Frequently it can be obtained 
mentally without written work: 

Examples: 1. 
\[
\frac{2y}{3z} + \frac{5x}{6z} - \frac{y + 2x}{4z}
= \frac{4 \cdot 2y + 2 \cdot 5x - 3(y + 2x)}{12z}
= \frac{8y + 10x - 3y - 6x}{12z}
= \frac{4x + 5y}{12z}.
\]
2. 
\[
4 + \frac{3a}{a - b} - \frac{2b}{b - a}
= 4 + \frac{3a}{a - b} + \frac{2b}{a - b}
= \frac{4(a - b) + 3a + 2b}{a - b}
= \frac{7a - 2b}{a - b}.
\]

If the denominators are more complicated, it is advantageous to determine the lcd in writing. 
Determination of the least common denominator lcd. In the problem 
\[
\frac{3}{12u - 18v} - \frac{2u - v}{36u^2 - 81v^2} + \frac{6u - 5v}{8u^2 + 24uv + 18v^2}
\]
one determines the lcd as in an ordinary calculation with fractions, by decomposing the individual 
denominators into indecomposable factors: 
\[
\begin{array}{rcll}
 & & \text{factorization} & \text{extension factor} \\
12u - 18v &=& 2 \cdot 3 \cdot (2u - 3v) & 3 \cdot (2u + 3v)^2 \\
36u^2 - 81v^2 &=& 3^2 \cdot (2u - 3v)(2u + 3v) & 2 \cdot (2u + 3v) \\
8u^2 + 24uv + 18v^2 &=& 2 \cdot (2u + 3v)^2 & 3^2 \cdot (2u - 3v) \\
\text{lcd} &=& 2 \cdot 3^2 \cdot (2u - 3v)(2u + 3v)^2 &
\end{array}
\]
In the preceding problem one thus obtains 
\[
\frac{3^2(2u + 3v)^2 - 2(2u + 3v)(2u - v) + 3^2(2u - 3v)(6u - 5v)}{18(2u - 3v)(2u + 3v)^2}.
\]
Such a fraction is then simplified further by contracting in the numerator and possibly by 
cancellation. 
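
A result obtained with the least common denominator can be tested by substituting numbers. The Python sketch below takes the problem as 3/(12u − 18v) − (2u − v)/(36u² − 81v²) + (6u − 5v)/(8u² + 24uv + 18v²) and compares it with the single fraction over 18(2u − 3v)(2u + 3v)²; the function names and the sample values are chosen here for illustration, and the values must avoid 2u = 3v and 2u = −3v, where the denominators vanish.

```python
from fractions import Fraction

def original(u, v):
    """The three fractions of the problem, added with exact arithmetic."""
    return (Fraction(3, 12*u - 18*v)
            - Fraction(2*u - v, 36*u**2 - 81*v**2)
            + Fraction(6*u - 5*v, 8*u**2 + 24*u*v + 18*v**2))

def combined(u, v):
    """The single fraction over the lcd 18(2u - 3v)(2u + 3v)^2."""
    num = 9*(2*u + 3*v)**2 - 2*(2*u + 3*v)*(2*u - v) + 9*(2*u - 3*v)*(6*u - 5*v)
    return Fraction(num, 18*(2*u - 3*v)*(2*u + 3*v)**2)

for u, v in [(1, 1), (2, 1), (5, -2), (3, 7)]:
    assert original(u, v) == combined(u, v)
print("both forms agree on all sample values")
```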


Multiplication and division. Since in operations of the second kind with fractions the determination 
of the lcd is unnecessary, these operations are simpler to carry out than those of the first kind. Both 
in multiplication and division attention has to be paid to the possibility of cancellations before the 
operations. 


Multiplication. For the multiplication of fractions one has a/b · c/d = (a · c)/(b · d). 




Examples: 1. $\dfrac{32r^2}{8qr} \cdot \dfrac{25p}{35q} = \dfrac{32r^2 \cdot 25p}{8qr \cdot 35q} = \dfrac{4r \cdot 5p}{q \cdot 7q} = \dfrac{20pr}{7q^2}$. 
Tp _ 14p 


Division. Division can always be carried out as multiplication by the reciprocal of the divisor, 
a/b : c/d = (a · d)/(b · c). 


Te Lae Jom | lam 6 4 AT ay Ae ger eer 
Examples: i,—— OK = Ok? - Fre ie 2. pon We * (12s 127 ) = Jus +t) ‘ 
meet g* ig 3g? + 3h 15e*h 
3 
ae i ~—Heifigy Ig 
Double fractions. Double fractions occur when numerator or denominator of a fraction again 
contain fractions. The transformation is performed as in double fractions with numbers, by regarding 
the principal fraction line as symbol of division. 

Example: 
\[
\frac{\dfrac{1}{m^2} + \dfrac{2}{mn} + \dfrac{1}{n^2}}{\dfrac{3}{m} + \dfrac{3}{n}}
= \frac{\dfrac{n^2 + 2mn + m^2}{m^2n^2}}{\dfrac{3n + 3m}{mn}}
= \frac{(m^2 + 2mn + n^2)\,mn}{m^2n^2(3m + 3n)}
= \frac{m + n}{3mn}.
\]


Uniqueness of the decomposition of a natural number into prime factors. As an example of how the 
use of variables can demonstrate the validity of mathematical arguments for all numbers within 
a domain, there follows a proof of the theorem of elementary number theory that the decomposition 
of a natural number into prime factors is unique apart from the order. The proof starts out from 
Euclid’s algorithm. For 13013 and 390 or for two numbers a and b, one divides the larger one by 
the smaller one, then the smaller by the remainder $r_1$, and each remainder by the next one, etc. In 
the section Elementary number theory it was claimed that in the numerical example the number 
13, and in general $r_n$, is the greatest common divisor gcd (13013, 390) or gcd (a, b). The remainders 
$r_i$ are each time smaller by at least 1 than the divisor, therefore after finitely many steps a 
remainder $r_{n+1}$ must be zero. When the equations are read from bottom to top, one sees that $r_n$ 
is a divisor of $r_{n-1}$, hence of $r_{n-2}$ etc., therefore also of b and of a, that is, $r_n$ is a 
common divisor of a and b. From the equation $r_1 = a - bq_1$ it also follows that every common 
divisor of a and b divides $r_1$, hence $r_2$ etc., finally that $r_n$ contains every common divisor 
of a and b and is therefore the greatest common divisor; one writes gcd (a, b) = $r_n$. 


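\[
\begin{array}{rcl@{\qquad}rcl}
13013 &=& 390 \cdot 33 + 143, & a &=& b \cdot q_1 + r_1, \\
390 &=& 143 \cdot 2 + 104, & b &=& r_1 \cdot q_2 + r_2, \\
143 &=& 104 \cdot 1 + 39, & r_1 &=& r_2 \cdot q_3 + r_3, \\
104 &=& 39 \cdot 2 + 26, & & \cdots & \\
39 &=& 26 \cdot 1 + 13, & r_{n-2} &=& r_{n-1} \cdot q_n + r_n, \\
26 &=& 13 \cdot 2 + 0, & r_{n-1} &=& r_n \cdot q_{n+1} + 0.
\end{array}
\]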


From the last equation but one gcd (a, b) = $r_n = r_{n-2} - r_{n-1}q_n$. Replacing in it $r_{n-1}$ by 
$r_{n-3} - r_{n-2}q_{n-1}$ and $r_{n-2}$ by $r_{n-4} - r_{n-3}q_{n-2}$ etc. one obtains two natural 
numbers x and y for which gcd (a, b) = $r_n$ = ax − by. In particular, if the numbers a and b are 
coprime, then gcd (a, b) = $r_n$ = 1 = ax − by. This leads to the theorem: 


If two numbers a and b are coprime and b divides ac, then b divides c. 


By assumption there is a natural number k such that ac = b · k, and because gcd (a, b) = 1 one has 
also 1 = ax − by or c = acx − bcy, hence c = bkx − bcy = b(kx − cy), that is, b divides c. 
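
Euclid's algorithm and the back-substitution that yields the representation of the gcd used in this proof can be sketched in Python. The version below is the usual iterative "extended" form, named plainly here because it differs in bookkeeping from the bottom-to-top substitution of the text: it returns signed integers x, y with gcd(a, b) = ax + by, so the subtraction ax − by of the text corresponds to a negative y. The function name `extended_gcd` is an assumption of this sketch.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and g = a*x + b*y (x, y integers)."""
    old_r, r = a, b          # successive remainders
    old_x, x = 1, 0          # coefficients of a
    old_y, y = 0, 1          # coefficients of b
    while r != 0:
        q = old_r // r                       # quotient of the current division
        old_r, r = r, old_r - q * r          # the next remainder
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y

g, x, y = extended_gcd(13013, 390)
print(g, x, y)
print(13013 * x + 390 * y)   # reproduces g
```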


Corollary: If the product ab is divisible by a prime number p, then at least one of the factors a 
or b is divisible by p. 

Now the uniqueness of the decomposition of a natural number n into prime factors can be proved. 
For if $n = p_1 \cdot p_2 \cdots p_r = q_1 \cdot q_2 \cdots q_s$ are two decompositions, then $p_1$ 
divides the product $q_1 \cdot q_2 \cdots q_s$, hence divides one of the prime factors $q_i$. But this 
is only possible when they are equal. By a suitable numbering of the $q_i$ it may be assumed that 
$p_1 = q_1$. The same argument holds for $p_2 \cdot p_3 \cdots p_r = q_2 \cdot q_3 \cdots q_s$ and 
shows that $p_2 = q_2$ may be assumed. By a repeated application one finally obtains r = s, and 
$p_i = q_i$. 
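
Because the decomposition is unique, a procedure that finds any decomposition finds the only one. A minimal Python sketch by trial division, assuming n ≥ 2 (the function name `prime_factors` is hypothetical, and the method is chosen for brevity, not efficiency):

```python
def prime_factors(n):
    """Prime factors of a natural number n >= 2 in ascending order."""
    factors = []
    p = 2
    while p * p <= n:
        while n % p == 0:        # split off the factor p as often as possible
            factors.append(p)
            n //= p
        p += 1
    if n > 1:                    # what remains is itself a prime
        factors.append(n)
    return factors

print(prime_factors(13013), prime_factors(390))
```

The common prime factor of the two numbers from the example above is 13, in agreement with gcd (13013, 390) = 13.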




Historical remarks. In the initial steps of mathematical activity, calculations, theorems and formulae
were only expressed in words, not in symbols. Since this procedure led to complications and lack
of clarity, abbreviations were used for objects of frequent occurrence. For example, the Greeks
denoted points, lines and surfaces by letters. DIOPHANTOS of Alexandria generally used a letter
for unknown numbers; it ought to be observed that the Greeks always used letters for digits.

The development among the Hindu and Arab mathematicians concerned mainly the theory of
equations. This is the reason why no account of it is given here (see Chapter 4.), although the
name algebra, apart from the theory of equations, is frequently used in elementary mathematics in
the extended sense of working with numerical variables.

Letters and variables are used to a greater extent for the first time by LEONARDO of Pisa (1180-1228).
He also used the fraction line; but symbols for operations were unknown to him. The proper
originator of consistent working with variables is François VIÈTE (latinized VIETA, 1540-1603), who
lived at the Royal court as a law officer.

René DESCARTES (latinized CARTESIUS, 1596-1650) also emphasized the importance of variables;
the present notation for powers is due to him.

The operational symbols + and − occur for the first time in 1489 in an arithmetic book of
Johannes WIDMANN of Eger; in 1631 William OUGHTRED introduced the symbol × for multiplication.
The stop for multiplication and the colon for division were introduced by Gottfried Wilhelm
LEIBNIZ (1646-1716), the sign for equality by Robert RECORDE (1557) (see Chapter 4.).


2. Higher arithmetical operations 


2.1. Calculations with powers and roots
     Powers
     Tables for squares and cubes
     Roots
     Roots as powers with fractional exponents
2.2. Calculations with logarithms
     Logarithmic laws and logarithmic systems
     Practical logarithmic calculations
     The logarithmic slide rule


Addition, subtraction, multiplication, and division are called the four basic arithmetical operations. 
Just as repeated addition of the same summand leads to a new arithmetical operation, namely 
multiplication, so repeated multiplication by the same factor leads to a new arithmetical operation: 
raising to a power or exponentiation. Like addition or multiplication, this operation can be inverted, 
but this time one obtains two distinct inverse operations: extracting roots and taking logarithms. 


2.1. Calculations with powers and roots 


Historical remarks. Powers were already known in antiquity, by their applications in geometrical
calculations, or by their occurrence in quadratic and higher degree equations. The Babylonians had
tables of squares and powers. They knew how to solve problems of compound interest by means of
powers of 2. In the 'Elements' of EUCLID of Alexandria (4th century B.C.) one finds the formula for
$(a + b)^2$, an astonishing achievement at that time. The concept of a power can be traced back to
the Greek mathematician HIPPOCRATES of Chios (5th century B.C.). Subsequently it was used more
frequently, for example by PLATO (427-347 B.C.). Originally only the second power was intended.
BOMBELLI of Bologna (16th century) is believed to have been the first to use the word potenza
(latin potentia, power, ability, faculty). He also used it to denote the square of the unknown; the
present-day general meaning of the notion of power is of a later date. Our notation for powers
essentially goes back to René DESCARTES (1596-1650). But he used it only for integral exponents
greater than 2. He still wrote the square of a number $a$ as $a \cdot a$. Powers with fractional exponents
have also been known for a considerable time. Some theorems on calculations with fractional
powers can already be found in the writings of Nicole ORESME (1323-1382).

Like the powers, the roots were known in antiquity. Thus, the Babylonians had tables of rational
square roots. The irrational square roots were calculated approximately by the method of the
arithmetic-geometric mean. The formula used was $\sqrt{a^2 + b} \approx a + b/(2a)$. The Greeks knew that
the square roots of the numbers 2, 3, ..., 17 other than 4, 9, and 16, are irrational. The proofs of
the irrationality of these roots are attributed to HIPPASOS of Metapontum (c. 450 B.C.) or to
THEODOROS of Cyrene (c. 430 B.C.). In the 'Elements' of EUCLID arithmetical operations of
the second order are applied to roots.

In the Middle Ages calculations with roots were steadily developed further. As early as the 9th
century the Hindus knew that the solution of a quadratic equation and the square root of a number
are two-valued, and also that the square root of a negative number cannot be real. They could also
calculate square roots and cube roots approximately. Michael STIFEL (1487-1567) wrote about the
numerical extraction of roots up to the seventh. He extended the theory of irrationalities of the
form $\sqrt{a + \sqrt{b}}$ to expressions of the form $\sqrt[n]{a + \sqrt[m]{b}}$. Gradually the root sign acquired its
present-day form (derived from the letter r for radix, root), whereas Christoff RUDOLFF (16th century)
still used separate special symbols for the square root, the cube root, etc. It was also recognized
that roots can be represented as powers (which were then familiar) with fractional exponents.


Powers 


The concept of a power. It happens frequently that equal quantities have to be added: 3.7 + 3.7
+ 3.7 + 3.7 + 3.7. This sum of equal terms can be written as the product $5 \cdot 3.7$. Multiplication
of equal quantities occurs just as often. Here, too, an abbreviated notation was introduced; for
example, in geometry the area of a square of side $a$ is calculated as side $a$ times side $a$, or $A = a \cdot a$,
or briefly $A = a^2$ (read: second power of $a$, or $a$ upper 2, or $a$ squared). Correspondingly one
obtains for the volume of a cube of side $a$: $V = a \cdot a \cdot a = a^3$. Generally, for positive integers $n$:


$a \cdot a \cdots a = a^n$ (read: $n$th power of $a$, or $a$ to the $n$th, or $a$ upper $n$),
with $n$ factors $a$; $n > 0$, integral.


Here $a$ is the basis and $n$ is the exponent of the power. Hence the $n$th power of a number is an
abbreviated expression for a product of $n$ numbers equal to the given one; in this sense one sets
$a^1 = a$. Example: $2^5 = 2 \cdot 2 \cdot 2 \cdot 2 \cdot 2 = 32$. One also says that 2 is to be raised to the fifth power,
and the operation is called exponentiation; it is a repeated multiplication by the same quantity.
Since exponentiation is built up on multiplication, the arithmetical operation of the second kind, it is
called a higher operation, of the third kind.

Since $0 \cdot 0 = 0$, one has generally: $0^n = 0$ for $n > 0$. Similarly, 1 is reproduced by exponentiation:
$1^n = 1$ for $n > 0$. In exponentiation the basis and the exponent cannot be interchanged, as a rule;
for example, $2^3 = 8 \neq 3^2 = 9$. In fact, $a^b = b^a$ holds, of course, for $a = b$, but when $a \neq b$, only
for $2^4 = 4^2 = 16$.
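
As a small illustration, exponentiation can be written as repeated multiplication, and the claim about interchanging basis and exponent can be checked by brute force. The sketch below is not a proof: the helper name `power` is hypothetical, and the search only covers positive integers below an arbitrarily chosen bound of 50.

```python
def power(a, n):
    """a**n for a positive integer n, computed by repeated multiplication."""
    result = a
    for _ in range(n - 1):
        result *= a
    return result


# All pairs a < b below 50 for which a**b equals b**a.
pairs = [(a, b) for a in range(1, 50) for b in range(a + 1, 50)
         if power(a, b) == power(b, a)]

print(power(2, 5))   # 32
print(pairs)         # [(2, 4)]
```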

One distinguishes between even and odd powers, according as the exponent is even (divisible
by 2) or odd. Thus, $b^4$, $c^{10}$, and generally $a^{2n}$ are even powers, whereas $b^7$, $c^{13}$, and generally $a^{2n+1}$
are odd powers.


Examples: Powers occur in many formulae and laws of mathematics, science, and technology;
for example, in geometry $4\pi r^3/3$ represents the volume of a sphere of radius $r$, $(s^2/4)\sqrt{3}$ the area
of an equilateral triangle of side $s$; in physics $gt^2/2$ is the distance-time law of free fall, and
in the calculus of compound interest $b \cdot (r^n - 1)/(r - 1)$ is the formula for an annuity.


Of particular importance are the powers of 10. They are used (in rough estimates or slide rule
calculations etc.) to obtain an idea of the order of magnitude of a number or to write very large
or very small numbers in an abbreviated and perspicuous form. $100 = 10 \cdot 10 = 10^2$,
$1000 = 10 \cdot 10 \cdot 10 = 10^3$, a million $= 10^6$, etc.; for instance, 1291000 can be written $1.291 \cdot 10^6$
or $1291 \cdot 10^3$. Units of measurement are also represented in the power notation, such as m² (square
meter), cm³ (cubic centimeter), m/s² (meter per second squared) etc.

Powers whose basis lies between 0 and 1 decrease when the exponent increases:
$(1/2)^2 > (1/2)^3 > (1/2)^4 > \dots$, but increase when the basis is greater than 1: $2^2 < 2^3 < 2^4 < \dots$
They grow very rapidly; the following problem is in the oldest arithmetic book, named after
AHMES (1700 B.C.):


Each of 7 persons owns 7 cats, every cat eats 7 mice, every mouse eats 7 ears of barley, every
ear of barley could yield 7 measures. How many measures is this? Solution: this is $7^5$ or 16807
measures.


Sign of powers. Since negative numbers can be multiplied, the basis of a power may be negative;
by the standard sign rules one obtains, for example, $(-3)^4 = (-3) \cdot (-3) \cdot (-3) \cdot (-3) = +81$
or $(-5)^3 = (-5) \cdot (-5) \cdot (-5) = -125$. It is immediately evident that the product of two negative
factors is positive, that of three negative, that of four positive, and so on alternately. If the number
of minus signs is even, the power has a positive value, if the number is odd, a negative value. The
exponent indicates the number of (equal) factors.


A power with negative basis has a positive value for an even exponent and a negative value for an 
odd exponent. 


To make the essence of this rule quite clear one chooses the basis $(-1)$. Together with the obvious
fact that for a positive basis the power is positive, one obtains for every positive integer $n$:

$(-1)^{2n} = +1$ and $(-1)^{2n-1} = -1$.


Multiplication and division of powers. Powers whose basis and exponent are distinct cannot be
contracted on multiplication or division, for example, $a^4c^3/x^7$.

Powers with equal exponents. If one raises a product to a power, for example $(ab)^n$, one obtains
$n$ factors $a \cdot b$, altogether $2n$ factors, namely $n$ factors $a$ and $n$ factors $b$ alternately. Since the
factors may be interchanged (commutative law), the product can be rearranged as a product of $n$
factors $a$ and $n$ factors $b$.


Examples: 1. $(2xyz)^5 = 2^5x^5y^5z^5 = 32x^5y^5z^5$.
2. $(3a)^3 = 3a \cdot 3a \cdot 3a = 3 \cdot 3 \cdot 3 \cdot a \cdot a \cdot a = 3^3a^3 = 27a^3$.
3. $2^8 \cdot 5^7 = 2 \cdot 2^7 \cdot 5^7 = 2 \cdot (2 \cdot 5)^7 = 2 \cdot 10^7 = 20\,000\,000$.

$(a \cdot b)^n = a^n \cdot b^n$


First law for powers: A product is raised to a power by raising each factor to that power
and multiplying the powers so obtained. Conversely, powers with the same exponent are multiplied
by raising the product of the bases to the power given by the common exponent.


Similarly, a power $(a/b)^n$ whose basis is a fraction is obtained by multiplying $n$ equal factors
$a/b$, hence is a fraction whose numerator consists of $n$ factors $a$ and whose denominator of $n$ factors $b$,
that is, $a^n/b^n$.

$(a/b)^n = a^n/b^n$

A fraction (quotient) is raised to a power by raising the numerator (dividend) and denominator
(divisor) individually to the same power and dividing the powers so obtained. Conversely, powers
with the same exponent are divided by dividing their bases and raising the quotient so obtained to
the power given by the common exponent.


Powers with equal basis. By the definition of the power, the multiplication of two powers $a^m$ and $a^n$
with the same basis $a$ means that $m$ factors $a$ are to be combined with another $n$ factors $a$; one then
has $m + n$ factors, that is, the $(m + n)$th power.


Examples: 1. $3^4 \cdot 3^2 = (3 \cdot 3 \cdot 3 \cdot 3) \cdot (3 \cdot 3) = 3 \cdot 3 \cdot 3 \cdot 3 \cdot 3 \cdot 3 = 3^6$.
2. $56a^5b \cdot 98a^7b^5 \cdot 14a^2b^3 = 2^3 \cdot 7 \cdot a^5b \cdot 2 \cdot 7^2 \cdot a^7b^5 \cdot 2 \cdot 7 \cdot a^2b^3$
$= 2^{3+1+1} \cdot 7^{1+2+1} \cdot a^{5+7+2} \cdot b^{1+5+3} = 2^5 \cdot 7^4 \cdot a^{14}b^9$.

$a^m \cdot a^n = a^{m+n}$


Second law for powers: Powers with the same basis are multiplied by raising the basis to the power
given by the sum of the exponents.
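
The first two laws can be spot-checked numerically. The following sketch uses exact rational arithmetic so that no rounding interferes; the particular values of `a`, `b`, `m` and `n` are arbitrary choices made for illustration, and a check on a few numbers is of course no substitute for the argument given in the text.

```python
from fractions import Fraction

a, b, m, n = Fraction(3), Fraction(5), 4, 2

# First law: a product or a quotient is raised to a power factor by factor.
assert (a * b) ** n == a ** n * b ** n
assert (a / b) ** n == a ** n / b ** n

# Second law: for equal bases the exponents are added.
assert a ** m * a ** n == a ** (m + n)

# Two of the worked examples.
print(2 ** 8 * 5 ** 7)              # 20000000
print(3 ** 4 * 3 ** 2 == 3 ** 6)    # True
```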

Division. Since the result of every division can be regarded as a fraction in which the dividend
is the numerator and the divisor the denominator, one obtains on dividing the power $a^m$ by the
power $a^n$ a fraction with $m$ factors $a$ in the numerator and $n$ factors $a$ in the denominator. If $n$ is
the smaller exponent, then after $n$ cancellations the denominator becomes 1, and the numerator
has $n$ factors $a$ fewer, that is, only $m - n$ factors, hence the value $a^{m-n}$. On the other hand, if $m$ is
the smaller exponent, then the numerator becomes 1 after cancellation, and there are $n - m$ factors
left in the denominator; one obtains $1/a^{n-m}$. If the two exponents are equal, then both numerator
and denominator become 1 after cancellation, and the division leads to the value 1 for every basis.


$a^m : a^n = a^{m-n}$ for $m > n$;  $a^m : a^n = 1/a^{n-m}$ for $m < n$;  $a^m : a^n = 1$ for $m = n$.


Example: $11^3 : 11^5 = \frac{11 \cdot 11 \cdot 11}{11 \cdot 11 \cdot 11 \cdot 11 \cdot 11} = \frac{1}{11^{5-3}} = \frac{1}{11^2} = \frac{1}{121}$.


Compared with the result for the multiplication of two powers with equal basis the result obtained
for division is unsatisfactory: there the product had the sum of the exponents, here one of the
differences $m - n$ or $n - m$ occurs as the exponent for the quotient, or even the number 1, which at
first sight has nothing to do with powers. Since division is the inverse operation of multiplication,
one should expect that the result is determined in every case by the difference $m - n$ of the exponent
$m$ of the numerator and $n$ of the denominator; this would lead to the third law for powers.


Third law for powers: Powers with equal basis are divided by raising the basis to the exponent 
given by the difference of the exponents. 


According to the so-called principle of permanence, which was formulated in 1867 by
HANKEL, one tries to retain the validity of calculating rules, but to extend the concepts
of the mathematical objects connected by them. The difference $m - n$ of the exponents, which
occurs in the third law for powers, has a meaning in the first instance for $m > n$ only. If, in
accordance with the principle of permanence, this law is to remain valid also for $m = n$ and for $m < n$,
then the exponent 0 or negative exponents occur, which have no meaning under the definition adopted
hitherto: '$a^n$ means $n$ equal factors $a$'. Therefore one extends the notion of power by the following
two definitions:

$a^0 = 1$ and $a^{-n} = 1/a^n$ for $a \neq 0$.


Then $a^m : a^n = a^{m-n}$ holds without exception and in agreement with the previous results.
For one has
1. for $m > n$ the original definition;
2. for $m = n$, however, $a^{m-n} = a^0 = 1$; and
3. for $m < n$ according to the new definition $a^{m-n} = a^{-(n-m)} = 1/a^{n-m}$.
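
A minimal sketch of the extended notion of power, assuming a non-zero basis and an integer exponent: the hypothetical helper `power` below uses only the two conventions that $a^0 = 1$ and that a negative exponent means the reciprocal, and then checks the third law in the three cases $m > n$, $m = n$ and $m < n$.

```python
from fractions import Fraction


def power(a, k):
    """a**k for an integer k and a != 0, using a**0 = 1 and a**(-n) = 1/a**n."""
    a = Fraction(a)
    if k < 0:
        return 1 / power(a, -k)
    result = Fraction(1)      # the empty product, so that a**0 == 1
    for _ in range(k):
        result *= a
    return result


a = 7
for m, n in [(5, 3), (3, 3), (3, 5)]:
    # Third law: the quotient has the difference of the exponents.
    assert power(a, m) / power(a, n) == power(a, m - n)

print(power(11, 3) / power(11, 5))   # 1/121
```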


Examples: 1. $a^3 : a^7 = a^{3-7} = a^{-4} = 1/a^4$.
2. $27a^4b^5 \cdot 56a^2b^{-3} \cdot 42a^{-2}b^2$
$= 3^3 \cdot a^4b^5 \cdot 7 \cdot 2^3 \cdot a^2b^{-3} \cdot 7 \cdot 3 \cdot 2 \cdot a^{-2}b^2 = 2^4 \cdot 3^4 \cdot 7^2a^4b^4 = (2^2 \cdot 3^2 \cdot 7a^2b^2)^2$.
3. What is the energy $E$ in kWh (1 kWh $= 3.6 \cdot 10^{13}$ g cm² s⁻²) corresponding to a mass defect
of 2 mg? $E = m \cdot c^2 = 2 \cdot 10^{-3} \cdot (3 \cdot 10^{10})^2$ g cm² s⁻² $= 1.8 \cdot 10^{18}$ g cm² s⁻²
$= \frac{1.8 \cdot 10^{18}}{3.6 \cdot 10^{13}}$ kWh $= 5 \cdot 10^4$ kWh.
Consequently, if 2 mg of substance are transformed completely into energy, the energy liberated
is 50000 kWh.
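
The arithmetic of the mass-defect example can be retraced in a few lines. The sketch assumes the rounded value $3 \cdot 10^{10}$ cm/s for the speed of light and the conversion 1 kWh $= 3.6 \cdot 10^{13}$ g cm² s⁻²; the variable names are chosen here for illustration only.

```python
m = 2e-3                # mass defect in grams (2 mg)
c = 3e10                # speed of light in cm/s, rounded
erg_per_kwh = 3.6e13    # 1 kWh expressed in g cm^2 s^-2

energy_erg = m * c ** 2                 # E = m * c^2 in g cm^2 s^-2
energy_kwh = energy_erg / erg_per_kwh   # the same energy in kWh

print(energy_erg, energy_kwh)
```

Up to floating-point rounding the second number printed is 50000.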


Powers with negative exponents are frequently used for units of measurement and for the basis 10;
for example, m s⁻¹ = m/s for the velocity in meters per second, g cm⁻³ = g/cm³ for the density
in grams per cubic centimeter etc. One uses powers of 10 with negative exponents because they give
a better picture of very small numbers, such as the elementary electric charge $e = 1.602 \cdot 10^{-19}$ C
or the diameter of the hydrogen atom $d = 1.06 \cdot 10^{-8}$ cm. To give an approximate idea in the latter
case of the size of the diameter of the atom, it can be stated that the diameter of a hydrogen atom
is to that of a football roughly as that of the football to that of the earth.


Exponentiating a power. To exponentiate $a^m$, that is, to calculate $(a^m)^n$, means according to the
original definition: to form a product of $n$ equal factors $a^m$, each of which consists of $m$ equal factors $a$.
Hence altogether $m \cdot n$ factors $a$ are to be multiplied. This argument also holds for $m$ equal factors
$1/a$ or $n$ equal factors $1/a^m$, so that the integers $m$ and $n$ can also be negative.


Fourth law for powers: Powers are exponentiated by raising the basis to
the exponent given by the product of the exponents.


Since the order of the factors can be changed, so can that of the exponents. Therefore the exponent
of a power may be factorized, and the order of the factors is immaterial: $a^{m \cdot n} = (a^m)^n = (a^n)^m$.
Examples: 1. $2^8 = (2^4)^2 = 16^2 = 256$.
2. The largest integer that can be written by means of three digits, using addition, multiplication,
and exponentiation only, is $9^{(9^9)}$. For $9 + 9 + 9 < 9 \cdot 9 \cdot 9 < 999 < 99^9 < (9^9)^9 < 9^{99} < 9^{(9^9)}$
$= 9^{387420489}$. To write this number down in the decimal system one would need an enormously long
strip of paper; alternatively, one could fill 33 books of 800 pages each with 14 000 digits
per page.
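
The size of this number can be estimated without writing it out, because the number of decimal digits of a positive integer $N$ is $\lfloor \log_{10} N \rfloor + 1$. The sketch below applies this to $9^{(9^9)}$ and also checks the ordering of the smaller three-digit expressions; it relies on ordinary floating-point logarithms, which is an assumption that happens to be accurate enough here.

```python
import math

exponent = 9 ** 9                                   # 387420489
digits = math.floor(exponent * math.log10(9)) + 1   # decimal digits of 9**(9**9)
print(exponent, digits)

# The smaller members of the chain can be compared exactly.
chain = [9 + 9 + 9, 9 * 9 * 9, 999, 99 ** 9, (9 ** 9) ** 9, 9 ** 99]
assert chain == sorted(chain)

# Number of books needed at 800 pages of 14 000 digits each.
print(digits / (800 * 14000))
```

The digit count comes out at roughly 370 million, in line with the figure of 33 books.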




Tables for squares and cubes 


Looking up squares. In a table of squares one finds in the column headed 0 the squares of the
numbers from 1.0 to 9.9 (which stand in the leftmost entry column); for example, $6.4^2 = 40.96$.
The subsequent columns headed 1, 2, ..., 9 contain the squares of all three-digit numbers rounded
off to four significant figures (Fig. 2.1-1), namely the squares of numbers ending in 1 in the column
marked 1, those ending in 2 in the column marked 2 etc. Thus, the square of 6.44 stands at the
intersection of the row 6.4 with the column 4: $6.44^2 = 41.47$, rounded off to four figures, whereas
the true value is 41.4736. This gives at the same time the squares of the numbers 64.4, 644, 0.644,
0.0644 etc. with the same accuracy, because a rough estimate shows that, say, $64.4^2$ must lie
between $60^2 = 3600$ and $70^2 = 4900$, hence must be 4147 to four significant figures. Today a pocket
calculator or some type of computer is usually used instead. Nevertheless it is sometimes helpful
to be able to use tables, for example when many significant figures are wanted.

If the basis whose square is required has four digits, then the difference $d$ of the squares of the
neighbouring three-digit numbers (table difference) is divided evenly among the ten intervals between
the possible fourth digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. For example, since according to the table
$6.447^2$ lies between $6.440^2 = 41.47$ and $6.450^2 = 41.60$, the difference of 0.13 (or 13 units in the 4th
place) is divided among 10 intervals of which each is allotted $0.13 : 10 = 0.013$ (or 1.3 units); hence
in $6.447^2$ the required 7 units in the fourth place of the basis correspond to $7 \cdot 0.013 = 0.091$ (or
$7 \cdot 1.3 = 9.1$ units). Generally, for the difference $d$ and $z$ units in the fourth place of the basis the
share is $c = d \cdot z/10$ units. The correction $c$ of the table value, rounded off to four significant figures,
is 0.09; so one obtains $6.447^2 = 41.47 + 0.09 = 41.56$. Because of the even distribution of the
table difference one speaks of linear interpolation. Whereas the curve of the squares is a parabola,
one assumes a straight line, a chord of the parabola, between the points corresponding to the squares
$6.44^2$ and $6.45^2$. The smaller the difference between neighbouring table values, the smaller is the
discrepancy between the value found by interpolation and the true value. For $7.607^2$ one obtains
in succession $7.60^2 = 57.76$, $7.61^2 = 57.91$, $d = 15$, $c = d \cdot z/10 = 15 \cdot 7/10 = 10.5 \approx 10$, hence
$7.607^2 = 57.86$. Tables of cubes are arranged similarly (Fig. 2.1-2).
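
The interpolation rule $c = d \cdot z/10$ can be written out as a short function. This is a sketch only: `interpolate` is a hypothetical name, and the two table values are passed in by hand rather than read from an actual table.

```python
def interpolate(lower, upper, z):
    """Add to the lower table value the share c = d*z/10 of the table difference d."""
    d = upper - lower
    c = d * z / 10
    return lower + c


approx = interpolate(41.47, 41.60, 7)      # table values for 6.44**2 and 6.45**2
print(round(approx, 2), round(6.447 ** 2, 4))

approx2 = interpolate(57.76, 57.91, 7)     # table values for 7.60**2 and 7.61**2
print(approx2, 7.607 ** 2)
```

Comparing each interpolated value with the directly computed square shows how small the error of the chord approximation is over one table interval.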


2.1-1 Looking up the square $6.44^2 = 41.47$

2.1-2 Looking up the cube $1.57^3 = 3.870$

2.1-3 Square of twice a given area


Roots


The concept of a root. The Greeks were already familiar with the question of finding the length
of the side of a square whose area is known. It is easy to state the length $x$ of the side when the
area $x^2$ is a square such as 4 m², 9 m², 16 m² etc.; from $x_1^2 = 3^2$ m² or $x_2^2 = 0.5^2$ m² it follows,
of course, that $x_1 = 3$ m or $x_2 = 0.5$ m. For the general case, when the area is an arbitrary positive
(real) number, no general solution was then known; in PLATO's dialogue 'Menon', SOCRATES
explains by lengthy geometric discussions that the diagonal of a square of side 1 is itself the side
of a square of area 2 (Fig. 2.1-3). Today Menon would summarize the geometric content of the dialogue
in the statement that the side $x$ of a square of area 2 has the value $x = \sqrt{2}$. Here the symbol $\sqrt{2}$
(read: square root of 2) denotes the number $x$ which when multiplied by itself (squared) gives the
value 2. The problem of actually finding the numerical value of this number $x$ can be solved exactly only
in exceptional cases, for example, $\sqrt{9} = 3$, because $3^2 = 9$, or $\sqrt{0.0144} = 0.12$, because $0.12^2 = 0.0144$.


The square root $x = \sqrt{a}$ of a non-negative real number $a$ is defined as the non-negative number $x$
whose square is $a$; $x^2 = a$.
Similarly, the Delian problem of the Greek mathematicians, to find the side of a cube whose
volume is twice that of a cube with edges of length 1, leads to a third or cube root. Today the problem
would be expressed as follows: the edge $e$ of a cube of volume 2 is the number $e = \sqrt[3]{2}$ (read: cube
root of 2) whose third power has the value 2, $e^3 = 2$. To find this number is easy only in exceptional
cases, for example, $\sqrt[3]{8} = 2$, because $2^3 = 8$, or $\sqrt[3]{0.125} = 0.5$, because $0.5^3 = 0.125$. Just as the
statements $x = \sqrt{a}$ and $x^2 = a$, $x \geq 0$, or $e = \sqrt[3]{a}$ and $e^3 = a$ are equivalent, so for $b \geq 0$


$a = \sqrt[n]{b}$ and $a^n = b$, $a \geq 0$,
are defined to be equivalent.


Definition: The $n$th root $a = \sqrt[n]{b}$ of a non-negative real number $b$ is that non-negative real number $a$
whose $n$th power $a^n$ has the value $b$; $a^n = b$.
The extraction of roots is to be regarded as an inversion of exponentiation. The number $b$ from
which the root is to be extracted is called the radicand and corresponds to the power; the value $a$
corresponds to the basis of the power, and the term exponent is also used here for $n$.


Since $1^n = 1$ for every positive integer $n$, one has $\sqrt[n]{1} = 1$; also from $0^n = 0$ it follows that
$\sqrt[n]{0} = 0$. For the sake of completeness one sets $\sqrt[1]{a} = a$. The equation $x^2 = 4$ has two solutions,
$x_1 = +2$ and $x_2 = -2$, because $x_1^2 = x_2^2 = 4$. The root $\sqrt[n]{b} = x$, however, is uniquely determined. For even
$n$ the two solutions of the equation $x^n = b$ must therefore be distinguished by their sign.
The equation $x^3 = -8$ has the solution $x = -2$. Since in the definition the radicand is assumed
to be non-negative, one has to set $x = -\sqrt[3]{-(-8)} = -\sqrt[3]{8}$. For odd $n$ and $b < 0$ a solution of
the equation $x^n = b$ is $x = -\sqrt[n]{-b}$. Frequently the notation $x = \sqrt[3]{-8}$ is used for the solution
of $x^3 = -8$, and then it is tacitly agreed that for odd $n$ the root of a negative number is the negative
root of its absolute value. As long as one stays within the domain of positive numbers, extraction
of roots and exponentiation are operations inverse to each other.
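
The sign conventions just described can be collected in one small function. The sketch below is an illustration under the stated conventions, with the hypothetical name `real_root`; it works with floating-point numbers, so the results are approximations.

```python
def real_root(b, n):
    """Real n-th root of b following the conventions of the text.

    b >= 0:          the non-negative number a with a**n == b.
    b < 0, n odd:    minus the n-th root of the absolute value of b.
    b < 0, n even:   no real root exists.
    """
    if b >= 0:
        return b ** (1.0 / n)
    if n % 2 == 1:
        return -((-b) ** (1.0 / n))
    raise ValueError("even root of a negative number is not real")


print(real_root(9, 2), real_root(0.125, 3), real_root(-8, 3))
```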


Calculation of roots. In applications square and cube roots are of frequent occurrence. They are 
the only ones to be considered in what follows. Of the various available methods one chooses in 
a given problem the one that yields for the smallest calculating effort a result of sufficient accuracy. 

Numerical methods. As early as the 16th century STIFEL indicated numerical methods for the 
extraction of roots up to the seventh. Today one uses logarithms. It is therefore sufficient to explain 
the method for square roots. 

First some remarks on the number of digits. The square of a two-digit number like 21 or 85 has
3 or 4 digits. Generally, the square of an $n$-digit number has $2n - 1$ or $2n$ digits. Since extraction
of roots is inverse to exponentiation, the square root of a number with $2n - 1$ or $2n$ digits has $n$
digits. Examples are $\sqrt{441} = 21$ and $\sqrt{7225} = 85$. It is easy to determine the number of digits of
a root: starting from the decimal point, divide the radicand in both directions into groups of two
digits. Then the number of digits of the root before and after the decimal point is equal to the
number of groups before and after the decimal point. In the example $\sqrt{39|90|06.98|89} = 631.67$
there are 3 places before and 2 after the decimal point.

Considering 441, one knows that the root must have two digits, that is, must be of the form
$a + b$, where $a$ is a multiple of 10. Consequently, $441 = (a + b)^2 = a^2 + 2ab + b^2$ or
$441 = a^2 + (2a + b)b$. This is utilized in calculating the square root, by subtracting from the
radicand first $a^2$ and then the second term $(2a + b)b$:

    √(4|41)          = 20 + 1 = 21        (a + b)
    −a²                −4 00
                          41
    −(2a + b)b         −  41
                           0


A root with more than two digits is computed analogously:


    √(57|45.64)      = 70 + 5 + 0.8 = 75.8    (a + b + c)
    −a²                −49 00
                         8 45
    −(2a + b)b         − 7 25
                         1 20.64
    −(2a + 2b + c)c    − 1 20.64
                         0
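
The same scheme, applied group by group, gives a complete procedure for integers. In the sketch below, which uses the hypothetical name `isqrt_by_pairs`, the part of the root found so far plays the role of $a$ (up to a factor 10), and in each step the largest digit $b$ with $(2a + b)b$ not exceeding the current remainder is subtracted. Decimal radicands such as 5745.64 are handled by multiplying by a power of 100 beforehand and shifting the decimal point of the result.

```python
def isqrt_by_pairs(n):
    """Return (root, remainder) with root**2 + remainder == n, digit by digit."""
    s = str(n)
    if len(s) % 2:
        s = "0" + s              # pad so that the digits split into pairs
    root = 0                     # part of the root found so far
    remainder = 0                # what is left after the subtractions so far
    for i in range(0, len(s), 2):
        remainder = remainder * 100 + int(s[i:i + 2])
        # With a = 10 * root, find the largest digit b with (2a + b) * b <= remainder.
        b = 9
        while (20 * root + b) * b > remainder:
            b -= 1
        remainder -= (20 * root + b) * b
        root = 10 * root + b
    return root, remainder


print(isqrt_by_pairs(441))       # (21, 0)
print(isqrt_by_pairs(574564))    # (758, 0), i.e. the root of 5745.64 is 75.8
```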


Approximate formulae. Here approximation methods are mentioned that were known and used
in antiquity. Some others are discussed in Chapter 21.

If $a$ is very large compared with $b$, then in the expressions $(a + b/(2a))^2 = a^2 + b + b^2/(4a^2)$ and
$(a + b/(3a^2))^3 = a^3 + b + b^2/(3a^3) + b^3/(27a^6)$ the terms with powers of $a$ in the denominator
are small and negligible. This leads to approximate values for the square and cube root.


Examples: 1. $\sqrt{35} = \sqrt{36 - 1} \approx 6 - 1/12 = 5.917$, the exact value being 5.91608...
2. $\sqrt[3]{730000} = \sqrt[3]{729000 + 1000} \approx 90 + 1000/(3 \cdot 90^2) = 90.041$, the exact value being 90.0411...
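
The two approximations $\sqrt{a^2 + b} \approx a + b/(2a)$ and $\sqrt[3]{a^3 + b} \approx a + b/(3a^2)$ are easy to compare with directly computed roots. The function names below are hypothetical, and the comparison values come from floating-point arithmetic.

```python
def sqrt_approx(a, b):
    """Approximate sqrt(a**2 + b) by a + b/(2a); good when |b| is small against a**2."""
    return a + b / (2 * a)


def cbrt_approx(a, b):
    """Approximate the cube root of a**3 + b by a + b/(3a**2)."""
    return a + b / (3 * a ** 2)


print(sqrt_approx(6, -1), 35 ** 0.5)
print(cbrt_approx(90, 1000), 730000 ** (1 / 3))
```

In both examples the approximation agrees with the computed root to about three decimal places.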


Extraction of roots by other means. The simplest way is to use a pocket calculator or some other
type of computer. Slide rules and logarithmic tables have also often been used. Extraction of roots
by logarithms instead of a slide rule has the advantage that roots with an arbitrary exponent can
be calculated quickly and without much work.

To extract roots graphically one can utilize the graph of the power function $y^n = x$. When a
sufficiently accurate drawing is available or the required accuracy is fairly low, then points lying
between those used in the construction of the curve can serve to determine, for a given positive
value $x$ of the ordinate, the appropriate value $y = \sqrt[n]{x}$ of the abscissa as the value of the $n$th root.
Nomograms are also used very frequently. For example, one can represent the connection between
the height $l$, the diameter $d$, and the volume $V$ of a cylinder in a nomogram (Fig. 2.1-4).


2.1-4 Nomogram for the computation of the diameter $d = \sqrt{4V/(\pi l)}$ of cylindrical bodies; $d$ in cm,
if $V$ in cm³ and $l$ in cm

2.1-5 Square root of a product, $c = \sqrt{ab}$


Example: What is the diameter of a cylinder 20 cm long and of volume 160 cm³?

On the volume scale one marks the point 160 cm³ and on the length scale the point 20 cm. One
joins the points by a straight line which cuts the middle scale at the required diameter. In this
case the diameter is 3.2 cm. Similarly: a cylinder of volume 900 cm³ and diameter 8 cm has a length
of 18 cm.

The geometric mean can also be represented graphically and then serves as a nomogram for the
square root of a product (Fig. 2.1-5).
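
The relation that the nomogram encodes, $V = \pi d^2 l/4$, can also be evaluated directly. The function names `diameter` and `cylinder_length` below are chosen for illustration; the results are floating-point values and can be compared with the readings quoted in the example.

```python
import math


def diameter(volume, length):
    """Diameter of a cylinder: d = sqrt(4V / (pi * l))."""
    return math.sqrt(4 * volume / (math.pi * length))


def cylinder_length(volume, d):
    """Length of a cylinder of given volume and diameter: l = 4V / (pi * d**2)."""
    return 4 * volume / (math.pi * d ** 2)


print(round(diameter(160, 20), 2))          # close to the reading 3.2 cm
print(round(cylinder_length(900, 8), 1))    # close to the reading 18 cm
```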

Extraction of roots by means of tables. A table of squares contains the squares, rounded off to four
significant figures. By the definition of the square root they can be regarded as radicands. The
sequence of digits of the root of a radicand results from the digits at the beginning of the row of the
radicand and the digit at the head of the column in which the radicand stands; for example,
$\sqrt{76.39} = 8.74$, $\sqrt{0.1136} = 0.337$, $\sqrt{2777} = 52.7$, or $\sqrt{2.403} = 1.55$ (Fig. 2.1-6).

If the radicand lies between two squares of the table, then apart from the table difference $d$ the
correction $c$ is given, and from the formula $c = d \cdot z/10$ for looking up the squares one now finds the
fourth significant figure $z = 10 \cdot c/d$; for example, $\sqrt{56.90}$ lies between $\sqrt{56.85}$ and $\sqrt{57.00}$ and
therefore has a value between 7.54 and 7.55; the table difference is $d = 15$, the correction $c = 5$, and
the fourth digit $z = 50/15 \approx 3$; the result is $\sqrt{56.90} = 7.543$.
Similarly one finds $\sqrt{15.78} = 3.972$ or $\sqrt{6666} = 81.64$.
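
Reading the table backwards amounts to solving $c = d \cdot z/10$ for $z$. A minimal sketch, with the hypothetical name `fourth_digit` and the neighbouring table squares supplied by hand:

```python
def fourth_digit(radicand, lower_square, upper_square):
    """Fourth significant figure z = 10*c/d, rounded to the nearest digit."""
    d = upper_square - lower_square    # table difference
    c = radicand - lower_square        # correction
    return round(10 * c / d)


z = fourth_digit(56.90, 56.85, 57.00)   # 56.85 and 57.00 are 7.54**2 and 7.55**2, rounded
root = 7.54 + z / 1000
print(z, root, 56.90 ** 0.5)
```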

In a table of square roots the reading and interpolation proceed just as in a table of squares.
It should be observed that now the radicand stands in a row of the entry column and the
corresponding square root in the middle of the table.

Example: $\sqrt{55.3} = 7.437$.


2.1-6 Looking up the square root $\sqrt{2.403} = 1.55$


Everything said about square roots holds analogously for cube roots.
Apart from the exceptional cases when the radicand is a square or an $n$th power, all square roots
or $n$th roots are irrational numbers (see Chapter 3.).


Roots as powers with fractional exponents 


The extraction of roots can be regarded as inversion of exponentiation. Now a power is taken
to the $n$th power, where $n$ is a positive integer, by multiplying its exponent by $n$. Since division is
the inverse of multiplication, the following statement is in accordance with the principle of
permanence:


The $n$th root of a power is extracted by dividing the exponent of the power by $n$.


This defines a new number $a^{m/n}$, which must satisfy $a^{m/n} = \sqrt[n]{a^m}$ when $a$ is a positive real number
and $m$ and $n$ are two positive integers. By the previously derived rules the $n$th power of this number
is $(a^{m/n})^n = a^m$, in agreement with the definition of the $n$th root. At the same time, this representation
is unique: from $m/n = m'/n'$, for example, $4/6 = 6/9$, it follows that $\sqrt[n]{a^m} = \sqrt[n']{a^{m'}}$. For if one raises
the root $\sqrt[n]{a^m}$ to the $(n \cdot n')$th power and bears in mind that $mn' = m'n$, one obtains
$(\sqrt[n]{a^m})^{nn'} = [(\sqrt[n]{a^m})^n]^{n'} = [a^m]^{n'} = a^{mn'}$. Similarly $(\sqrt[n']{a^{m'}})^{nn'} = [(\sqrt[n']{a^{m'}})^{n'}]^n = [a^{m'}]^n = a^{m'n}$, that is, the
same value. This leads to a number of relations.


Examples: 1. $\sqrt[3]{24x^4} = \sqrt[3]{2^3 \cdot 3 \cdot x^3 \cdot x} = \sqrt[3]{2^3} \cdot \sqrt[3]{x^3} \cdot \sqrt[3]{3x} = 2x\sqrt[3]{3x}$.
2. $\sqrt[6]{27} = \sqrt[2 \cdot 3]{3^3} = \sqrt{3}$.
3. $\sqrt{a} = a^{1/2}$.
4. $1/\sqrt[7]{(14a)^3} = (14a)^{-3/7}$.
5. $\sqrt[3]{18a^2b} \cdot \sqrt[3]{12ab^2} \cdot \sqrt[3]{16ab} = \sqrt[3]{2 \cdot 3^2 \cdot 2^2 \cdot 3 \cdot 2^4 \cdot a^4b^4}$
$= \sqrt[3]{2^6 \cdot 2 \cdot 3^3 \cdot a^4b^4} = 2^2 \cdot 3 \cdot a \cdot b \cdot \sqrt[3]{2ab} = 12ab\sqrt[3]{2ab}$.
6. $16^{3/4} = \sqrt[4]{16^3} = \sqrt[4]{(2^4)^3} = \sqrt[4]{2^{12}} = 2^{12/4} = 2^3 = 8$.
7. $[\sqrt[6]{n^2v}]^9 = [(n^2v)^{1/6}]^9 = n^{2 \cdot 9/6} \cdot v^{9/6} = n^3 \cdot v^{1 + 1/2} = n^3 \cdot v \cdot v^{1/2} = n^3v\sqrt{v}$.
8. $\sqrt[n]{a + \frac{a}{a^n - 1}} = \sqrt[n]{\frac{a^{n+1}}{a^n - 1}} = a\sqrt[n]{\frac{a}{a^n - 1}}$;
for $a = 3$, $n = 2$ this means $\sqrt{3 + (3/8)} = 3\sqrt{3/8}$ and for
$a = 2$, $n = 5$ it means $\sqrt[5]{2 + (2/31)} = 2\sqrt[5]{2/31}$.
9. $\sqrt[5]{\sqrt{32}} = [(2^5)^{1/2}]^{1/5} = 2^{5/10} = 2^{1/2} = \sqrt{2}$.
10. $\sqrt[3]{a^5 \cdot \sqrt[4]{a^5}} = [a^5(a^5)^{1/4}]^{1/3} = a^{5/3} \cdot a^{5/12} = a^{5/3 + 5/12} = a^{25/12} = a^{2 + 1/12} = a^2\sqrt[12]{a}$.
11. $\sqrt{3\sqrt{3\sqrt{3}}} = [3 \cdot (3 \cdot 3^{1/2})^{1/2}]^{1/2} = 3^{1/2} \cdot 3^{1/4} \cdot 3^{1/8} = 3^{1/2 + 1/4 + 1/8} = 3^{7/8} = \sqrt[8]{3^7}$.


Rationalizing a denominator. Roots of integers or rational numbers are, in general, irrational
numbers and can be represented by infinite non-periodic decimal fractions. One therefore tries to
avoid division by roots, that is, fractions whose denominator is a root of a rational number. In
such cases one can always find a number by which the fraction can be extended so as to make the
denominator rational. If the denominator is $\sqrt[n]{a^m} = a^{m/n}$, with $a$ rational, then on multiplication
by $a^m : a^{m/n} = a^{m(1 - 1/n)} = a^{m(n-1)/n}$ it assumes the value $a^m$ and becomes rational. For example,
the fraction $1/\sqrt[3]{7}$ must be extended by $7^{2/3} = \sqrt[3]{49}$.


A denominator of the form $\sqrt{a} - \sqrt{b}$ becomes rational on multiplication by $\sqrt{a} + \sqrt{b}$, because
the result is $a - b$.


Example: $\frac{\sqrt{3}}{\sqrt{3} - \sqrt{2}} = \frac{\sqrt{3}(\sqrt{3} + \sqrt{2})}{3 - 2} = 3 + \sqrt{6}$.



Powers with irrational exponents. Starting out from the definition of the power as a product of
an integral number of equal factors, its meaning has been extended in accordance with the principle
of permanence first to negative and then to arbitrary rational exponents. It is plausible to go a step
further and to admit irrational exponents. Let α be such an irrational positive exponent. The problem
is to define b^α for a given base b. Now α can be represented as an infinite non-periodic decimal fraction,
that is, in the form α = a.a_1a_2a_3 ... a_i ..., which means

α = a + a_1/10 + a_2/10^2 + a_3/10^3 + ... + a_i/10^i + ... .

Here a is a whole number, the a_i are digits between 0 and 9, a_1 indicates the number of tenths, a_2
that of hundredths etc. The digits of the decimal fraction are not all zero from a certain place
onwards, nor do they repeat themselves regularly in a period. For example, for α = √2 = 1.414 213 56...
one has a = 1, a_1 = 4, a_2 = 1, a_3 = 4, a_4 = 2, a_5 = 1, a_6 = 3, a_7 = 5, a_8 = 6, ... If one breaks
off the sum for α after the term a_i/10^i, one obtains an approximation α_i which differs from the true
value α by less than 1/10^i. Now it can be proved that as α_i comes closer and closer to the irrational
number α, so for any positive basis b the numbers b^(α_i), which are powers with a rational exponent,
come closer and closer to a certain number, which is then defined to be b^α.

For a negative exponent α the same arguments apply to the denominator of the fraction 1/b^(−α),
whose exponent is positive. It can be shown that with these definitions the power laws are valid
for arbitrary real exponents.
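The limiting process can be followed numerically. The following Python sketch breaks off the decimal expansion of √2 after i places and raises the basis to each of these rational exponents; the function names and the choice b = 2 are illustrative assumptions, and floating-point arithmetic stands in for exact computation.

```python
from fractions import Fraction
from math import isqrt

def truncated_sqrt2(i):
    """Decimal expansion of sqrt(2) broken off after i places, as an exact fraction."""
    # isqrt(2 * 10**(2*i)) equals floor(sqrt(2) * 10**i)
    return Fraction(isqrt(2 * 10 ** (2 * i)), 10 ** i)

def power_approximations(b, places):
    """Values b**alpha_i for the truncations alpha_i, i = 1, ..., places."""
    return [b ** float(truncated_sqrt2(i)) for i in range(1, places + 1)]

if __name__ == "__main__":
    for i, value in enumerate(power_approximations(2.0, 8), start=1):
        print(i, float(truncated_sqrt2(i)), value)
```

The printed values increase with i and settle down, which is the behaviour the definition of b^α relies on.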


2.2. Calculations with logarithms 


150 years ago there were poets who regarded the logarithmic table as the very essence of mathe- 
matics. ‘What the logarithms are in relation to mathematics, that is mathematics in relation to the 
sciences’ (NOVALIS). Nowadays the logarithmic table has long since been dethroned in this sense 
and its place in numerical calculations has been taken by computers and pocket calculators (which 
however sometimes use logarithms in their hardware realizations). 


Logarithmic laws and logarithmic systems 


Multiplication by means of powers. If one compiles a list of the exponents l, the powers p, and their
values n for a basis, say 2, one can easily carry out multiplication and division of the values n by
means of their exponents l: for example, 4 · 8 = 2^2 · 2^3 = 2^(2+3) = 32, or 16 : 64 = 2^4 : 2^6 = 2^(−2) = 1/4.
According to the power laws, if one uses the table on the left, one only has to calculate 2 + 3 = 5
or 4 − 6 = −2. Instead of exponentiating it is sufficient to multiply: 4^3 = (2^2)^3 = 2^6 = 64;
similarly, extraction of roots can be replaced by division, for example
⁴√(1/16) = 2^(−4/4) = 1/2. All arithmetical operations are reduced to
the next lower kind if instead of the powers occurring in a problem
one calculates with the corresponding exponents. A disadvantage of
the method is that for numbers lying between powers of 2 the appropriate
exponents are not known. But once one has calculated, for
example, ¹⁰⁰√2 = 2^0.01 = 1.006956, the powers of this number for all
exponents between 0.01 and 1 yield the required values; for example,
2^0.02 = (2^0.01)^2 = 1.013960 or 2^0.1 = (2^0.01)^10 = 1.071773. Intermediate
values between the other powers can also be found. For
example, 2^1.01 = 2^(1+0.01) = 2 · 2^0.01 = 2.013912 or 2^3.1 = 2^(3+0.1)
= 8 · 1.071773 = 8.574184. All these powers of 2 with exponents
that are rational but not integral are irrational numbers;
it is possible to find exponents α for which the power 2^α is an arbitrary given number.

It is easy to show that for a basis b > 1 and an arbitrary positive number x there must be an
exponent l such that the power b^l = x. For sufficiently large positive exponents the powers of
b > 1 exceed any given real number x > 1, and for sufficiently large negative exponents they are
smaller than any given real number x with 0 < x < 1. Hence there is an integer a such that
b^a ≤ x < b^(a+1). If one divides the interval from a to a + 1 into ten equal parts of length 1/10, then
one can find an integer a_1 between 0 and 9 such that b^(a + a_1/10) ≤ x < b^(a + a_1/10 + 1/10). If one continues to
divide the interval between the last two exponents into ten equal parts, one finds for the smaller
exponent a decimal fraction α = a.a_1a_2 ... a_i ... = a + a_1/10 + a_2/10^2 + ... + a_i/10^i + ... . This
may terminate, in which case x = b^(α_i), where α_i = a.a_1a_2 ... a_i is rational, or it comes arbitrarily
near a real number α for which b^α = x. Numerically these exponents l = α can be determined by
a series expansion. They are called the logarithms to the basis b of the number x, and are written
l = log_b x. By the definition the basis b must be greater than 1 and the number x positive. All the
logarithms to a fixed basis b form the logarithmic system with basis b.
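The repeated division of the exponent interval into ten parts can be written down directly. The following Python sketch finds the integer part a and the first decimal digits a_1, a_2, ... of log_b x in this way; the function name is an assumption made for illustration, and floating-point powers are used, so a digit may be off when x lies extremely close to a subdivision point.

```python
def log_digits(x, b, places):
    """Return (a, [a_1, ..., a_places]) such that the decimal a.a_1a_2... is
    log_b(x) broken off after the given number of places."""
    if x <= 0 or b <= 1:
        raise ValueError("need x > 0 and b > 1")
    # Integer part: find a with b**a <= x < b**(a + 1).
    a = 0
    while b ** (a + 1) <= x:
        a += 1
    while b ** a > x:
        a -= 1
    digits = []
    exponent = float(a)
    step = 1.0
    for _ in range(places):
        step /= 10                      # ten equal parts of the last interval
        d = 0
        while d < 9 and b ** (exponent + (d + 1) * step) <= x:
            d += 1
        exponent += d * step
        digits.append(d)
    return a, digits

if __name__ == "__main__":
    print(log_digits(2.37, 10, 4))
    print(log_digits(8, 2, 3))
```

For a number below 1 the integer part comes out negative and the digits are those of a positive decimal fraction added to it.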




The actual logarithms considered at the beginning were to the basis 2. The previous results can
now be written as follows:

log_2 2 = 1, for 2^1 = 2;                log_2 4 = 2, for 2^2 = 4;
log_2 32 = 5, for 2^5 = 32;              log_2 (1/16) = −4, for 2^(−4) = 1/2^4 = 1/16;
log_2 1.006956 = 0.01, for 2^0.01 = 1.006956;
log_2 1.071773 = 0.1, for 2^0.1 = 1.071773;
log_2 8.574184 = 3.1, for 2^3.1 = 8.574184;
log_2 (4 · 8) = log_2 4 + log_2 8 = 2 + 3 = 5 = log_2 32;
log_2 (16 : 64) = log_2 16 − log_2 64 = 4 − 6 = −2 = log_2 (1/4);
log_2 4^3 = 3 · log_2 4 = 3 · 2 = 6 = log_2 64;
log_2 ⁴√(1/16) = 1/4 · log_2 (1/16) = −4/4 = −1 = log_2 (1/2).


The underlying relations hold for any basis b, because they represent nothing but the power
laws stated for exponents. The sequence of powers ..., b^(−3), b^(−2), b^(−1), 1, b^1, b^2, b^3, ... gives rise to
log_b b^2 = 2, log_b b^3 = 3, ..., log_b b^n = n, log_b b = 1, log_b 1 = 0, log_b (1/b) = −1, log_b (1/b^n) = −n,
that is, the logarithm of 1 is always 0, and further values are as in the table on the right. Because
of their frequent application the rules for calculation with logarithms are stated as separate laws.



First law for logarithms: The logarithm of a product is
equal to the sum of the logarithms of the factors:
log_b (n_1 · n_2) = log_b n_1 + log_b n_2.

For from l = log_b (n_1 · n_2), l_1 = log_b n_1, l_2 = log_b n_2 it follows
that b^l = n_1 · n_2, b^(l_1) = n_1, b^(l_2) = n_2 or b^l = b^(l_1 + l_2), that
is, l = l_1 + l_2.

Second law for logarithms: The logarithm of a quotient
is equal to the difference of the logarithms of the dividend
and the divisor: log_b (n_1/n_2) = log_b n_1 − log_b n_2.

For from l = log_b (n_1/n_2), l_1 = log_b n_1, l_2 = log_b n_2 it follows that b^l = n_1/n_2, b^(l_1) = n_1, b^(l_2) = n_2
or b^l = b^(l_1 − l_2), that is, l = l_1 − l_2.

Example: log_b (1/17) = log_b 1 − log_b 17 = −log_b 17.

Third law for logarithms: The logarithm of a power is equal to the
logarithm of the basis multiplied by the exponent: log_b p^r = r · log_b p.

For from l = log_b p^r, l_1 = log_b p it follows that b^l = p^r, b^(l_1) = p or b^l = (b^(l_1))^r = b^(r · l_1), that is,
l = r · l_1.

Examples: 1. log_b (5^3 · x^2/6^4) = 3 log_b 5 + 2 log_b x − 4 log_b 6. 2. log_b b^r = r log_b b = r · 1 = r.

Fourth law for logarithms: The logarithm of a root is equal to the logarithm
of the radicand divided by the exponent: log_b ʳ√w = (1/r) · log_b w.

For from l = log_b ʳ√w, l_1 = log_b w it follows that b^l = ʳ√w, b^(l_1) = w or b^l = w^(1/r) = b^((1/r) · l_1), that
is, l = (1/r) · l_1.

Example: log_b ³√(q^4/s^2) = 1/3 (log_b q^4 − log_b s^2) = 4/3 log_b q − 2/3 log_b s.


Logarithmic systems. Among all possible logarithmic systems (basis b > 1) only two are in 
common usage: the natural and the decadic logarithms. In higher mathematics the ones used almost 
exclusively are the natural logarithms. They are based on the transcendental number e, which is defined
by the limit lim_{n→∞} (1 + 1/n)^n or by the infinite series:

e = lim_{n→∞} (1 + 1/n)^n = 1 + 1/1! + 1/2! + 1/3! + ... .


The powers of e with variable exponents form the exponential function e^x, which is suitable for
the description of all events whose increase or decrease is proportional to the quantity present at
any given moment, such as radioactive decay or the growth of a forest or a population. In fact,
the first logarithms calculated by mathematicians of the 16th and 17th century belong to this system.
Instead of log_e they are frequently denoted by ln: log_e x = ln x. The series used to calculate logarithms
yields the values of the natural logarithms. The decadic or common logarithms have the basis 10
and were first calculated by BRIGGS. In practical computations they are used almost exclusively
and are then denoted just by lg instead of log_10. If the reader comes across the
symbol log x with no basis specified, he may assume that in elementary mathematics the intended
basis is 10, in higher mathematics it is e. The notation lg x = log_10 x, used in this book, is
recommended internationally and is gradually coming into use. The advantage of the common
logarithms, which lies in the fact that their basis is the same as that of the number system, is
clear from their integral values (see the table). This means that the logarithms to the
basis 10 need only be calculated for the numbers between 1 and 10,
or that for the calculation of logarithms only the sequence of digits
of a number matters. For example, having found that lg 2.37
= 0.3747 one has at the same time the common logarithms of the
numbers 23.7, 2 370, 0.237, 0.002 37 etc., that is, of every number
that is a product of 2.37 and a power of 10 (see table).

number               logarithm
1/1000 = 10^(−3)        −3
1/100 = 10^(−2)         −2
1/10 = 10^(−1)          −1
1                        0
10                       1
100                      2
1000                     3

Logarithms derived from lg 2.37 = 0.3747

number    conversion        sum of logarithms        logarithm                      characteristic
23.7      10 · 2.37         lg 10 + lg 2.37          lg 23.7 = 1.3747                1
2370      10^3 · 2.37       lg 10^3 + lg 2.37        lg 2370 = 3.3747                3
0.237     1/10 · 2.37       lg 1/10 + lg 2.37        lg 0.237 = 0.3747 − 1          −1
0.002 37  1/10^3 · 2.37     lg 1/10^3 + lg 2.37      lg 0.002 37 = 0.3747 − 3       −3


The actual digits 3747 to be calculated for a logarithm are called its mantissa, and the integer 
before the decimal point its characteristic. This characteristic has the value 0 for a number greater 
than or equal to 1 and less than 10; the value 1 for numbers from 10 up to but excluding 100; and 
generally, the value n — 1 for numbers with n digits before the decimal point. If the number is a 
decimal fraction less than 1, then the characteristic is negative and indicates the number of places 
by which the decimal point has to be shifted to the right so as to stand behind the first non-zero 
digit of the number. 

For logarithms to a basis b one need only calculate the values for numbers between 1 and b as 
mantissa; for numbers between b and b? the characteristic is 1, etc. For the decadic logarithms, 
whose basis is the same as that of the number system, the characteristic can be read off immediately. 
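The splitting of a common logarithm into characteristic and mantissa can be illustrated as follows. The Python sketch below is only an illustration under the assumption that the library function math.log10 replaces the table; the function name is hypothetical.

```python
import math

def characteristic_and_mantissa(x):
    """Split lg x into an integer characteristic and a mantissa in [0, 1)."""
    if x <= 0:
        raise ValueError("x must be positive")
    lg = math.log10(x)
    characteristic = math.floor(lg)      # negative for numbers below 1
    mantissa = lg - characteristic       # depends only on the sequence of digits
    return characteristic, mantissa

if __name__ == "__main__":
    for number in (2.37, 23.7, 2370, 0.237, 0.00237):
        c, m = characteristic_and_mantissa(number)
        print(number, c, round(m, 4))
```

All five numbers share the same mantissa; only the characteristic changes with the position of the decimal point.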

Transition from one logarithmic system to another. The fact that the simplest expansions in series
give natural logarithms, but that the common logarithms are needed for practical work makes it
necessary to calculate the logarithm of a number n in a basis b from that of the same number in
another basis a. Suppose that l_a = log_a n is known and that l_b = log_b n is required. Let l_ab = log_a b
be the logarithm of the basis b of the new system referred to the basis a of the known system. In
power notation the three equations are

a^(l_a) = n;   b^(l_b) = n;   a^(l_ab) = b.

Raising the third to the l_b-th power one obtains
a^(l_ab · l_b) = b^(l_b) = n = a^(l_a), that is, a^(l_ab · l_b) = a^(l_a) or l_ab · l_b = l_a.

Chain rule: log_a b · log_b n = log_a n.

If the natural logarithms are taken as known, one has to set a = e and b = 10 and obtains
ln 10 · lg n = ln n. The common logarithms therefore arise from the natural logarithms after
multiplication by the constant 1/ln 10 = M_10, which is called the modulus of the logarithms to
the basis 10.


M_10 = 1/ln 10 = 0.434 294...,   1/M_10 = ln 10 = 2.302 585...

If, on the other hand, natural logarithms are to be calculated from a given table of common logarithms,
then one has to set in the chain rule a = 10, b = e and obtains lg e · ln n = lg n.

The fact that lg e = M_10 can be seen immediately by setting n = 10: from ln n = lg n/lg e one
has ln 10 = 1/lg e or lg e = 1/ln 10 = M_10.
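The transition between the two systems can be checked numerically. The following Python sketch is a minimal illustration; the names M10, lg_from_ln, ln_from_lg and chain_rule_gap are assumptions introduced here, and the library logarithms replace tables.

```python
import math

M10 = 1 / math.log(10)          # modulus of the common logarithms

def lg_from_ln(n):
    """Common logarithm obtained from the natural one: lg n = M10 * ln n."""
    return M10 * math.log(n)

def ln_from_lg(n):
    """Natural logarithm obtained from the common one: ln n = lg n / M10."""
    return math.log10(n) / M10

def chain_rule_gap(a, b, n):
    """log_a(b) * log_b(n) - log_a(n); zero up to rounding by the chain rule."""
    return math.log(b, a) * math.log(n, b) - math.log(n, a)

if __name__ == "__main__":
    print(M10, math.log10(math.e))
    print(lg_from_ln(2.37), ln_from_lg(2.0))
    print(chain_rule_gap(2, 10, 37.0))
```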


Inverse operations. Addition and multiplication have one inverse operation each, namely
subtraction and division, for from s_1 + s_2 = s it follows that s_1 = s − s_2 or s_2 = s − s_1. Similarly,
from f_1 · f_2 = p it follows that f_1 = p/f_2 or f_2 = p/f_1. But if one wishes to calculate from a given
power r^q = p the basis r or the exponent q, then two distinct operations come into play: the basis
arises as the q-th root r = p^(1/q), the exponent as logarithm q = log_r p. By substituting in the power r^q = p
the formal inversions one obtains either (p^(1/q))^q = p, that is, the definition of the root, or the
equation r^(log_r p) = p, that is, the definition of the logarithm. The root as inverse to exponentiation,
however, assumes a rational exponent: when q = t/s, where t and s are coprime integers, then
r^(t/s) = p immediately leads to r^t = p^s, r = (p^s)^(1/t). But if q has the irrational value α, then the root can only
be interpreted as a power with a real exponent: r = p^(1/α). To calculate the power p or the root r
for an irrational exponent α one utilizes logarithms. From p = r^α one obtains by taking logarithms
on both sides to the basis 10, say, lg p = α lg r and then p = 10^(α lg r), or from lg r = (lg p)/α the value
r = 10^((lg p)/α).

Logarithms of numbers other than powers, with rational exponents, of the basis b of the logarithmic
system are irrational. For example, if lg 2 = t/s were a rational number, with t and s coprime
integers and s > t, then 10^(t/s) = 2 or 10^t = 2^s or after cancelling 5^t = 2^(s−t), in contradiction to the
theorem on the unique factorization of integers into prime factors. The argument may be generalized
to any basis b and any integer n, which may be assumed to lie between 1 and b. The contradiction
in the equation b^t = n^s then results from the fact that b on account of b > n must have a divisor
other than n.


Application of logarithms. About the invention of logarithms LAPLACE said: ‘The invention of 
logarithms shortens calculations extending over months to just a few days and thereby, as it were, 
doubles the life-span of the calculators.’ 

But the significance of logarithms is not exhausted by the immense simplification of computing. 
The concept of a logarithm is a working tool in many branches of higher mathematics, for example, 
in the differential and integral calculus, differential equations, in complex analysis, potential theory, 
and analytic number theory. 

In thermodynamics the entropy S of a body or a system of bodies is directly proportional to the
natural logarithm of the thermodynamic probability W, that is, S = k · ln W, where k is the
Planck-Boltzmann constant (k = 1.380 · 10^(−16) erg/deg).

In astronomy the magnitude (brightness) m of a star is measured not by the energy I meeting
the eye, but by its logarithm. Here m − m_0 = −2.5 lg (I/I_0), where I_0 is the radiation energy for
the magnitude m_0 and I that for m.

Example: If the absolute magnitude of the sun is M_0 = +4.7 and that of the star Rigel in the
constellation Orion is M = −5.8, one obtains −5.8 − 4.7 = −2.5 lg (I/I_0), 10.5/2.5 = 4.20
= lg (I/I_0) or I/I_0 ≈ 16 000, so that the star Rigel radiates per second about 16 000 times as much
energy as the sun.
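The relation between magnitudes and radiation energies can be evaluated in both directions. The Python sketch below is an illustration only; the function names are assumptions, and the magnitudes are those of the example above.

```python
import math

def intensity_ratio(m, m0):
    """I/I_0 from the relation m - m_0 = -2.5 lg(I/I_0)."""
    return 10 ** (-(m - m0) / 2.5)

def magnitude_difference(ratio):
    """m - m_0 for a given ratio I/I_0 of radiation energies."""
    return -2.5 * math.log10(ratio)

if __name__ == "__main__":
    ratio = intensity_ratio(-5.8, 4.7)      # Rigel compared with the sun
    print(ratio)
    print(magnitude_difference(ratio))
```

The ratio comes out somewhat below 16 000, in agreement with the rounded value of the example.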


The law can be regarded as a special case of the Weber-Fechner law, which states that the amount 
of a perception is proportional to the natural logarithm of the stimulus, in other words, not dif- 
ferences but quotients of stimuli are perceived as equal. 

In the barometric height formula h − h_0 = 60 370 (lg b_0 − lg b), h_0 and b_0 are the known height
in feet and barometer reading in inches mercury of a place, and b is the barometer reading at another
place whose height h is required.

Example: What is the height above ground of an aeroplane for which the pressure of the
surrounding air is measured as 17.60 in. Hg, while a station on the ground reports b_0 = 22.15 in. Hg?
Since lg b_0 = 1.3454 and lg b = 1.2455, one obtains approximately h = 60 370 · 0.1 ft. = 6037 ft.
as the height of the aeroplane above ground.
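The barometric height formula lends itself to a direct calculation. In the following Python sketch the function name is an assumption and math.log10 replaces the table; because the difference of the logarithms is not rounded to 0.1, the result is a few feet below the value of the example.

```python
import math

def height_difference_ft(b0, b):
    """h - h_0 in feet from the barometer readings b0 (known place) and b."""
    return 60370 * (math.log10(b0) - math.log10(b))

if __name__ == "__main__":
    print(height_difference_ft(22.15, 17.60))
```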

In biology the compound interest formula b_n = b (1 + p/100)^n can be used, for example, to
calculate the number of years needed for a certain increase in the volume of wood of a forest. Here
b and b_n are the volumes at the beginning and the end of the period in question, and p is the annual
rate of growth in percent. The formula holds generally for organic growth with a given annual
percentage rate. For example, in the investigation of radioactive substances it was found that of n
nuclei present at a given instant λn nuclei disintegrate per unit of time, where λ is a number between 0 and 1. Hence
the differential equation dn/dt = −λn holds, which can be integrated by separation of the variables:
dn/n = −λ dt or ln (n/n_0) = −λt, n = n_0 e^(−λt), where the integration constant n_0 is the number of
nuclei present at the time t = 0. The time T at which half the nuclei have disintegrated is called the
half-life; it is given by ln (1/2) = −λT or T = (ln 2)/λ.
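Both formulae are solved for the time by taking logarithms. The following Python sketch illustrates this; the function names and the sample values (doubling at 3 percent per year, a decay constant of 0.1) are assumptions chosen for illustration.

```python
import math

def years_to_reach(b, b_n, p):
    """Solve b_n = b * (1 + p/100)**n for n by taking logarithms."""
    return (math.log10(b_n) - math.log10(b)) / math.log10(1 + p / 100)

def half_life(lam):
    """T = (ln 2)/lambda for the decay law n = n_0 * exp(-lambda * t)."""
    return math.log(2) / lam

if __name__ == "__main__":
    print(years_to_reach(1.0, 2.0, 3.0))    # years needed to double at 3 percent
    print(half_life(0.1))
```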


Practical logarithmic calculations 


In numerical calculations the decadic or common logarithms have been of main importance. It is 
sufficient to discuss this system, because of the simple relations for the transitions from the 
logarithms of one system to those of any other. It was shown above that logarithmic tables need 
only contain the values for numbers between 1 and 10. All other numbers can be represented in 
the decimal system as products of one of these numbers and of a power of 10. The exponent of 
this power is an integer, which for numbers greater than 10 is positive, and for numbers less than 
1 is negative; it is called the characteristic and is 0 for numbers between 1 and 10, whose logarithms 


are decimal fractions between 0 and 1. The sequence of their digits after the decimal point is called 
the mantissa. 


Logarithmic tables. According to the number of digits to which the irrational values of the log- 
arithms are rounded off one speaks of 4-, or 5-, or 7-place (or -figure) logarithms. Logarithms with 
more figures are rarely used and only for special purposes. The leftmost or entry column contains 
the first 2, or 3, or 4 digits of the number, that is, the digits of numbers from 10 to 99 (4-figure),
from 100 to 999 (5-figure), or from 1000 to 9999 (7-figure). Behind every number of the entry
column there are the 10 mantissae of the numbers whose third, fourth, or fifth digits have the ap- 
propriate value between 0 and 9. In 5-figure tables many mantissae would have the same first two 
digits, for example 97 for the values lg 9.333 = 0.97002, ..., lg 9.340 = 0.97035, ..., lg 9.549 
= 0.97996, altogether 217 mantissae! To avoid these unnecessary and unwieldy repetitions one 
places these first two common digits in a separate column before the column of the fourth digits headed 
0, and only once at the beginning of the row in which all the mantissae start with these two digits;
in the example (Fig.) to the right of 934. The parts 002 up to 030 of the preceding mantissae stand 


930           848   853   858     876   881   886   890
931           895   900   904     923   928   932   937
932           942   946   951     970   974   979   984
933           988   993   997    *016  *021  *025  *030
934 | 97      035   039   044     063   067   072   077
935           081   086   090     109   114   118   123
936           128   132   137     155   160   165   169
937           174   179   183     202   206   211   216


2.2-1 The rows 930 to 937 of the entry column of a 5-figure logarithmic table

2.2-2 Looking up the mantissa of lg 5.728 = 0.7580

2.2-3 Graphical interpolation for the mantissa; lg 5.728 = 0.7580


therefore under the pair 96, but belong to 97. They are therefore marked with an asterisk: *002 
up to *030 (Fig.) or by heavy type or in other manners. Similarly in 7-figure tables the first 3 digits 
of the mantissae are extracted. 

The last digits z of the numbers are accounted for, as usual, by linear interpolation (see Squares).
The table difference d of the neighbouring mantissae is divided into 10 equal parts d/10 whose
z-fold z · d/10 = c gives the correction of the mantissa. For example, lg 5.728 lies between 0.7574
and 0.7582 (Fig.), the table difference d is 0.7582 − 0.7574 = 0.0008, the next digit z of the number
is 8, hence the correction to the mantissa is c = z · d/10 = 8 · 8/10 = 6.4 ≈ 6 units of the fourth
place after the decimal point: lg 5.728 = 0.7574 + 0.0006 = 0.7580 (Fig.). In a 4-figure table
lg 5.7283 has the same value, but in a 5-figure table it lies between lg 5.728 = 0.75 800 and lg 5.729
= 0.75 808, hence for d = 8, z = 3 and c = 0.8 · 3 = 2.4 it has the value lg 5.7283 = 0.75 802.
The same result is obtained for lg 5.728342; the digits 4 and 2 cannot be taken into account when
working with a 5-figure table. But in a 7-figure table (Fig.) one obtains lg 5.7283 = 0.758 0258 and
lg 5.7284 = 0.758 0333; for the table difference d = 75 the table of proportional parts gives for
the digit 4 the correction c_4 = 30, for 0.2 the correction c_0.2 = 1.5, hence for 42 the correction
c = 31.5 ≈ 32, and so lg 5.728 342 = 0.758 0290.

             0     1     2     3     4     5     6     7     8     9
5728 | 758 0030  0106  0182  0258  0333  0409  0485  0561  0637  0712
5729       0788  0864  0940  1016  1091  1167  1243  1319  1395  1470
5730       1546  1622  1698  1774  1849  1925  2001  2077  2153  2228
5731       2304  2380  2456  2531  2607  2683  2759  2835  2910  2986
5732       3062  3138  3213  3289  3365  3441  3516  3592  3668  3744

2.2-4 The rows 5728 to 5732 of the entry column of a 7-figure logarithmic table
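The interpolation rule c = z · d/10 can be sketched in a few lines of Python. The function name and its parameters are assumptions made for illustration; the mantissae are passed as integers in units of the last table place, and the further digits of the number are given as an integer together with a scale (10 for one digit, 100 for two).

```python
import math

def interpolate_mantissa(lower, upper, digits, scale=10):
    """Linear interpolation between two neighbouring table mantissae.

    lower, upper: tabulated mantissae as integers (units of the last place)
    digits/scale: fraction of the table step covered by the further digits
    """
    d = upper - lower                 # table difference
    c = digits * d / scale            # correction, e.g. z * d/10
    return lower + math.floor(c + 0.5)   # round to the nearest unit, halves upward

if __name__ == "__main__":
    print(interpolate_mantissa(7574, 7582, 8))                # 4-figure, lg 5.728
    print(interpolate_mantissa(75800, 75808, 3))              # 5-figure, lg 5.7283
    print(interpolate_mantissa(7580258, 7580333, 42, 100))    # 7-figure, lg 5.728342
```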

But if a logarithm is given and the appropriate number is required, then apart from the table
difference d the correction c is known, and the next valid digit z is obtained from z = 10 · c/d.

Example 1: lg n_1 = 0.5412 lies between 0.5403 and 0.5416, hence
n_1 between 3.47 and 3.48; here d = 13, c = 9, hence z = 90/13 ≈ 7;
consequently n_1 = 3.477.

Example 2: lg n_2 = 0.50000 lies between lg 3.162 = 0.49996 and
lg 3.163 = 0.50010; here d = 14, c = 4 and z = 40/14 ≈ 3, hence
n_2 = 3.1623.

Example 3: lg n_3 = 0.240 9357 lies between lg 1.7415 = 0.240 9235
and lg 1.7416 = 0.240 9484. The table of proportional parts of d = 249
for c = 122 yields the sixth digit 4, because 4 · 24.9 = 99.6, and since
122 − 99.6 = 22.4, the seventh digit has the value 9, so that
n_3 = 1.741549.

Tables of proportional parts:

       d = 75     d = 249
1        7.5        24.9
2       15.0        49.8
3       22.5        74.7
4       30.0        99.6
5       37.5       124.5
6       45.0       149.4
7       52.5       174.3
8       60.0       199.2
9       67.5       224.1
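The inverse look-up z = 10 · c/d can be sketched in the same way. In this Python illustration the function name and the parameter places are assumptions; the mantissae are again integers in units of the last table place.

```python
import math

def further_digits(lower, upper, target, places=1):
    """Next digit(s) of the number whose mantissa is target.

    lower, upper: neighbouring tabulated mantissae enclosing target
    places: how many further digits to extract (1 gives z = 10 * c/d)
    """
    d = upper - lower                 # table difference
    c = target - lower                # known correction
    # Rounded to the nearest integer; a result of 10**places would mean
    # that the upper table entry itself has been reached.
    return math.floor(10 ** places * c / d + 0.5)

if __name__ == "__main__":
    print(further_digits(5403, 5416, 5412))                 # Example 1
    print(further_digits(49996, 50010, 50000))              # Example 2
    print(further_digits(2409235, 2409484, 2409357, 2))     # Example 3
```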


Negative characteristics are often indicated by placing a bar (for the minus sign) on top, and not
in front, of the number before the decimal point. For example, from lg 2 = 0.3010 one obtains
lg 1/2 = lg 1 − lg 2 = −0.3010; this number can be written as 0.6990 − 1 or 1̄.6990, and the same
result is obtained by starting from lg 5 = 0.6990, hence lg 0.5 = 0.6990 − 1 = 1̄.6990. Another
method, which is sometimes used in tables of logarithms of trigonometric functions, avoids negative
characteristics by imagining a characteristic of −10 placed behind the logarithm. In this notation
lg 0.5 = 0.6990 − 1 = 9.6990 − 10, and the table then only gives the entry 9.6990.

Example 1: lg 0.723 = 0.8591 − 1 = 1̄.8591 = 9.8591 − 10, written as 9.8591.
Example 2: lg 0.00723 = 0.8591 − 3 = 3̄.8591 = 7.8591 − 10, written as 7.8591.


Example 3: To compute the value of n = lg 2/lg 1.03. A 4-figure table gives lg 2 = 0.3010 and
lg 1.03 = 0.0128. To find the quotient n = 0.3010/0.0128 one takes logarithms: lg n = lg 0.3010 − lg 0.0128
= (0.4786 − 1) − (0.1072 − 2) = 0.3714 + 1 = 1.3714. The number for this logarithm is
n = 23.52. In the bar notation the calculation is: lg n = 1̄.4786 − 2̄.1072 = 1.3714.

Example 4: To compute the value of ln 2. The formula for the transition from the common
to the natural logarithms is:
ln 2 = (lg 2)/(lg e) = 0.3010/M_10 = 0.3010/0.4343. By taking logarithms one obtains: lg (ln 2)
= lg 0.3010 − lg 0.4343 = 1̄.4786 − 1̄.6378 = 1̄.8408. The number for this logarithm is ln 2 = 0.6931.




Graph of the function y = lg x. The tables give the function values y for every positive real
number x. By plotting these values in a Cartesian coordinate system one obtains the curve of the
function y = lg x (Fig.). It meets the x-axis in the point x_0 = 1, y_0 = lg 1 = 0. For decreasing x-values
from 1 to 0 the curve falls rapidly and for x → 0 it approaches asymptotically the negative y-axis.


2.2-5 Graph of the function y = lg x


For increasing x-values from 1 to 10 the function grows monotonically from 0 to 1. Since
dy/dx = 1/(x ln 10) = M_10/x = 0.4343/x, the angle α between a tangent to the curve and the
x-axis is small, for tan α = dy/dx decreases from 0.4343 for x = 1 to 0.043 43 for x = 10. Since
the characteristic of the numbers from 10 to 100 is 1, the function grows in this interval of ten times
the length by exactly the same amount as between 1 and 10, namely by Δy = 1. The same applies
for the interval from x = 1000 to x = 10000. One sees by how little the curve differs from a straight
line, which justifies linear interpolation in this case. The values given directly lie for a 4-figure
table between 100 and 1000 and for a 7-figure table even between 10000 and 100 000.


Worked examples. The numerical expressions which follow are built up in various ways. If they 
contain only higher arithmetical operations, they can be calculated logarithmically throughout, 
that is, the only number to be looked up from its logarithm is the final result. But if they contain 
sums or differences, then intermediate calculations are necessary to determine the individual sum- 
mands and then their sums. Here it can happen that there occurs the logarithm of a number that 
is to be subtracted. Now logarithms of negative numbers do not exist in the real field. Therefore 
one often uses the device, especially in trigonometric calculations, of indicating by the symbol n
(negative) or p (positive) the operation to be applied to a number. The usual sign rules for
multiplication have to be modified accordingly: (+) · (−) = (−) now becomes p + n = n, and (−) · (−) = (+)
becomes n + n = p etc.

The characteristic and the mantissa are determined in different ways, but only when taken together, 
do they give the value of the logarithm. The operations to be performed on the logarithm are applied 
equally to the digits of the characteristic and of the mantissa. For example, if the logarithm of a 
number x has the value 5.6, lg x = 5.6, then half the logarithm has the value 2.8 so that 1/2 lg x
= lg √x = 2.8, and similarly 3 lg x = lg x^3 = 16.8.

Care must be taken in applying these obvious rules to logarithms of positive real numbers less 
than 1. These logarithms have negative values, and the notation differs somewhat from the common 
usage. On the numerical line, instead of going leftward from zero directly to the required number
one goes in zig-zag fashion beyond it to a negative integer and then again to the right up to the
number (Fig.); for example, −0.35 = −1 + 0.65 or −1.62 = −2 + 0.38. This procedure is
motivated by the occurrence of powers of 10 in the decimal expansion of the number in question;
it leads to various representations of one and the same number, for example, −0.35 = −2 + 1.65
= −10 + 9.65; −1.62 = −5 + 3.38 = −20 + 18.38 etc. The mantissae in these examples are
65 and 38; the characteristics consist of two integers: one positive or zero, and the other, usually
in second place, negative. In the examples above they are (0. ... −1) = (1. ... −2) = (9. ... −10)
and (0. ... −2) = (3. ... −5) = (18. ... −20). Multiplication, division, and exponentiation of
numbers, that is, addition, subtraction, and multiplication of their logarithms, presents no difficulties
if one bears in mind that in looking up a number one starts out from the next negative integer 
below it. In extracting roots, that is, dividing logarithms, one has to arrange that the characteristic 


2.2-6 Logarithms with 
negative characteristics 




is divisible by the exponent of the root so that after division it is again an integer. This can always 
be done. 

Similar remarks apply to the other notation for negative values of logarithms: instead of 0. ... −k
one writes k̄. ... and performs the arithmetical operations in the usual way, bearing in mind that k̄
stands for −k.


Example 1: x = ³√100;
lg x = 1/3 lg 100 = 1/3 · 2 = 2/3 = 0.6667,
x = 4.642.

Example 2: x = √0.2;
lg x = 1/2 lg 0.2 = 1/2 (0.3010 − 1)
= 1/2 (1.3010 − 2) = 0.6505 − 1
or lg x = 1/2 (9.3010 − 10)
= 1/2 (19.3010 − 20)
= 9.6505 − 10,
x = 0.4472.

Example 3: x = ³√(1 − (0.927)^5);
lg 0.927 = 1̄.9671,
5 lg 0.927 = 1̄.8355,
(0.927)^5 = 0.6847,
1 − 0.6847 = 0.3153,
lg 0.3153 = 1̄.4987,
1/3 lg 0.3153 = 1̄.8329,
x = ³√0.3153 = 0.6807.

Example 6: x = 56.07 : 992.6 = a : b;
lg x = lg a − lg b.

Example 7: x = a : b;
lg x = lg a − lg b (scheme: 0.5134 + 1 = 1.5134).

Example 8: x = (0.07440)^5.


N 


Example 4: x = 160.5 · 0.3856 · 0.006938 = a · b · c;
lg x = lg a + lg b + lg c.
In the computing scheme the numbers N are separated from the logarithms lg by a vertical line.

Scheme for Example 8: lg 0.07440 = 0.8716 − 2; multiplication by 5 gives 4.3580 − 10.

Example 5: x = a/b;
lg x = lg a − lg b.

Example 9: x = a^n;
lg x = n lg a,
lg (lg x) = lg n + lg (lg a).

Example 10: x = ⁵√0.09024 = ⁵√a;
lg x = 1/5 lg a.
0.075 35


Example 11: x = ⁶√( (89.49)^3.5 · √0.006 006 / ((0.000 010 01)^2 · (3 601 000)^4) ) = ⁶√( a^3.5 · √b / (c^2 · d^4) );
lg x = 1/6 (3.5 lg a + 1/2 lg b − 2 lg c − 4 lg d),
x = 0.017 74.

N               lg             operation    result
89.49           1.9517         · 3.5        6.8310
0.006 006       1.7786 − 4     : 2          0.8893 − 2
0.000 010 01    0.0004 − 5     · 2          0.0008 − 10
3 601 000       6.5564         · 4          26.2256

Adding the first two results and subtracting the last two gives 1.4939 − 12; division by 6 gives
lg x = 0.2490 − 2 and hence x = 0.017 74.


Example 12: x = - 


1.1558 
| 0.5310 


- —e 2.4592n 


1.9068 
0.9872 —1 = 
ea eee —2 


x — 354.8 + 2.5499n (1.5499 +1)n 


Historical remarks. The origin of logarithmic calculations shows clearly the connections between 
the development of society and that of mathematics. The discovery of the sea route to India is 
closely linked with a flourishing period in astronomy, navigation, and trigonometry. Mathematics 
was an indispensable tool of the navigators. With the spreading of trade the commercial methods
of calculation grew in importance, in this context mainly the calculus of compound interest. In both 
these spheres the demands made on professional calculators were extraordinary for the time; one 
should visualize the amount of computations which the astronomer Johannes KEPLER (1571-1630) 
needed to derive the laws named after him. The leading calculators were searching for simplifying 
methods, in particular, for a link between arithmetic and geometric numerical sequences which 


would replace multiplication by addition. 




The mathematicians Paul WITTICH (1555-1587) and Christoph CLAVIUS (1537-1612) proposed
in their book 'de Astrolabio', which appeared in 1593, to reduce the multiplication of two positive
numbers a and b less than 1 to an addition, by regarding them as the values of trigonometric
functions, a = sin α and b = cos β. By the addition theorems (see Chapter 10.):

sin (α + β) = sin α cos β + sin β cos α
sin (α − β) = sin α cos β − sin β cos α

1/2 [sin (α + β) + sin (α − β)] = sin α cos β = a · b.

For a = 0.61566 and b = 0.93969 they obtained α = 38° and β = 20°, hence α + β = 58° and
α − β = 18°, consequently a · b = 1/2 [0.84805 + 0.30902] = 1/2 · 1.15707 and 0.61566 · 0.93969
= 0.578 54.
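The reduction of a multiplication to an addition by means of the addition theorems can be retraced as follows. The Python sketch is an illustration only; the function name is an assumption, and the library functions asin, acos and sin take the place of the trigonometric tables.

```python
import math

def product_by_addition_theorems(a, b):
    """Product of two numbers a, b between 0 and 1 via a = sin(alpha), b = cos(beta)."""
    alpha = math.asin(a)              # angle whose sine is a
    beta = math.acos(b)               # angle whose cosine is b
    return 0.5 * (math.sin(alpha + beta) + math.sin(alpha - beta))

if __name__ == "__main__":
    a, b = 0.61566, 0.93969
    print(math.degrees(math.asin(a)), math.degrees(math.acos(b)))
    print(product_by_addition_theorems(a, b), a * b)
```

With table values rounded to five places the result agrees with the direct product to about four places.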

The mathematician Simon STEVIN (1548-1620), who at one time was Quartermaster General
of the Dutch army, advocated the Hindu-Arabic positional notation, in particular, the decimal
notation for fractions. He worked on tables for the calculation of compound interest, which were
continued by Jost BÜRGI (1552-1632) and published by him in Prague in 1620 as ‘Arithmetische
und geometrische Progresstabuln’. BÜRGI started out from the basis 1.0001 = 1 + 1/10000, the powers
of which are easy to compute because, in modern language, even a few terms of the binomial
expansion give the required accuracy of 10 places postulated by BÜRGI. The 10000th power
(1 + 1/10000)^10000 of his basis is 2.71814, close to the number e = 2.7182818..., which is defined
as the limit of (1 + 1/n)^n for n → ∞. The exponents divided by 10000 are therefore approximately
equal to the natural logarithms of the powers. The tables contain in the entry column the logarithms,
and in place of the mantissae of present-day tables there are the numbers themselves (antilogarithm
tables).
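
The closeness of the basis 1.0001 to a system of natural logarithms can be illustrated by a short computation. The Python sketch below is a modern illustration under stated assumptions: the names `BASIS` and `exponent_reaching` are hypothetical, and plain repeated multiplication is used in place of the binomial shortcut mentioned above.

```python
import math

BASIS = 1.0001  # the basis 1 + 1/10000 named in the text


def exponent_reaching(target: float, basis: float = BASIS) -> int:
    """Largest exponent k for which basis**k does not exceed target."""
    value, k = 1.0, 0
    while value * basis <= target:
        value *= basis
        k += 1
    return k


ten_thousandth_power = BASIS ** 10000
k_for_two = exponent_reaching(2.0)

print(ten_thousandth_power, math.e)      # the 10000th power lies close to e
print(k_for_two / 10000, math.log(2.0))  # the scaled exponent lies close to ln 2
```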

Even the distinguished mathematician John NAPIER (or NEPER), eighth laird of Merchiston
(1550-1617), only partially achieved his aim in his work ‘Mirifici logarithmorum canonis descriptio’,
published in 1614. The function he chose was, in modern notation, y = g e^(−x/g) with g = 1000000.
For the sum x = x₁ + x₂ of two exponents this leads to y = g e^(−x₁/g) · e^(−x₂/g) = y₁ y₂/g. Together
with Henry BRIGGS (1556-1630), Professor of mathematics in London, who admired NAPIER, he
decided in favour of the function y = 10^x. After NAPIER’s death BRIGGS continued the calculations;
his ‘Arithmetica logarithmica’, which was published in 1624, contains the 14-figure logarithms
of the numbers from 1 to 20000 and from 90000 to 100000. The missing logarithms were calculated
by the surveyors Ezechiel DE DECKER and Adrian VLACQ, whose first complete logarithmic table
appeared in 1627.

To convert a table of antilogarithms into one of logarithms, BRIGGS made use of the relationship
between an arithmetic progression and its associated geometric progression in which to the
arithmetic mean (a₁ + a₂)/2 of two numbers a₁ and a₂ there corresponds the geometric mean √(g₁ g₂)
of the corresponding quantities g₁ and g₂. For example, in the sequence of powers to the basis 3
to the number 2.5 = (2 + 3)/2 there corresponds the value √(9 · 27) = 9√3, because
3^((2+3)/2) = √(3² · 3³) = √(9 · 27).

Choosing the basis 10 and denoting the terms of the arithmetic progression as logarithms by lᵢ and
the values of the powers by nᵢ, one obtains in succession for the intervals 0 < lᵢ < 1 and 1 < nᵢ < 10,
respectively:

l₁ = 1/2 (0 + 1)   = 0.5           n₁ = √10        = 3.162277...
l₂ = 1/2 (l₁ + 1)  = 0.75          n₂ = √(n₁ · 10) = 5.623413...
l₃ = 1/2 (l₁ + l₂) = 0.625         n₃ = √(n₁ · n₂) = 4.216964...
l₄ = 1/2 (l₂ + l₃) = 0.6875        n₄ = √(n₂ · n₃) = 4.869674...
l₅ = 1/2 (l₂ + l₄) = 0.71875       n₅ = √(n₂ · n₄) = 5.232991...
l₆ = 1/2 (l₄ + l₅) = 0.703125      n₆ = √(n₄ · n₅) = 5.048065...
l₇ = 1/2 (l₄ + l₆) = 0.6953125     n₇ = √(n₄ · n₆) = 4.958067...
l₈ = 1/2 (l₆ + l₇) = 0.69921875    n₈ = √(n₆ · n₇) = 5.002865...


The eight steps listed yield lg 5.002865 = 0.69921875, and the method shows how by a repeated 
extraction of square roots the value of lg 5 can finally be computed with an arbitrary accuracy. 
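
The method of repeated square roots amounts to a bisection in which arithmetic means of logarithms are paired with geometric means of numbers. A minimal Python sketch follows; the function name `lg_by_square_roots` and its parameters are hypothetical and chosen only for this illustration.

```python
import math


def lg_by_square_roots(x: float, steps: int) -> float:
    """Approximate lg x for 1 < x < 10 by pairing arithmetic and geometric means."""
    low_log, high_log = 0.0, 1.0    # logarithms (arithmetic progression)
    low_num, high_num = 1.0, 10.0   # numbers (geometric progression)
    mid_log = 0.5
    for _ in range(steps):
        mid_log = (low_log + high_log) / 2
        mid_num = math.sqrt(low_num * high_num)
        # Keep the half interval that still contains x.
        if mid_num < x:
            low_log, low_num = mid_log, mid_num
        else:
            high_log, high_num = mid_log, mid_num
    return mid_log


print(lg_by_square_roots(5.0, 8))    # the eight steps of the table
print(lg_by_square_roots(5.0, 40), math.log10(5.0))
```

With eight steps the function retraces the eight lines of the table; further steps narrow the interval around lg 5.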


The logarithmic slide rule 


Historical remarks. As early as the second decade of the 17th century Edmund GUNTER (1581-1626)
indicated the principle of logarithmic calculations along a graduated straight line. On his scales
multiplication and division were performed as addition and subtraction of lengths by means of a
pair of dividers. Shortly afterwards William OUGHTRED (1574-1660) used two of GUNTER’s lines
sliding along each other. This made the use of dividers unnecessary. His lines were made both in
straight and in circular form. Around the middle of the 17th century Edmund WINGATE (1593-1656)
and Seth PARTRIDGE used a rule sliding between parts of a fixed stock, that is, an instrument similar
to the present slide rule. It acquired its final shape in the course of the 19th century. Industrial
mass production of slide rules began towards the end of the 19th century. Slide rules for special
purposes, for example, for electricians and tradesmen, followed shortly afterwards.


The structure of a logarithmic slide rule. The following exposition refers to a system of slide rule 
in common use among scientists and engineers (Fig.). Other systems are arranged in slightly dif- 
ferent ways and require some minor modifications in technique. As a rule, the length of the scales 
of a slide rule is 25 cm. 

2.2-7 The logarithmic slide rule (labelled parts: rule, slide, cursor or runner)

A slide rule consists of three parts: the rule, the slide, and the cursor or runner. The rule contains
the scales marked A, D and K (and often also other scales). The slide moves in it in grooves. The
front of the slide carries the scales marked C, R or CI, and B, and the back the scales of the
trigonometric functions or others. The cursor with one or three vertical lines moves across rule and
slide. As a rule, one uses the middle vertical; the others play a role in the calculation of circular areas
and cylinder volumes.


2.2-8 Logarithmic scale below and linear or ordinary scale above


The scales of a slide rule. The scales of a slide rule are related to the function y = lg x. In these 
logarithmic scales numbers are placed at points whose distance from the origin is proportional 
to their logarithm. The basic logarithmic scales C and D refer to numbers from 1 to 10. The distance 
from the origin for a particular point on the scale is obtained by multiplying the length of the scale 
(250 mm) by the appropriate logarithm. For example, the distance from the initial point 1 to the
point 2 is lg 2 · 250 mm = 75.3 mm. The uneven growth of the logarithms implies an uneven
growth of the logarithmic scale (Fig.). One sees that as the numbers increase, their distances become 
smaller. 
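
The position of each number on the basic scale follows directly from this rule. The Python sketch below tabulates the distances for the whole numbers from 1 to 10; the names `SCALE_LENGTH_MM` and `position_mm` are hypothetical, and the scale length of 250 mm is the one assumed in the text.

```python
import math

SCALE_LENGTH_MM = 250.0  # length of the scales C and D assumed in the text


def position_mm(x: float, scale_length_mm: float = SCALE_LENGTH_MM) -> float:
    """Distance of the mark x from the initial point 1 on a scale running from 1 to 10."""
    if not 1.0 <= x <= 10.0:
        raise ValueError("x must lie between 1 and 10")
    return scale_length_mm * math.log10(x)


for x in range(1, 11):
    print(x, round(position_mm(x), 1))
```

The differences between successive rows shrink as x grows, which is the uneven growth of the scale described above.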

The scales A and B are also logarithmic. They consist of two parts of equal length. The distance 
between the numbers 1 and 10 is half as much as on the scales C and D. Hence a segment on the 
scale D of length lg n corresponds on the scale B to a segment of length 2 lg n = lg n². This means
that the numbers on the scales A and B are the squares of those on the scales C and D below them. 

Similarly, the scale K, which consists of three parts of equal lengths, gives the cubes of the numbers 
on the scale D. 

The scale L, often at the bottom of the rule or in the middle of the back of the slide, is not logarith- 
mic and gives directly the mantissa of the common logarithm of the number standing above it. 

The scale CI or R contains the reciprocals of the numbers on the scale D. It has the same graduation,
but in the reverse order from right to left and gives for every value x on the scale D the
reciprocal 1/x.

Reading and setting. One can read three digit numbers on a slide rule. The first two digits present
hardly any difficulties, not more than reading numbers off a ruler. The third digit is a different
matter. The uneven scales necessitate uneven divisions.

2.2-9 The various subdivisions of a slide rule

If one investigates the scale D, one recognizes three sections: the first section lies between 1.00 
and 2.00. Between two neighbouring two-digit numbers there are 9 verticals; for example, 1.00 
is followed by 1.01, 1.02, ... The subdivision is into ten parts each (Fig.). 

In the second section, between 2 and 4, the division is into five parts each. Between two neigh- 
bouring verticals for two-digit numbers there are 4 further verticals whose distance corresponds 
to two units in the third place. For example, 2.00 is followed by 2.02, 2.04, ... Since the distance 
between numbers becomes smaller and smaller towards the right, in the third section, between 4 
and 10, only a division into two parts is possible, and the distance between the verticals corresponds 
to five units of the third digit. 

Reading and setting proceed with the help of the middle vertical on the cursor. This is placed 
exactly at the number needed for a calculation or just calculated. Further details can be found in 
the subsequent instructions for calculations with the slide rule. If one comes to a setting in between 
two vertical lines, one has to resort to an estimate. 
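
The three sections of the scale D and the resulting need for estimation can be modelled in a few lines. The Python sketch below is only an illustration: the function names `tick_step` and `nearest_tick` are hypothetical, and the section boundaries 2 and 4 are those given in the text for the system of slide rule described here.

```python
def tick_step(x: float) -> float:
    """Spacing of neighbouring verticals on the scale D in the section containing x."""
    if 1.0 <= x < 2.0:
        return 0.01   # first section: division into ten parts
    if 2.0 <= x < 4.0:
        return 0.02   # second section: division into five parts
    if 4.0 <= x <= 10.0:
        return 0.05   # third section: division into two parts
    raise ValueError("x must lie between 1 and 10")


def nearest_tick(x: float) -> float:
    """Engraved vertical closest to x; values in between have to be estimated by eye."""
    step = tick_step(x)
    return round(round(x / step) * step, 2)


print(nearest_tick(1.234), nearest_tick(2.84), nearest_tick(4.57))
```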


Calculations with the slide rule. Just as every logarithmic computation involves the determination 
of the characteristic, so every slide rule computation requires a rough estimate or an approximate 
calculation in powers of 10, to determine the number of digits before the decimal point. A rough 
calculation is advisable, because it gives not only the right number of places, but also a certain 
control of the result. Calculations with the slide rule are geometric additions and subtractions of 
segments. By using the scales A, B and K one can double or treble segments, or divide them into 
two or three equal parts. Here are a few examples.

Multiplication. This is based on the law lg (a · b) = lg a + lg b. The addition of the two logarithms
is carried out on the slide rule as addition of two segments of length lg a and lg b. The initial
point 1 of the slide scale C is placed at the point a of the rule scale D. Then the cursor is placed
at the point b of the scale C and underneath the product a · b is read off on the scale D (Fig.).

Example 1: To calculate 2 · 1.5. One places the 1 on the scale C above 2 on the scale D. Under
1.5 on C one reads off the product 3 on the scale D (Fig.).

2.2-10 Illustration of multiplication and division


Example 2: To calculate 2.84 · 4.55. Rough estimate: 3 · 4 = 12. This alone shows that the result
can no longer be read off the scale D. One proceeds as follows: on the scale D one sets 2.84.
Shifting the slide to the left one sets the end point 10 of the scale C above the point 2.84. One
places the cursor at the point 4.55 of the scale C and underneath one reads off on the scale D the
value of the product 2.84 · 4.55 = 12.9. An explanation of this method of multiplication can be
obtained by carrying out the same calculation on the two left sections of the scales A and B. The
result then falls into the right section.

2.2-11 Example of multiplication by slide rule: 2 · 1.5 = 3

For a multiplication the 1 or 10 of the slide scale is placed at one of the factors and the result
is read off at the other factor.
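
The choice between the initial point 1 and the end point 10 of the slide can be expressed as a small computation with segment lengths. The Python sketch below is a model of the procedure, not a description of any particular instrument; the function name `slide_rule_multiply` is hypothetical, and both factors are assumed to be given as scale values between 1 and 10.

```python
import math


def slide_rule_multiply(a: float, b: float):
    """Return (reading on D between 1 and 10, index of C placed at a) for scale values a and b."""
    total = math.log10(a) + math.log10(b)  # addition of the segments lg a and lg b
    if total < 1.0:
        return 10 ** total, 1              # product still lies on D: use the initial point 1
    return 10 ** (total - 1.0), 10         # product falls off D: use the end point 10


reading, index = slide_rule_multiply(2.84, 4.55)
print(reading, index)
```

The reading gives only the sequence of digits; the position of the decimal point follows from the rough estimate, here 3 · 4 = 12.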




Division. The relevant law is lg (a/b) = lg a − lg b. On the slide rule this means that the segment
of length lg b has to be subtracted from the segment of length lg a. Above the point a of the scale D
on the rule one places the point b of the scale C on the slide. Then one reads off the quotient a/b
on the scale D under the initial point 1 or the end-point 10 of the scale C on the slide.




Example 1: To calculate 88.5 : 0.515. Rough estimate: 90 : 0.5 = 180. Above the point 88.5 on D
one places the point 0.515 of C. Under the initial point 1 of C one finds on D the quotient
172 = 88.5 : 0.515.

Example 2: To calculate 19.2 : 89. Rough estimate: 20 : 90 = 0.2. One proceeds just as before,
but the quotient is read off under the end-point 10 of the scale C. Solution: 19.2 : 89 = 0.216 (Fig.).

For a division the divisor on the slide scale is placed at the dividend on the rule scale, and the
result is read off at 1 or 10 of the slide scale.
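
Division, together with the placing of the decimal point by a rough estimate, can be modelled in the same way. In the Python sketch below all function names (`mantissa`, `slide_rule_divide`, `place_decimal_point`) are hypothetical; the sketch only mirrors the subtraction of segments and is not meant as a faithful simulation of a physical rule.

```python
import math


def mantissa(x: float) -> float:
    """Sequence of digits of x as a scale value between 1 and 10."""
    return x / 10 ** math.floor(math.log10(x))


def slide_rule_divide(a: float, b: float):
    """Return (reading on D between 1 and 10, index of C under which it is read)."""
    difference = math.log10(mantissa(a)) - math.log10(mantissa(b))
    if difference >= 0.0:
        return 10 ** difference, 1        # read under the initial point 1
    return 10 ** (difference + 1.0), 10   # read under the end-point 10


def place_decimal_point(reading: float, rough_estimate: float) -> float:
    """Scale the reading by the power of 10 that brings it closest to the estimate."""
    exponent = round(math.log10(rough_estimate / reading))
    return reading * 10.0 ** exponent


reading_1, index_1 = slide_rule_divide(88.5, 0.515)
reading_2, index_2 = slide_rule_divide(19.2, 89)
print(place_decimal_point(reading_1, 180), index_1)
print(place_decimal_point(reading_2, 0.2), index_2)
```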


To calculate expressions of the form (a₁ · a₂ ...)/(b₁ · b₂ ...) one divides and multiplies alternately so
as to have the least possible number of shifts of the slide. In calculating a proportion of the form
y = a · c/b one first divides a by b and then multiplies by c. This requires a single setting of the
slide, whereas the multiplication a · c followed by division by b necessitates two settings of the slide.
In calculating y according to the method proposed it can happen that the setting of c on the slide
falls beyond the division of the rule. Then a second shift of the slide becomes necessary, and
consequently an additional small inaccuracy. Therefore it is preferable to form such expressions by
means of the scale of squares.

Calculations with the scale of reciprocals. The scale R or CI of reciprocals gives the reciprocal 
for every value on the scales C or D; for example, above 4 on C one finds the reciprocal 0.25. This 
scale can be used in multiplications and divisions. It is to be observed that a · b = a : b⁻¹. Instead
of multiplying one has to divide by the reciprocal.

Example: To calculate 4.8 · 3.6. Rough estimate: 5 · 3 = 15. Above 4.8 on D one places 3.6
on R. Below the initial point 1 of R one reads off on D the result 4.8 · 3.6 = 17.28.


Squares and roots. As remarked above, the numbers on the scales A and B are the squares of the
corresponding numbers on D and C. To find the square of a number a one need only place the
vertical of the cursor to the number a on D and read off above it the number a² on the scale A
(Fig.). In extracting a square root one reverses the procedure: one sets the vertical of the cursor
at the radicand b on scale A and reads off √b below it on the scale D. In extracting roots one has to
observe the correct setting of b on the scale of squares:

√25 = 5         Setting of 25 between 10 and 100.
√250 = 15.81    Setting of 2.5 on the section between 1 and 10, because √250 = √(2.5 · 100)
                = 10 · √2.5.

Radicands between 1 and 100 are set directly, other radicands are brought to a value between 1
and 100 after multiplication by a suitable power of 100.

2.2-12 Division by slide rule: 19.2 : 89 = 0.216
2.2-13 Example of squaring: 4.5² = 20.25
2.2-14 Position of slide rule for the calculation of √(1 + (b/a)²)
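
The reduction of a radicand to a value between 1 and 100 can be written out explicitly. In the Python sketch below the function name `sqrt_setting` is hypothetical; it returns the value to be set on the scale of squares and the power of 10 by which the root read off on D has to be multiplied.

```python
import math


def sqrt_setting(radicand: float):
    """Return (setting between 1 and 100 on the scale of squares, factor for the root)."""
    if radicand <= 0:
        raise ValueError("radicand must be positive")
    shift = math.floor(math.log10(radicand) / 2)  # number of factors 100 to split off
    return radicand / 100 ** shift, 10 ** shift


setting, factor = sqrt_setting(250.0)
print(setting, factor, factor * math.sqrt(setting))
```

For the radicand 250 the setting is 2.5 on the section between 1 and 10 and the factor is 10, in agreement with the example above.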

Cubes and cube roots. On the scale K one finds the cubes of the numbers of the scale D. The 
calculation of cubes and cube roots follows the same pattern as that of squares and square roots. 




Calculation of c = √(a² + b²). These expressions are of frequent occurrence. It may be assumed
that b > a. The calculation proceeds according to the rule

c = a √[1 + (b/a)²].

One takes the following steps: calculation of b/a by means of scales C and D, reading off (b/a)²
on the scale of squares, mental addition of 1 and setting of 1 + (b/a)² on the scale of squares, reading
off √[1 + (b/a)²] on D, and finally multiplication by a. The figure illustrates the procedure (Fig.).
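
The sequence of steps can be retraced one by one. The Python sketch below uses exact arithmetic in place of scale readings; the function name `hypotenuse_by_steps` is hypothetical, and each line is annotated with the scale on which the corresponding step would be carried out.

```python
import math


def hypotenuse_by_steps(a: float, b: float) -> float:
    """Compute c = sqrt(a^2 + b^2) in the order of the slide rule procedure."""
    ratio = b / a                  # division on the scales C and D
    ratio_squared = ratio ** 2     # read off on the scale of squares
    radicand = 1 + ratio_squared   # mental addition of 1
    root = math.sqrt(radicand)     # set on the scale of squares, read off on D
    return a * root                # final multiplication by a


print(hypotenuse_by_steps(3.0, 4.0))
```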

Trigonometric, Pythagorean, and exponential scales. As a rule, the back of the slide contains
further scales: a sine and tangent scale for small angles from 34.5’ to 6° for which the values of the
sines and tangents are approximately equal; a sine scale from 5° 45’ to 90°; a tangent scale, which
can be used for calculating tangents and cotangents. This scale goes from left to right for angles
from 5° 45’ to 45° and in the reverse direction from 45° to 84° 15’. The Pythagorean scale records the
values of √(1 − x²) and permits the calculation of the cosine values. Exponential scales require
further explanations, which would take up too much space here. They are useful for a number of
specialized computations. Only the basic principles of the slide rule have been described here. Modern
refinements provide for more sophisticated numerical calculations.

Accuracy in calculations. The applications of the slide rule are limited only by the accuracy of 
the result that can be achieved. For a scale length of 25 cm the error of a careful reading can be 
assumed to be around 0.1%. For scales of length 12.5 cm the error in reading is doubled. Several 
settings in the course of a calculation increase the mean error according to the Gaussian law of 
propagation of errors. For four settings it amounts to about 0.2%. Even in simple calculations one 
should make several repetitions and take the average of the results obtained. 
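
The figures quoted can be reproduced under a simple assumption, namely that the errors of the individual settings are independent and of equal size, so that they combine as the square root of the sum of their squares. The Python sketch below rests on this assumption; the function names are hypothetical.

```python
import math


def mean_error_percent(settings: int, single_reading_percent: float = 0.1) -> float:
    """Mean relative error after a number of independent settings of equal accuracy."""
    if settings < 1:
        raise ValueError("at least one setting is required")
    return single_reading_percent * math.sqrt(settings)


def error_of_average_percent(single_result_percent: float, repetitions: int) -> float:
    """Mean error of the average of independently repeated calculations."""
    return single_result_percent / math.sqrt(repetitions)


print(mean_error_percent(4))                                # four settings, 25 cm scale
print(mean_error_percent(4, single_reading_percent=0.2))    # four settings, 12.5 cm scale
print(error_of_average_percent(mean_error_percent(4), 4))   # average of four repetitions
```

Under the same assumption, averaging several repetitions reduces the mean error by the square root of their number, which is the reason for the recommendation above.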

Calculating discs. In these instruments the carrier of the logarithmic scale is not a straight line 
but a circle. In calculating wheels these scales are attached to the faces of concentric discs which 
slide against one another. This special form of a slide rule has the advantage that even for an instru- 
ment of small dimensions the lengths of the scales are comparatively large and that shifting of the 
slide can be avoided because all parts of one scale are opposite parts of another. These advantages 
lead to an increased accuracy. 

Since an increase in the lengths of the scales leads to greater accuracy, the scales of calculating 
frames are split into smaller parts and arranged parallel to each other in a plane. In calculating 
cylinders the parts of the scales are arranged along the surface of a cylinder. Calculating cylinders 
have been constructed by means of which one can achieve the same accuracy as in calculations with 
a 5-figure logarithmic table. 

All these traditional calculating aids have been superseded by modern computing devices.
The slide rule has been almost completely replaced by the pocket calculator. For particular
purposes, for example, for use in electronics, hydraulics, building in concrete, surveying,
navigation, or optics, specialized software or even specialized computers are available.


3. Development of the number system 


3.1. The natural numbers N
3.2. Absolute rational numbers. Fractions
3.3. The rational numbers Q
3.4. The integers Z
3.5. The real numbers R
3.6. Continued fractions
3.7. The complex numbers C


There is no occupation in which a man would not have to perform simple calculations. The 
numbers used in such tasks are a means of grasping and mastering the world quantitatively. But
statements of general validity can be based on arithmetical operations only when the laws under- 
lying the calculations are known. The accumulated experience of millionfold applications in the 
historical development of mankind is laid down in computational rules that can be reduced to a 
few logical fundamental concepts and axioms. The foundations for this deductive build-up of the 
number system are the theory of sets and mathematical logic. On this basis first the natural numbers, 
and then the further domains of numbers, are constructed. 


3.1. The natural numbers N 


The natural numbers have their origin in a fusion of the cardinal or counting numbers with the 
ordinal or place numbers. But in what follows, no distinction will be made between cardinal and 
ordinal numbers. 




The Peano axioms. As early as 1891 PEANO showed that all the properties of the natural numbers 
can be derived from five axioms, which now bear his name. 


Peano's axiom system

1. 0 is a number.
2. Every number n has precisely one successor n’.
3. 0 is not the successor of any number.
4. Distinct numbers have distinct successors.
5. If a set of natural numbers contains the number 0 and contains, together with any
number, also its successor, then it is the set of all natural numbers.


According to these axioms every natural number other than 0 can be described as a successor, 
or as a successor of a successor of ... of a successor, of 0. But instead of the notation 0, 0’, 0”, ... 
one uses simply the decadic positional system and denotes the numbers by 0, 1, 2, ... In particular, 
Axiom 5 gives the justification for the method of mathematical induction. 


Mathematical induction. This method of reasoning is used to prove that a proposition P(n)
concerning natural numbers n is valid for all natural numbers; to give but one example: the equality
1 + 2 + 4 + ... + 2^n = 2^(n+1) − 1 holds for all natural numbers. In using mathematical induction
as a method of proof one considers the set M of all natural numbers for which P(n) is true. If one
can show 1) that 0 ∈ M, in words, that the proposition is valid for n = 0 (basis of induction), and 2)
that from n ∈ M it follows that n’ ∈ M, in words, that the validity of the proposition for any natural
number n (inductive hypothesis) implies its validity for the successor n’ (inductive step), then by
Peano’s fifth axiom M must be the set of all natural numbers. In the example above, P(0) is evidently
true (1 = 2 − 1), and if the proposition is assumed to be valid for a natural number n, then it
follows that 1 + 2 + 4 + ... + 2^n + 2^(n+1) = 2^(n+1) − 1 + 2^(n+1) = 2^(n+1) (1 + 1) − 1 = 2^(n+2) − 1, in
other words, its validity for the natural number n + 1. Consequently the proposition is valid for all
natural numbers.
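
A numerical check cannot replace the proof, but it makes the two parts of the argument tangible. The Python sketch below tests the equality itself and the algebraic identity used in the inductive step for the first few hundred natural numbers; the function names are hypothetical.

```python
def equality_holds(n: int) -> bool:
    """Check 1 + 2 + 4 + ... + 2^n = 2^(n+1) - 1 for one value of n."""
    return sum(2 ** k for k in range(n + 1)) == 2 ** (n + 1) - 1


def inductive_step_holds(n: int) -> bool:
    """Check that adding 2^(n+1) to 2^(n+1) - 1 gives 2^(n+2) - 1."""
    return (2 ** (n + 1) - 1) + 2 ** (n + 1) == 2 ** (n + 2) - 1


checked = all(equality_holds(n) and inductive_step_holds(n) for n in range(200))
print(checked)
```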

Mathematical induction is used not only in proving theorems, as just now, but also in defining 
and constructing mathematical objects. Such definitions and constructions proceed recursively or 
step-by-step, and each step makes use of the previous steps. That this procedure cannot lead to 
logical complications was proved by DEDEKIND in his ‘justification theorem’. 


Calculations with natural numbers N. The addition is defined by m + 0 = 0 + m = m and
m + n’ = (m + n)’, the multiplication by m · 0 = 0 · m = 0 and m · n’ = m · n + m. By these
recursive definitions addition and multiplication are uniquely determined. The rules of calculation
are then derived by mathematical induction.
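
The recursive definitions translate almost literally into a program. In the Python sketch below the built-in integers serve only as names for the natural numbers; the functions `successor`, `add` and `mul` are hypothetical names, and the recursion is meant for small arguments only.

```python
def successor(n: int) -> int:
    """The successor n' of the natural number n."""
    return n + 1


def add(m: int, n: int) -> int:
    """Addition defined by m + 0 = m and m + n' = (m + n)'."""
    if n == 0:
        return m
    return successor(add(m, n - 1))


def mul(m: int, n: int) -> int:
    """Multiplication defined by m * 0 = 0 and m * n' = m * n + m."""
    if n == 0:
        return 0
    return add(mul(m, n - 1), m)


print(add(3, 4), mul(3, 4))
```

The rules of calculation, for example the commutative laws, can be tested on small numbers in this way; their proof for all natural numbers requires mathematical induction.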


The proof for the associative law of addition, for example, proceeds as follows: The statement is
true for c = 1 (basis of induction); indeed, (a + b) + 1 = (a + b) + 0’ = [(a + b) + 0]’ = (a + b)’
= a + b’ = a + (b + 1). Now assume that the statement is true for c = n (inductive hypothesis):
(a + b) + n = a + (b + n). It is required to show that it is then also valid for c = n + 1 (inductive
step). Now (a + b) + n’ = [(a + b) + n]’ = [a + (b + n)]’ = a + (b + n)’ = a + (b + n’). By Peano’s
fifth axiom all natural numbers then have the property expressed in the associative law.

For more than two terms the operations are defined by mathematical induction, for example,
a₁ + a₂ + ... + aₙ = (a₁ + a₂ + ... + aₙ₋₁) + aₙ and a₁ · a₂ · ... · aₙ = (a₁ · a₂ · ... · aₙ₋₁) · aₙ.
In any sum or product brackets can then be inserted or removed arbitrarily, as can again be proved
by mathematical induction.


Subtraction and division of two numbers a and b are defined as inverse operations to addition
and multiplication: If for the given numbers a and b there exists a number x such that a + x = b,
then x = b − a is the difference of b and a; if it exists, it is uniquely determined. If for the given
numbers a ≠ 0 and b there exists a number y such that a · y = b, then y = b/a is the quotient of
b and a; if it exists, it is also uniquely determined. Of course, as a rule, for given natural numbers a
and b such an x or y need not exist, as the examples 3 + x = 2 or 3 · y = 2 show.


Subtraction and division cannot always be carried out within the domain of natural numbers. 


The wish to remove these restrictions gives rise to the formation of new number systems. 

Exponentiation. For numbers a ≠ 0 and n the power a^n is again introduced by a recursive definition:
a^0 = 1 and a^(n’) = a^n · a. The standard laws of powers (see Chapter 2.) hold for the
multiplication and exponentiation of powers, as well as for the division, provided that the quotient exists.
The operation of exponentiation is neither commutative nor associative, as the following examples
show: 3² = 9, but 2³ = 8; (3²)³ = 9³ = 729, but 3^(2³) = 3⁸ = 6561.
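
The recursive definition of the power and the two counter-examples can be retraced directly. In the Python sketch below the function name `power` is hypothetical, and ordinary integer multiplication is taken for granted.

```python
def power(a: int, n: int) -> int:
    """Power defined by a^0 = 1 and a^(n') = a^n * a."""
    if n == 0:
        return 1
    return power(a, n - 1) * a


print(power(3, 2), power(2, 3))                      # not commutative
print(power(power(3, 2), 3), power(3, power(2, 3)))  # not associative
```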




Roots and logarithms. If for given numbers a and n (n ≠ 0, 1) there exists a number b such that
b^n = a, then b is called the nth root of a and is denoted by b = ⁿ√a. If for the numbers a ≠ 0 and b
there exists a number n such that a^n = b, then n is called the logarithm of b to the basis a and is
denoted by n = log_a b; for example, 5 = ³√125, 6 = log₂ 64.
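
Within the natural numbers a root or a logarithm exists only in special cases, and a simple search decides whether it does. The Python sketch below is an illustration; the function names `natural_root` and `natural_log` are hypothetical, and the search for the logarithm is restricted to bases of at least 2 so that it terminates.

```python
def natural_root(a: int, n: int):
    """Return the natural number b with b^n = a, or None if there is none (n >= 1)."""
    if n < 1:
        raise ValueError("the root index must be at least 1")
    b = 0
    while b ** n < a:
        b += 1
    return b if b ** n == a else None


def natural_log(b: int, a: int):
    """Return the natural number n with a^n = b, or None if there is none (a >= 2, b >= 1)."""
    if a < 2 or b < 1:
        raise ValueError("this sketch requires a >= 2 and b >= 1")
    n, p = 0, 1
    while p < b:
        p *= a
        n += 1
    return n if p == b else None


print(natural_root(125, 3), natural_log(64, 2), natural_root(2, 2))
```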


Survey of the arithmetical operations 

First kind: Addition with subtraction as inverse operation, 

second kind: multiplication with division as inverse operation, 

third kind: exponentiation with taking roots and logarithms as inverse operations. 

An order n’ > n between neighbouring numbers is already provided by the successor relation.
For arbitrary natural numbers a and b the relation a > b or b < a is defined to hold if there exists
a natural number c ≠ 0 such that a = b + c. It can be shown that the relation so defined is an
irreflexive order (see Chapter 14.). It satisfies the monotonic law of addition and multiplication as
well as the Archimedean axiom.

Archimedean axiom: For arbitrary a > 0 and b there always exists a number n such that a · n > b.

It can be shown that for any two distinct natural numbers a, b either a > b or b > a. Therefore
the relation > on the domain of natural numbers is a total (or linear) and Archimedean order.
For calculations with natural numbers the usual rules hold; in the following examples it is assumed
that all the operations occurring can be carried out:

a + (b − c) = (a + b) − c = a + b − c = a − c + b,      (a · c) : (b · c) = a : b,
a − (b + c) = (a − b) − c = a − b − c = a − c − b,      (b − c) : a = (b : a) − (c : a),
a − (b − c) = (a − b) + c = a − b + c = a + c − b.

From the properties of order it can be proved that for a > 1 the powers a^n form a sequence of
steadily increasing numbers and that roots and logarithms, if they exist, are uniquely determined.


3.2. Absolute rational numbers. Fractions 


The processes of sharing and of measuring require a system of numbers in which one can carry 
out not only the operations that can be performed on natural numbers, but in addition also unrestrict- 
ed division. The postulate that in an extended system the laws of the old domain remain valid, as 
far as possible, is called the principle of permanence. 


Construction of the extended system. One forms ordered pairs of natural numbers m and n ≠ 0
and writes them like fractions m/n. Two such symbols m/n and p/q are said to be equal, written
as m/n = p/q, if mq = np, for example, 3/1 = 6/2 because 3 · 2 = 6 · 1. This equality satisfies the
conditions for an equivalence relation. All equal fractions can therefore be collected in a class such
as {3/1, 6/2, 9/3, ..., 150/50, ...}, and every fraction belongs to one class. These classes are called
absolute rational numbers. Each such rational number can be represented by any one fraction of
its class; for example, 2/3, 4/6, 30/45 are representatives of one and the same absolute rational
number α = {2/3, 4/6, 6/9, 8/12, ..., 18/27, ..., 30/45, ...}. The preferred representation is by a
reduced fraction in which numerator and denominator have no common divisor greater than 1;
in the last example one also writes α = {2/3}. The curly brackets serve to distinguish between the
class and its representative.


Arithmetical operations and order. In determining the arithmetical operations and the order for 
absolute rational numbers one is guided by the standard rules for fractions. The relevant definitions 
are chosen such that the extended number system has the properties that have proved sensible in 
centuries of calculating practice. The operations of the first and second kind for the numbers 
α = {m/n} and β = {p/q} with n ≠ 0 and q ≠ 0 are defined as follows:

α + β = {(mq + pn)/(nq)}, for example, {2/3} + {10/7} = {(2 · 7 + 3 · 10)/(3 · 7)} = {44/21},
α − β = {(mq − np)/(nq)}, provided that mq > np,
α · β = {(mp)/(nq)}, for example, {3/5} · {20/9} = {(3 · 20)/(5 · 9)} = {60/45} = {4/3},
α/β = {(mq)/(np)}, (p ≠ 0), for example, {3/5}/{7/1} = {(3 · 1)/(5 · 7)} = {3/35}.

The last rule shows that division, except by 0 = {0/q}, can be performed without restriction.
Subtraction and division are the inverse operations to addition and multiplication. Subtraction cannot
always be carried out. Operations on more than two numbers are defined correspondingly.
An order is defined by analogy to equality: α > β if mq > np; for example, 8/9 > 11/14 because
8 · 14 > 9 · 11.
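
The definitions of equality, of the operations and of the order for fractions can be tried out on pairs of natural numbers. In the Python sketch below a class is represented by its reduced fraction, written as a pair (numerator, denominator); all function names are hypothetical, and subtraction is left out for brevity.

```python
from math import gcd


def reduced(m: int, n: int):
    """Reduced representative (numerator, denominator) of the class of m/n."""
    if n == 0:
        raise ValueError("the denominator must not be 0")
    g = gcd(m, n)
    return m // g, n // g


def same_class(x, y) -> bool:
    """m/n = p/q if and only if m*q = n*p."""
    (m, n), (p, q) = x, y
    return m * q == n * p


def add(x, y):
    (m, n), (p, q) = x, y
    return reduced(m * q + p * n, n * q)


def mul(x, y):
    (m, n), (p, q) = x, y
    return reduced(m * p, n * q)


def div(x, y):
    (m, n), (p, q) = x, y
    if p == 0:
        raise ZeroDivisionError("division by the class of 0 is excluded")
    return reduced(m * q, n * p)


def greater(x, y) -> bool:
    (m, n), (p, q) = x, y
    return m * q > n * p


print(add((2, 3), (10, 7)), mul((3, 5), (20, 9)), div((3, 5), (7, 1)), greater((8, 9), (11, 14)))
```

Replacing an argument by another fraction of the same class, for example (2, 3) by (4, 6), leaves every result unchanged, which is the independence of the representative required of such definitions.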




All these definitions are meaningful only if they are independent of the choice of representatives
for α and β. This will now be demonstrated for the order.

Assumption: {m/n} > {p/q}, that is, mq > np; let m₁/n₁ and p₁/q₁ be other fractions in the classes
{m/n} and {p/q}, respectively, that is, m/n = m₁/n₁ and p/q = p₁/q₁.

Assertion: Then {m₁/n₁} > {p₁/q₁}.

Proof: Multiplication of mq > np by n₁q₁ shows that m n₁ q q₁ > n n₁ p q₁. Since, by assumption,
m n₁ = m₁ n and p q₁ = p₁ q, substitution gives m₁ n q q₁ > n n₁ p₁ q. After cancelling the factor nq on
both sides one obtains m₁ q₁ > n₁ p₁, as required.


The order relation on the domain of absolute rational numbers is total (linear) and Archimedean. 


This domain can be mapped, with preservation of the order, point by point to a so-called number
ray. The images lie dense on the ray, that is, between any two of them there is always another, for
example, between the images of α and β that of (α + β)/2.

The commutative, associative, and distributive laws for the operations and the other rules for
natural numbers are also valid for the absolute rational numbers. Here is a proof for the associative
law of multiplication: Let α = {m/n}, β = {p/q}, γ = {r/s}. Then [α · β] · γ = {(mp)/(nq)} · {r/s}
= {(mpr)/(nqs)} = {m/n} · {(pr)/(qs)} = α · [β · γ].


Absolute rational and natural numbers. The extended number system contains as part a domain 
corresponding exactly to the natural numbers, namely that of the numbers {1/1}, {2/1}, {3/1}, ..., 
{k/1}, ... For if two numbers of this kind are added or multiplied, another such number arises:
{k/1} + {l/1} = {(k + l)/1} and {k/1} · {l/1} = {(k · l)/1}. Furthermore, {k/1} > {l/1} holds if and
only if k > l. If one now assigns to {1/1} the natural number 1, to {2/1} the natural number 2, and
in general, to {k/1} the natural number k, then it turns out that one calculates with the numbers 
{k/1} just as with the natural numbers. 


In the system of absolute rational numbers the subsystem consisting of the numbers of the form 
{k/1} is isomorphic to the system of natural numbers with respect to the operations and the order 
defined there. 


From now on the numbers {k/1} can simply be written as k and can be treated as natural numbers, 
because they satisfy Peano’s axioms. Also in the other absolute rational numbers the curly 
brackets can be omitted, by writing, for example, m/n instead of {m/n}. Confusion cannot arise as 
long as the rules for calculations with fractions are observed. True, there is a conceptual difference: 
the fractions 3/7 and 6/14 are not identical, because their numerators and denominators differ, 
but their values are equal. However, the absolute rational numbers 3/7 and 6/14 are simply identical. 
This simplification is the motivation for the definitions of the arithmetical operations and the 
order. The definitions are chosen just so that the isomorphism mentioned above holds. It corresponds 
to Hankel’s principle (better: postulate) of permanence to define operations and order in the extended 
system in such a way that the laws which hold in the old system remain in force, as far as possible. 
In contrast to the principle of mathematical induction, which is a valid method of proof, Hankel’s 
postulate only states what is desirable in an extension of a number system. This principle has no 
power of conviction. One can now state: 


The system of absolute rational numbers is an extension of the system of natural numbers. It 
consists of the natural numbers and the fractions. 


Operations of the third kind. The power $\alpha^\beta = \gamma$ in which $\beta = n$ is a natural number is defined 
in the obvious way. In accordance with this, for a given value of $\gamma$ the numbers $\alpha = \sqrt[\beta]{\gamma}$ and 
$\beta = \log_\alpha \gamma$ are defined as quantities for which $\alpha^\beta = \gamma$, provided that they exist. But if $\beta$ is an absolute 
rational number, $\beta = r/s$, then $\alpha^\beta = \gamma$ is to be interpreted as $\gamma = \sqrt[s]{\alpha^r}$. If such numbers exist in 
the system of absolute rational numbers, then they are uniquely determined by these stipulations. 
Examples: 1. For $\alpha = 2/3$ and $\beta = 3$ one obtains $\gamma = \alpha^\beta = (2/3)^3 = (2/3) \cdot (2/3) \cdot (2/3) = 8/27$. 
When $\gamma = 8/27$ and $\beta = 3$ are given, one has $\alpha = \sqrt[3]{\gamma} = \sqrt[3]{8/27} = 2/3$, and for $\gamma = 8/27$ and 
$\alpha = 2/3$ one has $\beta = \log_\alpha \gamma = \log_{2/3} (8/27) = 3$. 

2. For $\alpha = 9/16$ and $\beta = 1/2$ one obtains $\gamma = \alpha^\beta = (9/16)^{1/2} = \sqrt{9/16} = 3/4$. When $\gamma = 3/4$ 
and $\beta = 1/2$ are given, one has $\alpha = \sqrt[1/2]{\gamma} = \sqrt[1/2]{3/4} = 9/16$, because $(9/16)^{1/2} = 3/4$; and for 
$\gamma = 3/4$ and $\alpha = 9/16$ one has $\beta = \log_\alpha \gamma = \log_{9/16} (3/4) = 1/2$. 

3. For $\alpha = 2$ and $\beta = 1/2$ one should have $\gamma = \alpha^\beta = 2^{1/2} = \sqrt{2}$. But this is not a rational 
number, as will be shown later. 
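
The three examples can be checked mechanically. The following is a minimal sketch, assuming Python's standard `fractions` module as a stand-in for the absolute rational numbers; the helper names `integer_root` and `rational_power` are illustrative and do not come from the text. It interprets $\alpha^{r/s}$ as the $s$th root of $\alpha^r$ and reports `None` when no rational value exists.

```python
from fractions import Fraction
from typing import Optional


def integer_root(value: int, degree: int) -> Optional[int]:
    """Return the exact non-negative integer root of value, or None if there is none."""
    if value < 0 or degree < 1:
        return None
    low, high = 0, max(1, value)
    while low <= high:  # binary search on integers avoids floating-point rounding
        mid = (low + high) // 2
        power = mid ** degree
        if power == value:
            return mid
        if power < value:
            low = mid + 1
        else:
            high = mid - 1
    return None


def rational_power(alpha: Fraction, beta: Fraction) -> Optional[Fraction]:
    """Compute alpha**beta for beta = r/s as the s-th root of alpha**r.

    Assumes alpha >= 0 and beta > 0. Returns None when the result is not rational.
    """
    r, s = beta.numerator, beta.denominator
    raised = alpha ** r  # a Fraction in lowest terms
    top = integer_root(raised.numerator, s)
    bottom = integer_root(raised.denominator, s)
    if top is None or bottom is None:
        return None
    return Fraction(top, bottom)


if __name__ == "__main__":
    print(rational_power(Fraction(2, 3), Fraction(3)))      # 8/27
    print(rational_power(Fraction(9, 16), Fraction(1, 2)))  # 3/4
    print(rational_power(Fraction(2), Fraction(1, 2)))      # None
```

Because a fraction in lowest terms has a rational $s$th root only if numerator and denominator are both perfect $s$th powers, testing the two integers separately is sufficient.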




3.3. The rational numbers Q 


In order to be able to subtract absolute rational numbers without restriction one introduces in 
the set of ordered pairs (see Chapter 14.) $(\alpha, \beta)$ of absolute rational numbers an equivalence 
relation, the so-called equality of difference: one writes $(\alpha, \beta) \sim (\alpha', \beta')$ if $\alpha + \beta' = \alpha' + \beta$. For 
example, $(12/5, 3/5) \sim (5/2, 7/10)$ because $12/5 + 7/10 = 5/2 + 3/5$. The equivalence class of the 
pair $(m, n)$ is called a rational number and is denoted for the time being by $\{(m, n)\}$. For these new 
numbers the arithmetical operations are defined so that they are independent of the representatives 
and that subtraction can always be performed. The proofs of the laws of arithmetic (see Chapter 1.) 
are based on the fact that these laws hold for absolute rational numbers. 


Signed numbers. To remove the clumsy notation $\{(m, n)\}$ one begins by observing that a number 
pair $(m, n)$ for $m > n$ can be written in the form $(n + k, n)$, for $m < n$ in the form $(m, m + k)$, 
and for $m = n$ in the form $(m, m)$; for example, $(2, 1/2) = (1/2 + 3/2, 1/2)$, $(3/4, 1) = (3/4, 3/4 + 1/4)$, 
$(2/5, 4/10) = (2/5, 2/5)$. One then introduces a new notation: 


$\{(m, n)\} = (+k)$ when $m = n + k$ $(k > 0)$; for example, $\{(2, 1/2)\} = (+3/2)$, 
$\{(m, n)\} = (-k)$ when $n = m + k$ $(k > 0)$; for example, $\{(3/4, 1)\} = (-1/4)$, 
$\{(m, n)\} = (0)$ when $m = n$; for example, $\{(2/5, 4/10)\} = (0)$. 
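
The passage from pairs to signed numbers can be sketched in a few lines. This is only an illustration under stated assumptions: Python's `Fraction` stands in for the absolute rational numbers, and the names `equal_difference` and `signed_form` are hypothetical.

```python
from fractions import Fraction
from typing import Tuple

Pair = Tuple[Fraction, Fraction]  # a pair (m, n) of absolute rational numbers


def equal_difference(p: Pair, q: Pair) -> bool:
    """(m, n) ~ (m', n') holds exactly when m + n' == m' + n."""
    return p[0] + q[1] == q[0] + p[1]


def signed_form(p: Pair) -> Tuple[str, Fraction]:
    """Return the sign and the absolute value k of the class {(m, n)}."""
    m, n = p
    if m > n:
        return "+", m - n  # m = n + k
    if m < n:
        return "-", n - m  # n = m + k
    return "0", Fraction(0)


if __name__ == "__main__":
    a = (Fraction(12, 5), Fraction(3, 5))
    b = (Fraction(5, 2), Fraction(7, 10))
    print(equal_difference(a, b))            # True
    print(signed_form(a) == signed_form(b))  # True: same class, same signed number
```

Only differences that exist among absolute rational numbers are formed (the larger component minus the smaller), so the sketch never leaves the system it starts from.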


The fact that this notation is independent of the representative is a consequence of the equality 
of difference. The plus and minus signs are called the sign of the number. They must not be confused 
with the operational symbols of the same shape. The numbers $\alpha = (+k)$ are called positive, the 
numbers $\beta = (-k)$ negative, and $k$ is called the absolute value of $\alpha$ or $\beta$, in symbols $k = |\alpha|$, $k = |\beta|$. 
Here $k$ is an absolute rational number. An order for rational numbers is defined as follows: 


$\alpha > \beta$ holds in each of the following cases: 
when $\alpha$ is positive, $\beta$ positive or zero, and $|\alpha| > |\beta|$; for example, ... $> (+5/6)$; 
when $\alpha$ is positive and $\beta$ is negative; for example, $(+1/100) > (-10)$; 
when $\alpha$ is negative or zero, $\beta$ is negative, and $|\alpha| < |\beta|$; for example, $(-2/3) > (-1)$. 


This definition is also compatible with the equivalence relation and satisfies the condition of 
transitivity. The monotonic laws hold for addition, and also for multiplication by a positive factor. 
But the order relation is reversed when both sides are multiplied by the same negative number; 
for example, 

$(+5) > (+2)$ implies that $(+5) + (-1) > (+2) + (-1)$ and $(+5) \cdot (+3/4) > (+2) \cdot (+3/4)$, 
but $(+5) \cdot (-1) < (+2) \cdot (-1)$. 


The Archimedean axiom, suitably modified, holds just as for absolute rational numbers. 


Rational numbers and absolute rational numbers. The system of rational numbers contains as a 
proper part that of the positive rational numbers. By assigning to the number $(+k)$ its absolute 
value $k$ this part is mapped one-to-one onto the system of absolute rational numbers. The 
arithmetical operations, as far as they can be performed, and the order are preserved, for example, 
$(+k) \cdot (+l) = (+k \cdot l)$. 


The system of positive rational numbers is isomorphic to that of the absolute rational numbers 
with respect to the operations, as far as they can be performed, and to the order. 


Since $(+k) + (-l) = (+k) - (+l)$, the notation can be further simplified by omitting the plus 
sign; for example, $(+3/4) + (-2/5) = (+3/4) - (+2/5) = 3/4 - 2/5$. 
The power $\alpha^\beta$ for positive $\alpha$ is defined as follows: 

$\alpha^\beta = |\alpha|^{|\beta|}$ when $\beta > 0$, 
$\alpha^\beta = 1$ when $\beta = 0$, 
$\alpha^\beta = 1/|\alpha|^{|\beta|}$ when $\beta < 0$. 

$\alpha^\beta$ has a meaning only when $|\alpha|^{|\beta|}$ has a meaning in the system of absolute rational numbers; 
$\sqrt[\beta]{\alpha}$ and $\log_\beta \alpha$ are only defined for certain positive $\alpha$ and $\beta$ ($\beta \ne 1$) and then correspond 
to the definitions for absolute rational numbers. 


The system of rational numbers is an extension of that of absolute rational numbers. In it the 
arithmetical operations of the first and second kind can be performed without restriction, but 
not those of the third kind. 


3.4. The integers Z 


Instead of constructing on the basis of the natural numbers first the system of absolute rational 
numbers, and then that of rational numbers, as has been done here following the practice in some 
schools, one could have followed another path (see Chapter 1.). There the domain of integers 
$0, \pm 1, \pm 2, \ldots$ is first constructed from ordered pairs of natural numbers with equal differences, 
and then the domain of rational numbers is obtained by means of pairs of integers with equal quotients. 
The domain of integers contains as part that of the positive integers, which is isomorphic to 
that of the natural numbers. The domain of rational numbers, in its turn, contains as part the 
domain of numbers $\pm k$, which is isomorphic to that of the integers. In both cases there are two 
extensions. 


absolute rational numbers: $3/4, 9/7, \ldots$ 
integers: $0, \pm 1, \pm 2, \pm 3, \ldots$ 
rational numbers: $\pm 1/3, \pm 17/4, \ldots$ 


3.5. The real numbers R 


Like the absolute rational numbers on the number ray, so the rational numbers are dense on the 
number line. But they do not fill the line without gaps, as the following arguments show. If to every 
segment $s$ a positive number $\alpha$ is to be assigned as its length, then by means of a sequence of 
right-angled triangles segments $s_1, s_2, s_3, s_4, \ldots$ can be constructed (Fig.) to which there belong 
the positive numbers $\alpha_1 = 1, \alpha_2, \alpha_3, \alpha_4, \ldots$, say. By the theorem of Pythagoras $\alpha_2^2 = 2$, 
$\alpha_3^2 = 3$, $\alpha_4^2 = 4$, ... One should therefore have $\alpha_2 = \sqrt{2}$, $\alpha_3 = \sqrt{3}$, $\alpha_4 = \sqrt{4}$, ... These numbers 
correspond on the number line to the points obtained by applying at the origin the segments 
$s_1, s_2, s_3, s_4, \ldots$ But the numbers $\alpha_2$ and $\alpha_3$ are not rational; for example, if $\alpha_2$ were a rational 
number $r/s$ (with $r$ and $s$ coprime), then one should have $r^2/s^2 = 2$. However, this is impossible, 
because if $r$ and $s$ are coprime, then so are $r^2$ and $s^2$, and the fraction $r^2/s^2$ cannot be cancelled; 
it can therefore be an integer only if $s = 1$, and no integer $r$ satisfies $r^2 = 2$. The proof that 
$\alpha_3$, $\alpha_5$, etc. are not rational is quite the same, but for $\alpha_4$, $\alpha_9$, and generally for $\alpha_n$ when 
$n$ is a square, it leads to a fraction $r/s$ with $s = 1$, that is, to an integer. 

Also, to the perimeter of a circle with rational diameter $d$ there should correspond, by geometric 
theorems, the length $\pi \cdot d$, and Johann Heinrich LAMBERT (1728-1777) already proved that 
this is not a rational number. 

3.5-1 Construction of the segments $s_1, s_2, \ldots$ and the points $A_1, A_2, \ldots$ on the number line 

If every segment is to have a numerical measure as its length, then a new domain of numbers 
is needed, an extension of the domain of rational numbers. This new domain can no longer be 
constructed, as in the previous cases, by number pairs. But hints for its construction are provided 
by a theoretical analysis of the measuring process for segments. To measure a segment accurately 
in terms of a unit segment e, one uses in succession the segments e, e/10, e/100, ... Every subsequent 
measurement contributes a further decimal to the numerical values of the length. A decimal frac- 
tion represents a rational number only when it terminates or is periodic. In general, it neither 
terminates nor is periodic. 


The infinite decimal fractions form the domain of real numbers. 


The domain of real numbers comprises that of the rational numbers. For every terminating 
decimal fraction can also be written as an infinite decimal fraction with the digit 9 from a certain 
place onwards, for example, 7.58 = 7.57999... 


Approximation by rational numbers. Every real number can be approximated with arbitrary 
accuracy by rational numbers. This is utilized in every measurement, which in practice breaks off 
after a definite number of decimal places; for example, if one third of a segment of length 16 m is 
to be given in the decimal system within an accuracy of $10^{-4}$ m, then four decimals are sufficient: 
$5\tfrac{1}{3}$ m $\approx$ 5.3333 m. 




A decimal fraction is regarded as completely determined if the integer before the decimal point 
is known, together with a rule by which the successive decimals can be formed. For example, to 
calculate the number $\alpha_2$ with $\alpha_2^2 = 2$, one has the following scheme: 


$1 < \alpha_2 < 2$, because $1^2 < 2 < 2^2$, 
$1.4 < \alpha_2 < 1.5$, because $1.4^2 < 2 < 1.5^2$, 
$1.41 < \alpha_2 < 1.42$, because $1.41^2 < 2 < 1.42^2$, 
$1.414 < \alpha_2 < 1.415$, because $1.414^2 < 2 < 1.415^2$, etc. 


Therefore the number $\alpha_2 = \sqrt{2}$ lies within the infinitely many intervals with the rational end-points 
(1, 2), (1.4, 1.5), (1.41, 1.42), ..., which are nested within one another. A ‘nest of intervals’ means 
that the left end-points of the intervals increase steadily, the right ones decrease, and the lengths 
of the intervals become arbitrarily small. By such a nest of intervals the infinite decimal fraction 
$\alpha_2 = \sqrt{2}$ is uniquely determined. All real numbers can be given correspondingly with arbitrary 
accuracy. 
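
The scheme above can be reproduced with exact integer arithmetic. The following sketch assumes Python 3.8 or later (for `math.isqrt`); the function name `decimal_nest` is illustrative. At step $k$ it yields the interval of length $10^{-k}$ with rational end-points that encloses $\sqrt{n}$.

```python
from fractions import Fraction
from math import isqrt
from typing import Iterator, Tuple


def decimal_nest(n: int, steps: int) -> Iterator[Tuple[Fraction, Fraction]]:
    """Yield intervals (lo, hi) with lo**2 <= n < hi**2 and hi - lo == 10**-k."""
    for k in range(steps):
        scale = 10 ** k
        lo = isqrt(n * scale * scale)  # floor of sqrt(n) * 10**k, computed exactly
        yield Fraction(lo, scale), Fraction(lo + 1, scale)


if __name__ == "__main__":
    for lo, hi in decimal_nest(2, 4):
        print(float(lo), "<", "sqrt(2)", "<", float(hi))
```

Each left end-point is at least as large as the previous one and each right end-point at most as large, so the intervals are indeed nested.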


Order and arithmetical operations. An order for the real numbers can be laid down lexicographically: 
a positive decimal fraction $\alpha = a.a_1a_2a_3\ldots$ is said to be greater than a positive decimal fraction 
$\beta = b.b_1b_2b_3\ldots$ if $a > b$ or, in case $a = b$, if the first decimal $a_i$ that differs from $b_i$ is greater 
than $b_i$; for example, $3.78634\ldots > 3.78629\ldots$ If one orders the negative numbers by mirror image, 
the laws of transitivity, irreflexivity, and comparability are satisfied, so that one has a relationship 
of total or linear order. The real numbers can be mapped, with preservation of the order, onto the 
points of a line, the number line, which now is filled without gaps. The arithmetical operations of 
the first and second kind for real numbers given by nests of intervals are defined by means of the 
rational end-points of the intervals. A typical example is division; the other operations are 
treated similarly. To begin with, suppose that $\alpha/\beta$ is to be considered, with $\alpha$ and $\beta$ positive. Let 
$\alpha$ and $\beta$ be given by the rational nests of intervals $(a_i, a_i')$ and $(b_i, b_i')$. Since $a_i < \alpha < a_i'$ and 
$1/b_i' < 1/\beta < 1/b_i$, one has $a_i/b_i' < \alpha/\beta < a_i'/b_i$. The intervals $\{(a_i/b_i', a_i'/b_i)\}$ form a nest of intervals 
for the quotient $\alpha/\beta$: the lower end-points form a bounded increasing and the upper end-points a 
bounded decreasing sequence of rational numbers. 

The length $l_n$ of the $n$th interval is 

$$l_n = \left| \frac{a_n'}{b_n} - \frac{a_n}{b_n'} \right| = \left| \frac{a_n' b_n' - a_n b_n}{b_n b_n'} \right| = \frac{|(a_n' - a_n)\, b_n' + a_n (b_n' - b_n)|}{|b_n b_n'|}.$$

The factors $b_n'$ and $a_n$ in the numerator are bounded. Since the expressions in brackets can be made 
arbitrarily small when $n$ is sufficiently large, the same holds for the entire numerator. Since $\beta \ne 0$, 
there is a positive lower bound for all $b_n$ and $b_n'$ with sufficiently large $n$. The positive real number 
determined by this nest of intervals is denoted by $\alpha/\beta$. If $\alpha$ or $\beta$ or both are negative, the arguments 
suitably modified lead to the required result. 

In order to define the power $\alpha^\beta$ ($\alpha > 0$) one forms the intervals $(a_i^{b_i}, {a_i'}^{b_i'})$. They represent again 
a nest of intervals; however, in general, the end-points of the intervals are no longer rational. But 
the following completeness theorem holds: 


An arbitrary nest of intervals $(\varrho_i, \varrho_i')$ with real end-points determines uniquely a real number $\varrho$ 
with the property $\varrho_i \le \varrho \le \varrho_i'$ for all $i$. 


The proof is omitted here. The nest of intervals $\{(a_i^{b_i}, {a_i'}^{b_i'})\}$ therefore determines a real number, 
which is denoted by $\alpha^\beta$. 
The arithmetical operations so defined satisfy the same laws as in the domain of rational numbers. 
The domain of real numbers is an extension of the domain of rational numbers. It contains, in 
particular, all roots of positive numbers. 




The construction of the real numbers could only be sketched here. It should be mentioned that 
in a formal treatment it is necessary to define an equivalence relation for nests of intervals and to 
form the corresponding classes. This method, carried through rigorously, is a genuine constructive 
way of obtaining the real numbers. 


Other methods of defining the real numbers. EUDOXUS (about 408-355 B.C.) can be regarded 
as a forerunner in the development of a theory of the real numbers. His geometrically orientated 
ideas were taken up by Karl WEIERSTRASS (1815-1897) and Richard DEDEKIND (1831-1916) and 
were further developed, utilizing modern arithmetic and analytic methods. The method of nests 
of intervals goes back to WEIERSTRASS. DEDEKIND introduced the real numbers by cuts in the domain 
of rational numbers. Georg CANTOR (1845-1918) constructed them by means of Cauchy fundamental 
sequences. 

The domains obtained by these methods can be mapped onto one another by order-preserving 
isomorphisms. Therefore, structurally there is only a single domain of real numbers. 




3.6. Continued fractions 


Continued fractions of order n. Let $b_0, b_1, b_2, \ldots, b_n$ be integers with $b_k > 0$ for $k > 0$. The 
continued fraction of order $n$ with the denominators $b_1, b_2, \ldots, b_n$ and the initial term $b_0$ is defined by 
the following expression, which is also abbreviated as $[b_0; b_1, b_2, \ldots, b_n]$: 

$b_0 + 1/(b_1 + 1/(b_2 + \cdots + 1/b_n))$. 

Example: $n = 3$; $b_0 = 2$, $b_1 = 3$, $b_2 = 1$, $b_3 = 4$: 
$[2; 3, 1, 4] = 2 + 1/(3 + 1/(1 + 1/4))$. 


Approximating fractions. Let « be a continued fraction of order n. By an approximating fraction 
for « of order k (k < n) one understands the continued fraction breaking off at the kth denominator. 
From the definition of a continued fraction of order k it is clear that it can also be written as an 
ordinary fraction. Then one obtains: 


$[b_0; b_1] = b_0 + 1/b_1 = (b_0 b_1 + 1)/b_1 = A_1/B_1$, 
$[b_0; b_1, b_2] = b_0 + 1/(b_1 + 1/b_2) = (b_2(b_0 b_1 + 1) + b_0)/(b_1 b_2 + 1) = A_2/B_2$, 
$[b_0; b_1, b_2, b_3] = b_0 + 1/(b_1 + 1/(b_2 + 1/b_3))$ 
$= (b_3[b_2(b_0 b_1 + 1) + b_0] + b_0 b_1 + 1)/(b_3(b_1 b_2 + 1) + b_1) = A_3/B_3$. 

Here all the $A_i$ and $B_i$ are integers; for example, for the continued fraction [2; 3, 1, 4, 2, 1, 2] 
the second approximating fraction is $[2; 3, 1] = 2 + 1/(3 + 1) = 9/4$, with $A_2 = 9$, $B_2 = 4$. 

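If one defines for the sake of completeness $A_0 = b_0$, $A_{-1} = 1$, $A_{-2} = 0$; $B_0 = 1$, $B_{-1} = 0$, 
$B_{-2} = 1$, then the following recursion formulae can be established by mathematical induction: 

$A_k = b_k A_{k-1} + A_{k-2}$, $B_k = b_k B_{k-1} + B_{k-2}$. 

Example: 

k      -2   -1    0    1    2    3    4    5    6 
b_k               2    3    1    4    2    1    2     given quantities 
A_k     0    1    2    7    9   43   95  138  371     calculated quantities 
B_k     1    0    1    3    4   19   42   61  164 

(The columns k = -2 and k = -1 are fixed by definition.) 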

The example [2; 3, 1, 4, 2, 1, 2] shows that in order to reach $A_k$ (the case of $B_k$ is analogous) one 
multiplies the $b_k$ standing above $A_k$ by the left neighbour $A_{k-1}$ (which has already been computed) 
and adds its left neighbour $A_{k-2}$. For instance, at the places of $A_0$ and $A_3$ one calculates 
$0 + 1 \cdot 2 = 2$ and $7 + 9 \cdot 4 = 43$. This leads to the approximating fractions $A_1/B_1 = 7/3 = 2.33\ldots$; 
$A_2/B_2 = 9/4 = 2.25$; $A_3/B_3 = 43/19 = 2.2631\ldots$; $A_4/B_4 = 95/42 = 2.2619\ldots$; $A_5/B_5 = 138/61 
= 2.2623\ldots$; $A_6/B_6 = 371/164 = 2.262195\ldots = [2; 3, 1, 4, 2, 1, 2]$. 
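
The rule "multiply by the left neighbour and add its left neighbour" translates directly into a loop. The sketch below is illustrative only; the function name `convergents` is an assumption, and the starting values are $A_{-2} = 0$, $A_{-1} = 1$, $B_{-2} = 1$, $B_{-1} = 0$.

```python
from fractions import Fraction
from typing import List, Sequence


def convergents(terms: Sequence[int]) -> List[Fraction]:
    """Return the approximating fractions A_k/B_k of [b_0; b_1, ..., b_n]."""
    a_prev2, a_prev1 = 0, 1  # A_{-2}, A_{-1}
    b_prev2, b_prev1 = 1, 0  # B_{-2}, B_{-1}
    result = []
    for b in terms:
        a_k = b * a_prev1 + a_prev2  # A_k = b_k * A_{k-1} + A_{k-2}
        b_k = b * b_prev1 + b_prev2  # B_k = b_k * B_{k-1} + B_{k-2}
        result.append(Fraction(a_k, b_k))
        a_prev2, a_prev1 = a_prev1, a_k
        b_prev2, b_prev1 = b_prev1, b_k
    return result


if __name__ == "__main__":
    print(", ".join(str(c) for c in convergents([2, 3, 1, 4, 2, 1, 2])))
    # 2, 7/3, 9/4, 43/19, 95/42, 138/61, 371/164
```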

A comparison of the final fraction with the approximating fractions indicates the justification 
for the name: 


The approximating fractions come progressively closer to the final fraction, alternately from 
below and above, with increasing accuracy. 


Every rational number can be expanded in a continued fraction. 


Example: If 964/437 is to be expanded in a continued fraction, one obtains: $r = 964/437 
= 2 + 90/437$; $r = b_0 + 1/r_1$; $b_0 = [r]$, where $[r]$ denotes the greatest integer not exceeding $r$; 
$r_1 = 437/90 = 4 + 77/90$; $r_1 = b_1 + 1/r_2$; $r_1 > 1$; if $r_1$ is not integral, $b_1 = [r_1]$, the greatest 
integer not exceeding $r_1$. The continuation of the method finally yields: 

$r = 964/437 = 2 + 1/(4 + 1/(1 + 1/(5 + 1/(1 + 1/12))))$, 
$r = [2; 4, 1, 5, 1, 12]$. 

The example $1/2 = 0 + 1/2 = 0 + 1/(1 + 1/1)$, hence [0; 2] = [0; 1, 1], shows that there is no 
uniqueness of the expansion in a continued fraction. This can be achieved by postulating that 
$b_n > 1$, which can always be satisfied because $[b_0; b_1, b_2, \ldots, b_n, 1] = [b_0; b_1, b_2, \ldots, b_n + 1]$. 
The expansion in a continued fraction makes it possible to approximate a rational number with 
an unwieldy numerator and denominator by others with smaller numerators and denominators, 
a fact that occasionally is of importance for technical problems. 
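
The expansion procedure of the example (split off the greatest integer, invert the remainder, repeat) can be sketched as follows. The function name `continued_fraction` is illustrative, and Python's `Fraction` is assumed as the representation of a rational number.

```python
from fractions import Fraction
from typing import List


def continued_fraction(r: Fraction) -> List[int]:
    """Expand a rational number r into [b_0; b_1, ..., b_n] with b_n > 1 (or n = 0)."""
    terms = []
    while True:
        b = r.numerator // r.denominator  # greatest integer not exceeding r
        terms.append(b)
        rest = r - b
        if rest == 0:  # r was integral: the expansion terminates
            return terms
        r = 1 / rest   # r_{k+1} = 1/(r_k - b_k) > 1


if __name__ == "__main__":
    print(continued_fraction(Fraction(964, 437)))  # [2, 4, 1, 5, 1, 12]
```

The loop terminates because the denominators of the successive remainders strictly decrease, which is the Euclidean algorithm in disguise.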


Non-terminating continued fractions. Non-terminating continued fractions $[b_0; b_1, b_2, \ldots]$ are 
used to represent real numbers and can be treated correspondingly. The approximation is better 
than that by decimal fractions. Conversely, every real number can be represented by a (finite or 
infinite) continued fraction, in fact: 

The expansion of a real number in a continued fraction terminates if and only if the number is 
rational. 

Here, too, the approximating fractions come progressively closer to the real number, alternately 
from below and above, with increasing accuracy. 


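Example: $\alpha = \sqrt{2}$ is to be expanded in a continued fraction. With the symbol $[\alpha]$ denoting the 
greatest integer less than, or equal to, $\alpha$, one obtains the sequence of denominators: 
1. $\alpha = b_0 + (\alpha - [\alpha])$, $b_0 = [\alpha]$, $\alpha - [\alpha] = 1/\alpha_1$; 
2. $\alpha_1 = b_1 + (\alpha_1 - [\alpha_1])$, $b_1 = [\alpha_1]$, $\alpha_1 - [\alpha_1] = 1/\alpha_2$; 
3. $\alpha_2 = b_2 + (\alpha_2 - [\alpha_2])$, $b_2 = [\alpha_2]$, $\alpha_2 - [\alpha_2] = 1/\alpha_3$; 
... 

Now one makes use of the inequality $1 < \sqrt{2} < 2$ and rearranges: 
1. $\alpha = 1 + (\sqrt{2} - 1)$, $b_0 = 1$, $\sqrt{2} - 1 = 1/\alpha_1$, $\alpha_1 = (\sqrt{2} + 1)/[(\sqrt{2} - 1)(\sqrt{2} + 1)] = \sqrt{2} + 1$; 
2. $\alpha_1 = 2 + (\sqrt{2} - 1)$, $b_1 = 2$, $\sqrt{2} - 1 = 1/\alpha_2$, $\alpha_2 = \sqrt{2} + 1$; 
3. $\alpha_2 = 2 + (\sqrt{2} - 1)$, $b_2 = 2$, $\sqrt{2} - 1 = 1/\alpha_3$, $\alpha_3 = \sqrt{2} + 1$; 
... 

By continuing the process one obtains the periodic continued fraction [1; 2, 2, ...] belonging 
to $\sqrt{2}$. 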


Approximating fractions can be found by means of the scheme for the $A_k$ and $B_k$: 

k      -2   -1    0    1    2    3    4    5    6 
b_k               1    2    2    2    2    2    2 
A_k     0    1    1    3    7   17   41   99  239 
B_k     1    0    1    2    5   12   29   70  169 

Hence, for example, $A_6/B_6 = 239/169 = 1.414\,201\ldots$, while $\sqrt{2} = 1.414\,214\ldots$ 

The approximating fractions converge to $\sqrt{2}$. 
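
For square roots of natural numbers the denominators can be generated without any floating-point arithmetic. The sketch below does not follow the rearrangement used in the example; it uses a standard integer recurrence for $\sqrt{D}$ (an assumption brought in for illustration), in which every complete quotient is kept in the form $(\sqrt{D} + m)/q$. The function name and the variables `m`, `q` are hypothetical.

```python
from math import isqrt
from typing import List


def sqrt_continued_fraction(d: int, count: int) -> List[int]:
    """Return the first `count` terms b_0, b_1, ... of the expansion of sqrt(d)."""
    a0 = isqrt(d)
    if a0 * a0 == d:  # d is a square: the expansion terminates at once
        return [a0]
    terms = [a0]
    m, q, a = 0, 1, a0  # current complete quotient is (sqrt(d) + m) / q
    while len(terms) < count:
        m = q * a - m
        q = (d - m * m) // q  # this division is always exact
        a = (a0 + m) // q     # greatest integer not exceeding (sqrt(d) + m) / q
        terms.append(a)
    return terms


if __name__ == "__main__":
    print(sqrt_continued_fraction(2, 7))  # [1, 2, 2, 2, 2, 2, 2]
```

The output for $d = 2$ agrees with the row $b_k$ of the scheme, and the repetition of the pair `(m, q)` is what makes the expansion periodic.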

The fact that the expansion of $\sqrt{2}$ in a continued fraction is periodic is no accident; periodic 
continued fractions always arise for the so-called quadratic irrationalities, that is, numbers of the 
form $(a + b\sqrt{D})/c$, where $a$, $b \ne 0$, $c \ne 0$, and $D$ are integers, and $D > 1$ is square-free. LAGRANGE 
showed that the expansion of a quadratic irrationality in a continued fraction is periodic. The 
converse, that a periodic continued fraction represents a quadratic irrationality, had already 
been proved by EULER. 


3.7. The complex numbers C 


In the domain of real numbers the arithmetical operations of the first and second kind can be 
carried out without restriction. This is not the case for the operations of the third kind; for example, 
the power $a^{1/n} = \sqrt[n]{a}$ does not exist when $a$ is negative and $n$ is even: there is no real number $a^{1/n}$. 
But square roots of negative real numbers are needed, for example, in the solution of a cubic equation 
by Cardano’s formula (see Chapter 4.), in fact, just in the so-called casus irreducibilis, when 
there are three distinct real solutions. In order to remove this restriction the number system is 
extended once more. 


Construction of the new numbers. One considers ordered pairs (a, b) of arbitrary real numbers 
a and b. This time the equivalence relation is the ordinary identity, that is, the pair (a, b) is called 
equivalent to (a’, b’) if and only if a = a’ and b = b’; every equivalence class consists of a single 
number pair. 

Such a pair (a, b) is called a complex number. The arithmetical operations of the first and second 
kind are defined by: 


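$z_1 \pm z_2 = (a_1 \pm a_2, b_1 \pm b_2)$, $z_1 z_2 = (a_1 a_2 - b_1 b_2, a_1 b_2 + b_1 a_2)$, 

$\dfrac{z_1}{z_2} = \left( \dfrac{a_1 a_2 + b_1 b_2}{a_2^2 + b_2^2}, \dfrac{b_1 a_2 - a_1 b_2}{a_2^2 + b_2^2} \right)$ for $z_2 \ne (0, 0)$, 

where $z_1 = (a_1, b_1)$ and $z_2 = (a_2, b_2)$. 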


It is easy to verify that subtraction and division are the inverse operations to addition and multiplica- 
tion. The commutative, associative, and distributive laws hold for addition and multiplication. 
For example, the distributive law is proved as follows: let $z_i = (a_i, b_i)$, $i = 1, 2, 3$, be three complex 
numbers. Then: $z_1[z_2 + z_3] = (a_1, b_1)(a_2 + a_3, b_2 + b_3)$ 

$= (a_1 a_2 + a_1 a_3 - b_1 b_2 - b_1 b_3, a_1 b_2 + a_1 b_3 + b_1 a_2 + b_1 a_3)$. 
On the other hand, $z_1 z_2 + z_1 z_3 = (a_1 a_2 - b_1 b_2, a_1 b_2 + b_1 a_2) + (a_1 a_3 - b_1 b_3, a_1 b_3 + b_1 a_3)$ 
$= (a_1 a_2 - b_1 b_2 + a_1 a_3 - b_1 b_3, a_1 b_2 + b_1 a_2 + a_1 b_3 + b_1 a_3)$. 
The two expressions are identical. 
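
The pair definitions can also be tried out numerically. The following is a minimal sketch, assuming pairs of `Fraction` values so that all checks are exact; the function names `add`, `mul`, and `div` are illustrative.

```python
from fractions import Fraction
from typing import Tuple

Pair = Tuple[Fraction, Fraction]  # the complex number (a, b)


def add(z: Pair, w: Pair) -> Pair:
    return (z[0] + w[0], z[1] + w[1])


def mul(z: Pair, w: Pair) -> Pair:
    # (a1, b1)(a2, b2) = (a1*a2 - b1*b2, a1*b2 + b1*a2)
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])


def div(z: Pair, w: Pair) -> Pair:
    norm = w[0] * w[0] + w[1] * w[1]  # a2**2 + b2**2
    if norm == 0:
        raise ZeroDivisionError("division by the pair (0, 0)")
    return ((z[0] * w[0] + z[1] * w[1]) / norm,
            (z[1] * w[0] - z[0] * w[1]) / norm)


if __name__ == "__main__":
    z1 = (Fraction(1), Fraction(2))
    z2 = (Fraction(3), Fraction(-1))
    z3 = (Fraction(-2), Fraction(5))
    print(mul(z1, add(z2, z3)) == add(mul(z1, z2), mul(z1, z3)))  # True
    print(mul(div(z1, z2), z2) == z1)                             # True
```

A check on sample values is of course no proof; it only confirms that the formulas have been transcribed consistently.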

Furthermore, all the laws for these operations that hold in the domain of real numbers are also 
satisfied here, except those in which an order relation ‘greater than’ occurs. There are many 
ways of introducing a total order in the domain C of complex numbers, for example, first by absolute 
value and for equal absolute value by argument; but it can be proved that no relationship of total 
order in C is compatible with addition and multiplication. 


Complex and real numbers. The domain of complex numbers contains as part that of the numbers 
(a, 0), which is isomorphic to the domain of real numbers with respect to the permitted operations: 
(a, 0) + (a’, 0) = (a + a’, 0) and (a, 0) · (a’, 0) = (aa’, 0). One can therefore treat such numbers 
by writing simply a instead of (a, 0). The numbers (0, b) are called purely imaginary. In particular, 
the complex number (0, 1) = i is called the imaginary unit. Brackets for the arithmetical operations 
can now be omitted, because errors need not be feared. One has (0, b) = (b, 0) · (0, 1) = bi. 
Next, i · i = (0, 1) · (0, 1) = (-1, 0) = -1, that is, 


$i^2 = -1$. 


Every complex number can be represented as the sum of a real and a purely imaginary number: 
z = a + bi. Here a is called the real part and b the imaginary part of z; a and b being real numbers. 


Graphical representation of the complex numbers. In the plane one draws a Cartesian rectangular 
system of axes and marks on the x-axis the real numbers in the usual way, on the y-axis the imaginary 
numbers with i as unit. To the complex number z = a + ib one assigns the point z with the 
coordinates (a, b) or the vector z leading from the origin to this point. These correspondences are 
one to one. To the sum $z_1 + z_2$ there corresponds by vector addition (according to the parallelogram 
rule) the vector $z_1 + z_2$ (Fig.). 

To give a geometric interpretation also for the product one represents z = a + bi in terms of 
the length $r$ of z and the angle $\varphi$ which this vector forms with the positive x-axis; $r$ is called the 
absolute value or modulus and $\varphi$ the argument or amplitude of z (Fig.). It should be observed that 
$\varphi$ is determined only up to multiples of $2\pi$ and is measured anticlockwise. 


3.7-1 Addition of complex numbers 
3.7-2 Modulus and argument of a complex number 
3.7-3 Multiplication of complex numbers 


The product $z_1 z_2 = r_1(\cos \varphi_1 + i \sin \varphi_1)\, r_2(\cos \varphi_2 + i \sin \varphi_2)$ can be transformed by means of 
the addition theorems for sine and cosine into $z_1 z_2 = r_1 r_2(\cos [\varphi_1 + \varphi_2] + i \sin [\varphi_1 + \varphi_2])$. This 
representation leads to the following geometric interpretation (Fig.): the triangle formed by the 
points O, $z_2$, and $z_1 \cdot z_2$ is similar to that formed by the points O, (+1), and $z_1$, because the angle 
$\varphi_1$ is common to the two triangles and the sides including it have the same ratio $r_1 r_2 : r_2 = r_1 : 1$. 
Thus, there is a simple geometric construction for the product. 


Powers and roots. As usual, $z^n$ for a natural number $n$ is defined by $z^0 = 1$, $z^{n+1} = z^n \cdot z$. By 
means of the addition theorems and mathematical induction one obtains the important formula 
of de Moivre: 

$z^n = r^n(\cos n\varphi + i \sin n\varphi)$. 

For $n = -1$ one obtains $z^{-1} = r^{-1}(\cos (-\varphi) + i \sin (-\varphi)) = r^{-1}(\cos \varphi - i \sin \varphi)$ 
and by mathematical induction $z^{-n} = r^{-n}(\cos (n\varphi) - i \sin (n\varphi))$. 

By $\sqrt[n]{z}$ one means a complex number $w$ whose $n$th power is equal to $z$, that is, a solution of the 
equation $w^n = z$. Let $w = \varrho(\cos \psi + i \sin \psi)$. Then from $w^n = z = r(\cos \varphi + i \sin \varphi)$ it follows by 
de Moivre’s formula that: 

$\varrho^n = r$, $\varrho = \sqrt[n]{r}$, $\psi = \varphi/n + k \cdot 2\pi/n$, or $w = \sqrt[n]{r}\,[\cos (\varphi/n + k \cdot 2\pi/n) + i \sin (\varphi/n + k \cdot 2\pi/n)]$. 

If $w \ne 0$, then $n$ distinct values arise for $k = 0, 1, 2, \ldots, n - 1$. In the domain of complex numbers 
the symbol $\sqrt[n]{z}$ is not restricted to a single value, but is many-valued. How this many-valuedness 
can be mastered is shown in the theory of Riemann surfaces (see Chapter 23.). In particular, 
when $z$ is a positive real number, then the uniquely determined positive real $n$th root is called 
the principal value. The extraction of roots can be performed without restriction in the domain 
of complex numbers. Among the $n$ values of $\sqrt[n]{z}$ is the number $r^{1/n}(\cos (\varphi/n) + i \sin (\varphi/n))$. 


Example: To obtain all values of $\sqrt[4]{-1}$, one sets: 


$z = 1\,[\cos (180° + k \cdot 360°) + i \sin (180° + k \cdot 360°)]$, 
$w = 1\,[\cos (180°/4 + k \cdot 90°) + i \sin (180°/4 + k \cdot 90°)]$; 
$k = 0$ gives $w_0 = \cos 45° + i \sin 45°$, 
$k = 1$ gives $w_1 = \cos 135° + i \sin 135°$, 
$k = 2$ gives $w_2 = \cos 225° + i \sin 225°$, 
$k = 3$ gives $w_3 = \cos 315° + i \sin 315°$. 


For $k = 4$ one has $\psi_4 = 405° = 360° + 45°$, hence 
$w_4 = w_0$, $w_5 = w_1$, ... (Fig.). 


Example: To obtain all values of $\sqrt[5]{+1}$, one sets: 

$z = 1\,[\cos (k \cdot 2\pi) + i \sin (k \cdot 2\pi)]$, 
$w = 1\,[\cos (2k\pi/5) + i \sin (2k\pi/5)]$; 
$k = 0$ gives the principal value $w_0 = +1$, 
$k = 1$ gives $w_1 = \cos 72° + i \sin 72°$, 
$k = 2$ gives $w_2 = \cos 144° + i \sin 144°$, 
$k = 3$ gives $w_3 = \cos 216° + i \sin 216°$, 
$k = 4$ gives $w_4 = \cos 288° + i \sin 288°$. 

For $k = 5$ one has $\psi_5 = 360°$, hence 
$w_5 = w_0$, $w_6 = w_1$, ... (Fig.). 

3.7-5 The complex values of $\sqrt[5]{+1}$ 
(fifth roots of unity) 
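
Both examples follow the same recipe, which can be sketched numerically. The code below assumes Python's standard `cmath` module and floating-point arithmetic, so the results are approximations; the function name `nth_roots` is illustrative.

```python
import cmath
import math
from typing import List


def nth_roots(z: complex, n: int) -> List[complex]:
    """Return the n values of the nth root of z, for k = 0, 1, ..., n - 1."""
    r, phi = cmath.polar(z)  # modulus r and argument phi, with -pi < phi <= pi
    root_r = r ** (1.0 / n)  # the positive real nth root of the modulus
    return [cmath.rect(root_r, (phi + 2 * math.pi * k) / n) for k in range(n)]


if __name__ == "__main__":
    for k, w in enumerate(nth_roots(-1, 4)):
        angle = math.degrees(cmath.phase(w)) % 360
        print(f"k = {k}: argument about {angle:.0f} degrees")
```

For $z = -1$ and $n = 4$ the arguments come out near 45°, 135°, 225°, and 315°, in agreement with the first example.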


The nth roots of unity lie on the circumference of the unit circle and divide it into n equal parts. 
For rational $\alpha$ the power $z^\alpha$ is defined in accordance with the corresponding definitions for absolute 
rational numbers and rational numbers. If $\alpha > 0$ and $\alpha = r/s$ with positive integers $r$ and $s$, then 
$z^\alpha$ is to be interpreted as $\sqrt[s]{z^r}$; if $\alpha < 0$ and $z \ne 0$, then $z^\alpha$ is defined as $1/z^{-\alpha}$. Of great importance 
is the so-called fundamental theorem of classical algebra, which was first proved, and in more than 
one way, by Gauss. 




The domain of complex numbers is algebraically closed; this means that every algebraic equation 
with complex coefficients has at least one solution in this domain. 


Historical remarks on complex numbers. Roots of negative numbers have been used since the middle 
of the 16th century and have been known since then as imaginary numbers. The mathematicians of the 17th 
century could rely on a book on algebra by Raffaele BOMBELLI, which dates from 1572 and contains 
a consistent theory of purely imaginary numbers. Later the theory of complex numbers was 
advanced by Johann BERNOULLI (1667-1748), Leonhard EULER (1707-1783), and above all by Carl 
Friedrich GAUSS (1777-1855). The representation of the complex numbers in the plane goes back 
to Caspar WESSEL (1745-1818) and Jean Robert ARGAND (1768-1822); it became a generally 
used form of representing complex numbers through the authority of GAUSS (the Gaussian plane). The 
complex numbers are the foundation of the theory of functions of a complex variable, or complex 
analysis. 


4. Algebraic equations 


4.1. The concept of an equation 
     Historical remarks 
     Equations. Solution sets 
     Equivalent equations 
     Solution of problems 
4.2. Linear equations 
     Linear equations with one variable 
     Linear equations with two variables 
     Graphical solution of linear equations and systems of equations 
4.3. Quadratic equations 
     Numerical solution of quadratic equations 
     Graphical solution of quadratic equations 
     Historical remarks 
4.4. Equations of degree three and four 
     The cubic equation 
     The quartic equations 
4.5. General theorems 
4.6. Systems of non-linear equations 
4.7. Algebraic inequalities 


4.1. The concept of an equation 


Historical remarks 


Together with numbers, equations belong to the first mathematical achievements of mankind. 
They occur in the oldest written mathematical documents, for example, in the cuneiform texts of 
the old Babylonians, which go back as far as the third millennium B. C., and in ancient Egyptian 
papyri dating from the Middle Kingdom, about 1800 B. C. 

In accordance with the structure of Babylonian society questions of sharing an inheritance were 
of great interest. The first-born son always received the largest share, the second more than the 
third, and so on. Here is one such sharing problem: 

‘10 brothers; 1 2/3 mines of silver. 
Brother has risen over brother (concerning his share). What he has taken, I do not know. 
The share of the eighth is 6 shekel. Brother after brother, how much has he taken?’ 


A mine was an ancient oriental measuring unit, which was subdivided into 60 shekel. The problem 
leads to an arithmetic progression; the youngest brother receives (2 + 48/60) shekel, and each time 
the next one (1 + 36/60) shekel more; the firstborn receives (17 + 12/60) shekel, and all ten together 
100 shekel or 1 2/3 mines. 
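
The stated solution can be verified with a few lines of exact arithmetic. This is only a check of the numbers given above, not a reconstruction of the Babylonian method; the function name `brother_shares` is hypothetical.

```python
from fractions import Fraction
from typing import List


def brother_shares(youngest: Fraction, step: Fraction, count: int) -> List[Fraction]:
    """Shares in shekel, listed from the youngest brother to the firstborn."""
    return [youngest + k * step for k in range(count)]


if __name__ == "__main__":
    shares = brother_shares(2 + Fraction(48, 60), 1 + Fraction(36, 60), 10)
    print(shares[-1] == 17 + Fraction(12, 60))  # True: share of the firstborn
    print(shares[-8] == 6)                      # True: share of the eighth brother
    print(sum(shares) == 100)                   # True: 100 shekel = 1 2/3 mines
```

Counting from the firstborn, the eighth brother is the third from the end of the list, and his share is indeed 6 shekel.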

While in this Babylonian problem the unknown is described fairly clearly, in Egyptian papyri 
it is denoted by the hieroglyph for ‘h’, which represents heap, collection. Such h-calculations 
occur rather frequently; they correspond to our linear equations. A comparison between an Egyptian 
text from the Moscow papyrus (see Table 10) and modern notation makes this point clear: 


Literal translation                                                         Modern notation 

Form of calculation of a heap, counted 1 1/2 times together with 4.         3x/2 + 4 = 10 
He has reached 10. Now what is the heap’s name? 
You calculate the magnitude of this 10 over this 4. What arises is 6.      10 - 4 = 6 
You calculate with 1 1/2 to find 1. What arises is 2/3.                     1 : 3/2 = 2/3 
You calculate 2/3 of these 6. What arises is 4.                             6 · 2/3 = 4 
Look: 4 is the name. You have calculated correctly.                         x = 4 


Before an algebraic symbolic language had been developed, equations had to be written in words. 
Even François VIÈTE (usually called VIETA, 1540-1603), who made great contributions to 
algebra, made do with the Latin verb aequare = to be equal. The equality sign = in common use 
nowadays was proposed by Robert RECORDE (1510-1558), Royal Court Physician, but it took a 
considerable time before it was generally accepted. He made this proposal in a textbook of 
algebra, written in dialogue form, with the title ‘The Whetstone of Witte’ (1557) and motivated it by 
saying (Fig.): ‘I will sette as I doe often in woorke use, a paire of paralleles, or Gemowe (twin) 
lines of one lengthe, thus: =====, bicause noe .2. thynges can be moare equalle’. 


4.1-1 From R. Recorde’s ‘Whetstone of Witte’, 1557. First 
occurrence of the equality sign. Immediately above the formulae 
is the motivation for the choice of this symbol 


Equations. Solution sets 


One starts out from a definite set of numbers, the fundamental domain, and of variables for which 
elements of the fundamental domain or of a subset, the domain of variability, may be substituted. 
In specifying the fundamental domain and the domain of variability, N stands for the set of natural 
numbers, Z for that of the integers, Q for that of the rational, R for that of the real, and C for that
of the complex numbers. In what follows, unless the contrary is stated, the fundamental domain 
is taken to be R. The concept of an equation can be explained by reference to the concept of an 
expression, which is defined inductively (see Chapter 15.). 


Expressions. All numbers and all variables are expressions. Sums, differences, products, and 
quotients of two expressions are again expressions. Also exponentiation and the extraction of roots 
from expressions yield new expressions. Division by zero is excluded; for the time being, in ex- 
ponentiation and extraction of roots the exponent is taken to be a positive integer and the radicand 
positive. 


The concept of an expression can be extended to include, for example, $\sin x$, $\log_a x$, $e^x$.

An expression $E_1$ is said to be equivalent to an expression $E_2$ if they assume the same value for
every substitution of the variables by the same numbers of the given domain of variability; for
example, 4a + 5a and 9a are equivalent expressions with respect to the set R of real numbers, while
the expressions $(x^2 + x)/x$ and x + 1 are not equivalent, because $(x^2 + x)/x$ is not defined for
x = 0, whereas x + 1 for x = 0 assumes the value 1. These two expressions are equivalent for the
set of all real numbers other than zero.
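
The dependence of equivalence on the domain of variability can be tried out numerically. The following is a minimal sketch, not part of the original text: the function names and the sample points are assumptions chosen for illustration, and testing finitely many points can refute equivalence but can never prove it.

```python
from fractions import Fraction

def left(x):
    # (x^2 + x)/x; raises ZeroDivisionError at x = 0, where it is undefined
    return (x * x + x) / x

def right(x):
    return x + 1

def equivalent_on(samples):
    """Return True if both expressions are defined and equal at every sample."""
    for x in samples:
        try:
            if left(x) != right(x):
                return False
        except ZeroDivisionError:
            return False  # one expression is undefined at this point
    return True

# Hypothetical sample points: thirds between -3 and 3, with and without zero.
nonzero = [Fraction(n, 3) for n in range(-9, 10) if n != 0]
with_zero = nonzero + [Fraction(0)]

print(equivalent_on(nonzero))    # True
print(equivalent_on(with_zero))  # False
```

Exact rational arithmetic is used so that the comparison is not disturbed by rounding.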


The following facts are evident: 


1. Every expression E is equivalent to itself. 2. If $E_1$ is equivalent to $E_2$, then $E_2$ is equivalent to $E_1$.
3. If $E_1$ is equivalent to $E_2$, and $E_2$ to $E_3$, then $E_1$ is equivalent to $E_3$.


The domain of definition of an expression involving a vari- 
able is the set of all numbers of the domain of variability for 
which the expression goes over into a number of the domain of 
variability ; for example, the domain of definition of the expres- 
sion (4a — 5)/3 consists of all real numbers, while that of 
x/(x — 3) contains all real numbers other than 3. The domain 
of definition for expressions with several variables is explained 
correspondingly. 

Equations. If two expressions $E_1$ and $E_2$ are linked by the
symbol of equality, an equation $E_1 = E_2$ arises. Here $E_1$ is called
the left-hand side, and $E_2$ the right-hand side, of the equation.

The domain of definition of an equation is the intersection of 
the domains of definition of all the expressions with variables 
occurring in it (Fig.). 


4.1-2 Domain of definition D of an equation with one variable as intersection of the domains of
definition $D_i$ of the expressions $E_i$ with variables (regions labelled in the figure:
fundamental domain, domain of variability)


82 4. Algebraic equations 


An equation whose expressions do not contain variables is a proposition in the sense of mathe-
matical logic, which can be true or false; for example, 3 + 2 = 5 and 3 · (5 + 2) = 20 + 1 are
true propositions, while 2 + 3 · 4 = 15 is a false proposition. But if the expressions contain variables,
then the equation is a predicate, for example, the equations 3x = -12, 4a + 3b = 11 or
$x^2 = (6x + 24)/3$. Only after numbers from the domain of definition of the equation have been
substituted for the variables does the predicate become a proposition, which may be true or false.


Solutions. Every number from the domain of definition of an equation with a single variable 
which after substitution for the variable makes the equation into a true proposition is called a 
solution of the equation, and one also says that the number solves or satisfies the equation. If an 
equation contains two, three, ..., or n variables, then a solution is an ordered pair, triple, ..., or
n-tuple of numbers with the following property: if the variables are replaced with due regard to
the order by the elements of the ordered pair, triple, ..., or n-tuple, then the equation goes over into
a true proposition of equality. 


Examples: 1. The equation 3x = -12 is satisfied by the real number -4, for 3 · (-4) = -12
is a true proposition. Since there are no other solutions, -4 is the solution of the equation.

2. The equation 4a + 3b = 11 is satisfied, for example, by the number pair (2, 1), for it is true
that 4 · 2 + 3 · 1 = 11. But there are further solutions, in fact, infinitely many, and (2, 1) is a
solution of the equation.

3. The equation $x^2 = (6x + 24)/3$ has the numbers -2 and +4 as solutions, because both
$(-2)^2 = [6 \cdot (-2) + 24]/3$ and $(+4)^2 = (6 \cdot 4 + 24)/3$ are true propositions. There are no other
solutions.

4. If the domain of variability for the equation $x^2 = 2$ is taken to be the set Q of rational numbers,
then the equation has no solution, because there is no rational number whose square is equal
to 2.


Solution set. The set of all solutions of an equation relative to its domain of definition is called
the solution set S of the equation. An equation is called inconsistent or consistent according as S
is, or is not, the empty set ∅.


Consistent equations                              Inconsistent equations
7x = -28 for x ∈ Z; S = {-4}                      7x = -28 for x ∈ N; S = ∅
$x^2 = 9$ for x ∈ R; S = {-3; +3}                 $x^2 = -9$ for x ∈ R; S = ∅
$4a^2 = 1$ for a ∈ Q; S = {-1/2; +1/2}            $4a^2 = 1$ for a ∈ Z; S = ∅
2x + x = 3x for x ∈ C; S = C                      3x = 3x + 1 for x ∈ C; S = ∅
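
The way a solution set depends on the domain of variability can be imitated by filtering candidates. This is only a sketch under stated assumptions: the infinite sets N, Z and Q are replaced by small finite windows (hypothetical names `N_window`, `Z_window`, `Q_window`), and N is taken to start at 0, which the text does not specify.

```python
from fractions import Fraction

def solution_set(equation, candidates):
    """Collect every candidate that turns the equation into a true statement."""
    return {x for x in candidates if equation(x)}

# Finite stand-ins for the infinite domains (an assumption of this sketch).
N_window = range(0, 51)
Z_window = range(-50, 51)
Q_window = {Fraction(p, q) for p in range(-10, 11) for q in range(1, 5)}

print(solution_set(lambda x: 7 * x == -28, Z_window))            # {-4}
print(solution_set(lambda x: 7 * x == -28, N_window))            # set()
print(sorted(solution_set(lambda a: 4 * a * a == 1, Q_window)))  # [Fraction(-1, 2), Fraction(1, 2)]
print(solution_set(lambda a: 4 * a * a == 1, Z_window))          # set()
```

An empty result inside a window does not by itself prove inconsistency; here the windows are simply large enough to contain every solution of the equations shown.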


A consistent equation with one variable is called universally valid if all the elements of the domain 
of definition are solutions; for example, 2x + x = 3x is universally valid in the set of complex 
numbers. A consistent equation with n variables is called universally valid if every ordered n-tuple 
of numbers from the given domain of variability is a solution of the equation. For example, 
$(a + b)^2 = a^2 + 2ab + b^2$ for a, b ∈ R is a universally valid equation, because it is satisfied by
every pair (a, b) of real numbers. Every transformation carrying one expression into an equivalent
expression involves a chain of universally valid equations; for example, the transformation
(4a + 7a) · 2 = 11a · 2 = 22a is an equivalence relative to the set R of real numbers. But the
transformation

\[
\frac{a^2 - 16a + 64}{5a - 5} \cdot \frac{a - 1}{a^2 - 64}
= \frac{(a - 8)^2 (a - 1)}{5(a - 1)(a + 8)(a - 8)}
= \frac{a - 8}{5(a + 8)}
\]

is an equivalence only relative to sets of real numbers not containing the numbers ±8 or 1, because
these numbers do not belong to the domain of definition of the expressions occurring.


Equations with parameters. An equation with several variables, say 2a + b = 5 for a, b ∈ R,
can be interpreted in two ways: 

Firstly, the two variables can be regarded as of equal standing and one can ask for all pairs 
(a, b) of numbers satisfying the equation. Then (2, 1), (1/2, 4), (—5, 15) are three of the infinitely 
many solutions of the equation. 

Secondly, one can single out one of the variables and regard the others as auxiliary, as a parameter. 
Then one asks for the solutions of the equation in dependence on the parameter; a solution is an 
expression containing the parameter that satisfies the equation for every admissible value of the 
parameter. In the example above, if a is the variable and b the parameter, then a = (5 - b)/2,
and (5 - b)/2 is the expression for the solution of the given equation, for 2 · (5 - b)/2 + b = 5
is a true proposition for all b ∈ R. If b is the variable and a the parameter, then b = 5 - 2a, and
5 - 2a is the expression for the solution, because 2a + (5 - 2a) = 5 is true for all a ∈ R.




In an equation with several variables it must always be stated which are the true variables and 
which the parameters. For example, if in the equation 3x — 2y = 5a + 1 the true variables are 
x and y, while a is to be regarded as a parameter, one speaks of an equation in x and y. 

If in an equation with n variables there are no parameters and only true variables, then a solution
of the equation is an ordered n-tuple of numbers from the respective domains of variability. If there
are m true variables (0 < m < n) and the remaining ones are parameters, then a solution is an
ordered m-tuple of expressions in which, in general, the parameters enter.


Algebraic equations. In an algebraic equation the variables and the elements of the domain of vari-
ability are subject only to the so-called elementary algebraic or rational operations: addition, sub-
traction, multiplication, and division. Examples of algebraic equations in x are:
$x^3 - 5x^2 - 8x + 12 = 0$; $4(x + a)^2 (x - b) = c/x$. An equation such as 9x - 7 = 4(5x - 31) can
also be subsumed under the name of algebraic equation.

Of course, here both the coefficients and the solutions may be transcendental numbers, as in the
equation $\pi x^2 - 5 = 12$, which is algebraic in x. The equation $\sin^2 x - (1/2) \sin x - 1/2 = 0$ is
not algebraic in x, but with a suitable extension of the concept can be regarded as algebraic in sin x.


Algebraic equations

in one variable                                 in several variables
linear           non-linear                     linear              non-linear
a + 5 = 12       $x^3 = 27$                     x + y + z = …       $(x + 4)^2 + y^2 = 16$
3x - 4 = 27      $x^2 + 3x - 4 = 0$             …                   $x^2 + 2 = y$

General form of an algebraic equation with one variable. The fundamental domain for the variable x
is taken as large as possible, the set C of complex numbers.

$a_n x^n + a_{n-1} x^{n-1} + \dots + a_1 x + a_0 = 0$

The $a_i$, i = 0, 1, 2, ..., n, can be real or complex parameters; $a_0$ is called the absolute term.
The exponent of the highest occurring power of the variable is called the degree of the equation.
If $a_n \neq 0$, then the degree of the equation is n. If several variables occur in an equation,
then one forms for every term the sum of the exponents of the variables and calls their maximum
the degree of the equation. For example, the equation $(1/6)x^5 + 4x - 6 = 0$ is of degree 5, and
here $a_5 = 1/6$, $a_4 = a_3 = a_2 = 0$, $a_1 = 4$, $a_0 = -6$;
the equation $x^2 y - xy + 3x = 1$ is of degree 3.
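
The rule for the degree can be written down in a few lines. The sketch below is an illustration only: the representation of an equation as a dictionary from exponent tuples to coefficients, and the name `degree`, are assumptions made for this example and do not come from the text.

```python
from fractions import Fraction

def degree(terms):
    """terms maps exponent tuples to coefficients, e.g. {(2, 1): 1} for x^2 * y.

    Returns the largest sum of exponents among terms with nonzero coefficient.
    """
    return max(sum(exps) for exps, coeff in terms.items() if coeff != 0)

# (1/6)x^5 + 4x - 6 = 0, one variable x
p = {(5,): Fraction(1, 6), (1,): 4, (0,): -6}

# x^2*y - x*y + 3x - 1 = 0, variables (x, y)
q = {(2, 1): 1, (1, 1): -1, (1, 0): 3, (0, 0): -1}

print(degree(p))  # 5
print(degree(q))  # 3
```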


Normal form. An algebraic equation of degree n with one variable and with the highest coef-
ficient $a_n = 1$ is said to be monic or in normal form. This can be obtained from the general form
on division by $a_n \neq 0$.

Transcendental equations. All equations with variables that are not algebraic are called trans- 
cendental. Among them are exponential, logarithmic, and trigonometric equations. For their 
solution methods are required that transcend the means of algebra — quod algebrae vires transcendit, 
as EULER put it. Frequently graphical or approximation methods are used to solve them (see 
Chapter 10. — Trigonometric equations). 


Equivalent equations 


Two equations with variables are said to be equivalent if they have the same domains of definition 
and the same solution sets. Otherwise the equations are called inequivalent. 


Examples: 1. The equations 4a + 2 = 10 and 6x = 12 are equivalent relative to the set R 
of real numbers, because the solution set of each consists of the number 2 only. 

2. The equations $a^2 = 9$ and $x^3 = 27$ are inequivalent relative to the set Z of integers, because
the solution set of the first equation consists of ±3, that of the second of +3 only. However,
relative to the set N of natural numbers these equations are equivalent, for then the solution set
of each consists of the number 3 only.


Example 2 shows that the notion of ‘equivalent equations’ has a meaning only relative to given 
domains of variability or the resulting domains of definition, and the same is true of ‘consistent’, 
‘inconsistent’, and ‘universally valid’. Relative to equal domains of definition both universally valid 
equations and inconsistent equations are always equivalent. 


1. Reflexivity: Every equation is equivalent to itself;

2. Symmetry: If one equation is equivalent to another, then the second is equivalent to the first;

3. Transitivity: If one equation is equivalent to a second and the second to a third, then the first is
equivalent to the third.




Consequently, the equivalence of equations is an equivalence relation (see Chapter 14.). 


Equivalent transformations. In transformations of equations with variables one distinguishes 
between equivalent and inequivalent transformations. If an equation (1) is transformed so that the 
resulting equation (2) is equivalent to (1), then one says that (2) arises from (1) by an equivalent 
transformation. 

If $S_1$ and $S_2$ are the solution sets of the equations (1) and (2), then an equivalent transformation
is therefore characterized by the fact that $S_1 = S_2$. In all other cases the transformation is said to
be inequivalent. This is so, in particular, when $S_1 \subset S_2$, that is, when the transformation has led
to additional solutions, or when $S_1 \supset S_2$, that is, when solutions have got lost in the transformation.
In the case $S_1 \subset S_2$ those solutions of the equation (2) that are not solutions of the equation (1)
can be sorted out by a check in (1).
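
The three situations described above can be made tangible by comparing solution sets directly. The following sketch is an illustration under assumptions: the set Z is replaced by a finite window, and the helper names `solutions` and `classify` are invented for this example. The two transformations tried out are multiplication of x = 6 by (x + 2) and division of x^3 = x^2 + 12x by x.

```python
def solutions(equation, domain):
    """Set of all elements of the domain that satisfy the equation."""
    return {x for x in domain if equation(x)}

def classify(s1, s2):
    """Compare the solution sets before (s1) and after (s2) a transformation."""
    if s1 == s2:
        return "equivalent"
    if s1 < s2:   # proper subset: the transformation added solutions
        return "inequivalent: solutions gained"
    if s1 > s2:   # proper superset: the transformation lost solutions
        return "inequivalent: solutions lost"
    return "inequivalent"

Z_window = range(-100, 101)  # finite stand-in for Z (assumption)

# Multiplying x = 6 by the expression (x + 2)
s1 = solutions(lambda x: x == 6, Z_window)
s2 = solutions(lambda x: x * (x + 2) == 6 * (x + 2), Z_window)
print(classify(s1, s2))  # inequivalent: solutions gained

# Dividing x^3 = x^2 + 12x by x
t1 = solutions(lambda x: x**3 == x**2 + 12 * x, Z_window)
t2 = solutions(lambda x: x**2 == x + 12, Z_window)
print(classify(t1, t2))  # inequivalent: solutions lost
```

In the first case the additional solution -2 is removed again by a check in the original equation; in the second case no check in the transformed equation can bring back the lost solution 0.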


Examples: 1. The transition from (1) 4x = 20; x ∈ N, to (2) x = 5; x ∈ N, is an equivalent
transformation, because $S_1 = S_2 = \{5\}$.

2. The transformation of the equation (1) x = 6; x ∈ Z, into (2) x(x + 2) = 6(x + 2); x ∈ Z,
is inequivalent; in this case $S_1 = \{6\}$, $S_2 = \{-2; 6\}$; hence $S_1 \subset S_2$.

3. If one goes from (1) $x^3 = x^2 + 12x$; x ∈ Z, to (2) $x^2 = x + 12$; x ∈ Z, on division by x,
one has $S_1 = \{-3, 0, 4\}$, $S_2 = \{-3, 4\}$; hence $S_1 \supset S_2$, that is, the transformation is inequivalent.

Transformations leading to a loss of solutions can occur, for example, on dividing an equation
by an expression containing a variable or by extracting a root from the equation. If in the solution
of the equation one performs inequivalent transformations, then additional investigations are
required to determine the solutions that may have got lost or those that are not solutions of the 
original equation. Such complications can be avoided if only equivalent transformations are per- 
formed. Therefore it is very important to know what transformations of an equation are equivalent. 
The following theorems, in which the domain of definition is R, give some relevant indications. 


Proposition 1: An equation $E_1 = E_2$ is equivalent to an equation $E_3 = E_4$ if the expressions $E_1$
and $E_3$ as well as $E_2$ and $E_4$ are equivalent.


According to this proposition one may, in particular, contract terms, divide fractions by num- 
bers, or multiply brackets; for example, the equations 4x + 7 — 2x + 15 = 8x — 6x + 13 — 3x 
and 2x + 22 = -x + 13 are equivalent because $S_1 = S_2 = \{-3\}$.


Proposition 2: An equation $E_1 = E_2$ is equivalent to $E_2 = E_1$, that is, by interchanging the sides
an equation goes over into an equivalent one.

Proposition 3: If one adds to (or subtracts from) both sides of an equation $E_1 = E_2$ one and the same
expression $E_3$, defined for the whole domain of definition of $E_1 = E_2$, then the equation
$E_1 + E_3 = E_2 + E_3$ (or $E_1 - E_3 = E_2 - E_3$) is equivalent to the original equation.


Example: (1) 8x - 29 = 4x + 31                      | + (29 - 4x)      $E_1 = E_2$   | $+ E_3$
             8x - 29 + 29 - 4x = 4x + 31 + 29 - 4x                     $E_1 + E_3 = E_2 + E_3$
         (2) 4x = 60


According to Proposition 3 the equations 8x - 29 = 4x + 31 and 4x = 60 are equivalent. In
fact, $S_1 = S_2 = \{15\}$.


Example which shows the necessity of restricting $E_3$:

(1) x = 4                               | + 1/(x - 4)
(2) x + 1/(x - 4) = 4 + 1/(x - 4)

While the solution set of (1) is $S_1 = \{4\}$, that of (2) is $S_2 = \emptyset$ because 1/(x - 4) is not
defined for the number 4. Consequently, since $S_1 \neq S_2$, the equations (1) and (2) are inequivalent.


Proposition 4: If one multiplies (or divides) both sides of an equation $E_1 = E_2$ by one and the same
expression $E_3$, defined for the whole domain of definition of $E_1 = E_2$ and different from zero there,
then the equation $E_1 \cdot E_3 = E_2 \cdot E_3$ (or $E_1/E_3 = E_2/E_3$) is equivalent to the original equation.


Example: (1) 6a = -3          | : 6        $E_1 = E_2$   | $: E_3$
             6a/6 = -3/6                   $E_1/E_3 = E_2/E_3$
         (2) a = -1/2


The equations 6a = -3 and a = -1/2 are equivalent by Proposition 4. In fact, $S_1 = S_2 = \{-1/2\}$.
By way of contrast, the transition from

(1) $\frac{a}{a + 4} = \frac{-4}{a + 4}$     | · (a + 4)
to
(2) a = -4

is an inequivalent transformation, because the expression (a + 4) takes the value 0 for
a = -4. Since $S_1 = \emptyset$ and $S_2 = \{-4\}$, (1) and (2) are inequivalent equations.




The propositions above require proofs, which will, however, be omitted here. There is no such ge-
neral equivalence theorem for raising to a power or extracting roots, because these operations can
lead to inequivalent equations, as is shown by the following examples.

[examples illegible in the source]


Solving equations. To solve an equation means to give all solutions relative to given domains
of variability, in other words, to give the solution set whose elements can be numbers, number 
pairs, n-tuples of numbers, expressions with parameters, or n-tuples of such expressions. 

In particular cases the task of solving an equation can be accomplished by systematic trial and 
error, in the case of the simplest equations by reading off the solutions directly, and in general 
by working through a method of solution or a solution algorithm. Such solution methods and al- 
gorithms consist in most cases of step-by-step equivalent transformations of the given equation, 
until finally an equation arises whose solutions can be read off. 


Model solutions for a linear equation with one variable, by using the equivalence theorems. The 
aim of the transformations is to obtain an equation so simple that its solutions can be read off 
directly. 


7x - 2 - 5x = -4x + 3 + 3x - 8
2x - 2 = -x - 5                      | + 2 + x      (by Proposition 1)
2x - 2 + 2 + x = -x - 5 + 2 + x                     (by Proposition 3)
3x = -3                              | : 3          (by Proposition 1)
3x/3 = -3/3                                         (by Proposition 4)
x = -1                                              (by Proposition 1)

This is a chain of equivalent equations. Since equivalence is transitive, the last equation x = -1
is equivalent to the original equation. The number -1 evidently is the only solution of the
equation x = -1 and so the only solution of the original equation. Every equation in the chain
has the solution set S = {-1}.


Check. When an equation with variables has been solved, it is necessary to check whether the 
solution set has been found correctly. If all the transformations are equivalent, the check has the 
purpose of spotting calculating errors and of verifying that the solutions belong to the domain of 
definition; if also inequivalent transformations have been used, then the check indicates whether 
additional solutions have arisen; it cannot tell whether solutions of the original equation have 
got lost. 

The check has to be carried out in the original equation. In the first part of the check one replaces 
all variables by the numbers that have been found for them; for example, in the model above: 

7 · (-1) - 2 - 5 · (-1) = -4 · (-1) + 3 + 3 · (-1) - 8
-7 - 2 + 5 = 4 + 3 - 3 - 8
-4 = -4.

This is a true statement and confirms that the calculations have been correct. In the second part
of the check it has to be verified that the numbers found belong to the domain of definition, in the
present case to R. Since -1 ∈ R is a true statement, the solution set is, in fact, S = {-1}.

If another domain of variability is assumed, for example, x ∈ N, then the first part of the check
proceeds as above, but in the second part one obtains -1 ∉ N, so that the solution set is S = ∅.
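
The two parts of the check can be sketched as a small routine. Everything named here (`check`, `lhs`, `rhs`, `is_real`, `is_natural`) is a hypothetical helper introduced for illustration, and N is assumed to contain 0, which the text leaves open.

```python
def check(candidate, lhs, rhs, in_domain):
    """Two-part check: substitute into the original equation, then test the domain."""
    satisfies = lhs(candidate) == rhs(candidate)   # first part
    return satisfies and in_domain(candidate)      # second part

# Original equation of the model solution: 7x - 2 - 5x = -4x + 3 + 3x - 8
lhs = lambda x: 7 * x - 2 - 5 * x
rhs = lambda x: -4 * x + 3 + 3 * x - 8

is_real = lambda x: True                        # every value tested here is a real number
is_natural = lambda x: x == int(x) and x >= 0   # N taken to include 0 (assumption)

print(check(-1, lhs, rhs, is_real))     # True:  S = {-1} for x in R
print(check(-1, lhs, rhs, is_natural))  # False: S is empty for x in N
```

As stated above, such a check can expose additional solutions and calculating errors, but it cannot recover solutions lost by an inequivalent transformation.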


Solution of problems 


The problems in question concern either a mathematical situation expressed in natural language, 
or a practical situation from one of the domains of application, for example, the natural sciences, 
technology, or economics. In both cases the task is to translate the text into the formalized language 
of mathematics. This can result in equations with variables; for example, the text: ‘If 7 is added 
to three times a natural number, the result is the same as subtracting this number from 13’ leads 
to the equation: 3x + 7 = 13 - x; x ∈ N, by introducing the variable x in place of the required
number. 

Usually the ‘translation’ of a problem leads first to an equation between quantities and variables 
for quantities, and then from this to an equation with numbers and variables for numbers. 


Example: A fir tree of height 9 yards breaks off 4 yd. above ground. How far from the foot of
the tree does the tip of the tree hit the ground? For the solution of such verbal problems the
following scheme is recommended:




1. Fixing the variable, if possible by means of a sketch (Fig.).
The tip of the tree hits the ground x yd. from the foot of the tree.

2. Setting up the equation(s) and determining the domains of variability.
Equation (4 yd.)² + (x yd.)² = (5 yd.)² with quantities, or with
numbers and variables for numbers $4^2 + x^2 = 5^2$ with x ∈ R and x > 0.

3. Solving the equation(s).
$x^2 = 25 - 16 = 9$, hence $x_1 = 3$ and $x_2 = -3$.

4. Check with reference to the meaning of the text.
From the text or the resulting domain of variability it is evident that only
$x_1 = 3$ can be regarded as, and in fact is, the solution of the problem
expressed in words.

5. Answer.
The tip of the tree hits the ground 3 yards from the foot of the tree.

4.1-3 Broken fir tree


4.2. Linear equations 


In a linear equation or equation of the first degree, all the variables occur only to the first power; 
for example, 5x - 2 = 8, 3a + 2b = 4, 4u + 5v + 3w - 1 = 0 are linear equations with one,
two, and three variables, respectively. The equation (x + 4)(x + 3) = (x + 1)(x + 7) for x ∈ R,
although not linear, is equivalent to a linear equation, because it can be brought to the form
x - 5 = 0 for x ∈ R by multiplying out and rearranging. However, the equation (x + 4)(x + 3)
= 6 for x ∈ R is equivalent to the non-linear equation $x^2 + 7x + 6 = 0$; x ∈ R. Also fractional
equations and equations in root form can be equivalent to linear equations. 


Linear equations with one variable 


In the general form ax + b = 0, x is the variable, and a and b are real parameters. One calls ax the
linear term and b the absolute term. The case a = 0 is included in the discussion, though in the
following this case will be excluded because an equation ax + b = 0 in which a = 0 is not, strictly
speaking, linear. By equivalent transformations every linear equation with one variable can be
brought to this general form. In the solution of the equation ax + b = 0 three cases have to be
distinguished.


Case distinction          Number of solutions       Solution set
a ≠ 0                     exactly one               S = {-b/a}
a = 0 and b ≠ 0           none                      S = ∅
a = 0 and b = 0           infinitely many           S = R

When the domain of variability is altered, it is, of course, quite possible that x = -b/a is not a
solution or is a universally valid solution. For example, the equation 5x + 10 = 0; x ∈ R, has the
solution set S = {-2}; but if the domain of variability is N, then the solution set is empty: S = ∅,
because -2 ∉ N. Finally, if the domain of variability is {-2}, then the only element of the domain
of variability is also the only solution of the equation, which is therefore universally valid.
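
The case distinction for ax + b = 0 translates directly into a short routine. This is a sketch under assumptions: integer coefficients are assumed so that exact fractions can be used, and the function name, the string "all" for a universally valid equation, and the `in_domain` predicate are conventions invented here.

```python
from fractions import Fraction

def solve_linear(a, b, in_domain=lambda x: True):
    """Solve a*x + b = 0 for integer a, b.

    Returns "all" if every element of the domain is a solution,
    otherwise the (possibly empty) set of solutions.
    """
    if a == 0:
        return "all" if b == 0 else set()
    x = Fraction(-b, a)
    return {x} if in_domain(x) else set()

def is_natural(x):
    # N is taken to include 0 here (assumption)
    return x.denominator == 1 and x >= 0

print(solve_linear(5, 10))              # {Fraction(-2, 1)}
print(solve_linear(5, 10, is_natural))  # set()
print(solve_linear(0, 0))               # all
print(solve_linear(0, 3))               # set()
```

The sketch does not report the special situation in which the domain consists of the single solution only; detecting that would require the domain as an explicit set rather than as a predicate.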


Example 1: Linear equations without a parameter.
4a/3 + 1/2 - a = -3/2 + 2a/3 + 5/2; a ∈ Q
a/3 + 1/2 = 2a/3 + 1          | - 1/2 - 2a/3
-a/3 = 1/2                    | : (-1/3)
a = -3/2

Check: 1. (4/3) · (-3/2) + 1/2 - (-3/2) = -3/2 + (2/3) · (-3/2) + 5/2
          -2 + 1/2 + 3/2 = -3/2 - 1 + 5/2
          0 = 0 true
       2. -3/2 ∈ Q true
The solution set is therefore: S = {-3/2}.


Example 2: Equation with the variable x ∈ R and the parameters a, b ∈ R that leads to a linear
equation in x.

$(x + a)^2 - (x - b)^2 = 2a(a + b)$
$x^2 + 2ax + a^2 - x^2 + 2bx - b^2 = 2a^2 + 2ab$     | $- a^2 + b^2$
$2ax + 2bx = a^2 + 2ab + b^2$
$2x(a + b) = (a + b)^2$                               | : 2(a + b)

Here a case distinction is necessary:

First case: If a + b ≠ 0, division leads to x = (a + b)/2, so that S = {(a + b)/2}.

Check: 1. $[(a + b)/2 + a]^2 - [(a + b)/2 - b]^2 = 2a(a + b)$,
          $[(3a + b)/2]^2 - [(a - b)/2]^2 = 2a^2 + 2ab$,
          $[9a^2 + 6ab + b^2 - a^2 + 2ab - b^2]/4 = 2a^2 + 2ab$,
          $2a^2 + 2ab = 2a^2 + 2ab$.
       This is a true statement for all real numbers a and b.
       2. (a + b)/2 ∈ R, because a, b ∈ R.

Second case: If a + b = 0, that is, b = -a, the given equation is $(x + a)^2 - (x + a)^2 = 2a(a - a)$.
It is equivalent to 0 · x = 0 and has the solution set S = R.

Check: $(x + a)^2 - (x + a)^2 = 0$ is true for every real number x and every parameter a ∈ R.


In fractional equations at least one of the variables occurs at least once in the denominator of a 
fraction. 

Example 3: $\frac{2}{x - 2} + \frac{3}{x + 2} = \frac{5}{x}$; x ∈ R.
For all real numbers x ≠ ±2 and x ≠ 0 multiplication by the least common denominator
x(x - 2)(x + 2) is an equivalent transformation and leads to a linear equation.

$2x(x + 2) + 3x(x - 2) = 5(x - 2)(x + 2)$
$2x^2 + 4x + 3x^2 - 6x = 5x^2 - 20$        | $- 5x^2$
-2x = -20                                  | : (-2)
x = 10

The check has to be made in the original equation.

1. 2/(10 - 2) + 3/(10 + 2) = 5/10
   2/8 + 3/12 = 1/2
   1/2 = 1/2 true
2. It is true that 10 ∈ R and 10 ≠ ±2, 10 ≠ 0.

The solution set is therefore S = {10}.
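
The check of a fractional equation, including the excluded values of the domain of definition, can be sketched as follows. The equation used is 2/(x - 2) + 3/(x + 2) = 5/x, which multiplies out to the linear equation shown above; the names `EXCLUDED` and `satisfies_original` are assumptions made for this illustration.

```python
from fractions import Fraction

# Zeros of the denominators x - 2, x + 2 and x; they lie outside the domain of definition.
EXCLUDED = {Fraction(-2), Fraction(0), Fraction(2)}

def satisfies_original(x):
    """Check 2/(x - 2) + 3/(x + 2) = 5/x at a rational point x."""
    if x in EXCLUDED:
        return False  # the equation is not defined here
    return Fraction(2) / (x - 2) + Fraction(3) / (x + 2) == Fraction(5) / x

# Multiplying by x(x - 2)(x + 2) reduces the equation to -2x = -20.
candidate = Fraction(-20, -2)

print(candidate, satisfies_original(candidate))  # 10 True
print(satisfies_original(Fraction(2)))           # False
```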
Example 4: The fractional equation contains the parameter a:

$\frac{x + 2a}{2a - x} + \frac{x - 2a}{2a + x} = \frac{4a^2}{4a^2 - x^2}$
$(x + 2a)(x + 2a) - (2a - x)(2a - x) = 4a^2$
$x^2 + 4ax + 4a^2 - 4a^2 + 4ax - x^2 = 4a^2$
$8ax = 4a^2$

First case: a ≠ 0
x = a/2
S = {a/2}
The check shows that this is correct.

Second case: a = 0
$x/(-x) + x/x = 0/(-x^2)$          | $\cdot (-x^2)$, x ≠ 0
$x^2 - x^2 = 0$
$0 \cdot x^2 = 0$
All real numbers other than 0 are solutions of this equation.


In equations with roots at least one of the variables occurs at least once in the radicand of a root. 
In the simplest cases exponentiation eliminates the roots, but it has to be observed that this may 
be an inequivalent transformation leading to additional solutions. 


Example 5: $\sqrt[3]{x + 2} = 3$ is a root equation equivalent to a linear equation.

$\sqrt[3]{x + 2} = 3$         | third power
x + 2 = 27                    | - 2
x = 25

Check: 1. $\sqrt[3]{25 + 2} = 3$
          3 = 3 true
       2. 25 ∈ R true

S = {25}




Example 6: If several square roots occur, one of these can be isolated before exponentiation. 


$14 = \sqrt{x - 4} + \sqrt{x + 24}$
$\sqrt{x - 4} = 14 - \sqrt{x + 24}$                | squaring
$x - 4 = 196 - 28\sqrt{x + 24} + x + 24$
$28\sqrt{x + 24} = 224$
$\sqrt{x + 24} = 8$                                | squaring
x + 24 = 64
x = 40

Check: 1. $14 = \sqrt{40 - 4} + \sqrt{40 + 24}$
          14 = 6 + 8
          14 = 14 true
       2. 40 ∈ R true

S = {40}.
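
Because squaring may be an inequivalent transformation, the candidate must be tested in the original root equation. The sketch below does this numerically; the `sign` parameter is a hypothetical addition that is not in the text. It produces the variant equation 14 = -sqrt(x - 4) + sqrt(x + 24), for which the same squaring steps lead to the same candidate 40, which then turns out to be an additional solution.

```python
import math

def satisfies(x, sign=1):
    """Check 14 = sign * sqrt(x - 4) + sqrt(x + 24) for a real candidate x."""
    if x - 4 < 0 or x + 24 < 0:
        return False  # a radicand would be negative
    return math.isclose(sign * math.sqrt(x - 4) + math.sqrt(x + 24), 14)

print(satisfies(40))           # True:  6 + 8 = 14
print(satisfies(40, sign=-1))  # False: -6 + 8 = 2
```

Floating-point square roots are compared with `math.isclose`, since exact equality cannot be relied on in general.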


Fractional and root equations reducing to quadratic equations can be found in the appropriate 
sections. The following examples of applied problems, which in every case lead to linear equations 
with one variable, are to be regarded as models of frequently occurring types. 


Example 7: Mixing problem. In a Siemens-Martin furnace 20 t steel of 0.5% carbon content
are melted together with 5 t pig iron of 5% carbon content. What is the percentage of carbon
in the mixture?

Let x % be the carbon content of the mixture, that is, 25 t of mixture contain 25 · x/100 t carbon.
The 20 t steel contain 20 · 0.5/100 t, and the 5 t pig iron 5 · 5/100 t carbon. Since the sum of the
carbon content of the parts must be equal to the total carbon content, one obtains the equation
20 · 0.5/100 + 5 · 5/100 = 25 · x/100, which shows that the carbon content of the mixture is
1.4%.


Example 8: Distribution problem. 3 bulldozers together move daily 31000 m³ earth. The
second bulldozer moves 1000 m³ more than the third, and the first 4000 m³ less than twice
the amount of the second. What is the amount of earth moved daily by each of the 3 bull-
dozers? If the third bulldozer moves x m³ earth, then the second moves (x + 1000) m³,
and the first moves [2(x + 1000) - 4000] m³ earth. The three bulldozers together move
31000 m³ = {x + (x + 1000) + [2(x + 1000) - 4000]} m³. The calculation leads to S = {8000};
this means that the third bulldozer moves 8000 m³, the second 9000 m³, and the first 14000 m³,
and all three together 31000 m³ as required.

Example 9: Simple problem of motion. A train of length 250 yd. passes through a tunnel of
length 200 yd. at a speed of 50 miles per hour. How long does it take to pass through the tunnel?
Let x seconds be the time between the entrance of the locomotive into the tunnel and the exit
of the last carriage. During this time the last carriage passes $\frac{50 \cdot 1760}{60^2} \cdot x$ yd., but this is
the length of the tunnel plus the length of the train:

$200 + 250 = \frac{50 \cdot 1760}{60^2} \cdot x$;   S = {18.4} (rounded to one decimal place).

The passage takes about 18.4 seconds.

Example 10: More complicated problem of motion. A barge going downstream reaches its destina-
tion in two hours. Going upstream for the same distance with the same machine power, it needs
three hours. Its velocity in still water is 250 yd./min. What is the velocity of the moving water?
Let x yd./min be the velocity of the streaming water; then the barge has the downstream
velocity of (250 + x) yd./min and takes 120 min; but upstream at the velocity of (250 - x) yd./min
it needs 180 min for the same distance. Therefore the equation is: (250 + x) · 120 = (250 - x) · 180.
The velocity of the water is 50 yards per minute.
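
The four applied problems above all reduce to a linear equation of the form p·x + q = r·x + s. As a rough cross-check of the stated results, the following sketch solves each of them with exact fractions; the helper `solve` and the way each problem is brought to this form are assumptions of this illustration, not a method prescribed by the text.

```python
from fractions import Fraction

def solve(p, q, r, s):
    """Solve p*x + q = r*x + s for x, assuming p != r."""
    return Fraction(s - q) / Fraction(p - r)

# Example 7: 25 * x/100 = 20 * 0.5/100 + 5 * 5/100
carbon = solve(Fraction(25, 100), 0, 0,
               Fraction(20) * Fraction(1, 2) / 100 + Fraction(5 * 5, 100))

# Example 8: x + (x + 1000) + [2(x + 1000) - 4000] = 31000, i.e. 4x - 1000 = 31000
third = solve(4, -1000, 0, 31000)

# Example 9: (50 * 1760 / 60^2) * x = 200 + 250
seconds = solve(Fraction(50 * 1760, 60**2), 0, 0, 200 + 250)

# Example 10: (250 + x) * 120 = (250 - x) * 180
current = solve(120, 250 * 120, -180, 250 * 180)

print(float(carbon))             # 1.4
print(third)                     # 8000
print(round(float(seconds), 1))  # 18.4
print(current)                   # 50
```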


Linear equations with two variables 
The solution set of a linear equation with two variables, for example, 4x + 3y - 10 = 0 with
x ∈ R, y ∈ R, consists of all ordered pairs of real numbers (x, y) which on substitution make the
equation into a true statement; for example, (1, 2) is a solution, because 4 · 1 + 3 · 2 - 10 = 0
is a true statement. (0, 10/3) and (3, -2/3) are likewise solutions. If one postulates that x and y
are to be natural numbers, then solutions are pairs of natural numbers satisfying the equation;
other domains of variability for x and y can also be prescribed.


Systems of two linear equations. Linear algebra provides methods of solving m linear equations 
with n variables. If it is required to satisfy simultaneously m equations with n variables, one speaks 
of a system of m equations with n variables. Every solution of such a system is an ordered n-tuple
of numbers. Here only the case m = n = 2 will be treated in detail (for arbitrary m and n see 
Chapter 17.). Every solution of such a system of equations is an ordered pair (x, y) of numbers. 




Solving a system of two linear equations. To solve a system of two linear equations with two
variables means to determine all ordered pairs (x, y) satisfying both the first and the second equation;
in other words, one has to determine the intersection S of the solution sets $S_1$ and $S_2$ of the two
equations. Here the only possible cases are the following three:


1. $S = S_1 \cap S_2 = \{(a, b)\}$; the system of equations has a unique solution;

2. $S = S_1 \cap S_2 = \emptyset$; the equations of the system are inconsistent, incompatible, or contradictory;

3. $S = S_1 \cap S_2 = S_1$ or $S_2$; the system of equations is not uniquely soluble; it then has
infinitely many solutions.


This last case occurs if and only if the two equations are linearly dependent, that is, if one equation
is a real multiple of the other equation. For the numerical solution of systems of two linear equations
with two variables the elementary methods available include the methods of substitution, of
equating, and of adding. They all aim at eliminating one of the variables, so that only linear
equations with one variable remain to be solved. In the substitution method one
equation is solved with respect to one of the variables, and the expression obtained is substituted
in the other equation.


Example 1: (1) x + y = -3
           (2) -2x + y = 6, hence y = 2x + 6

Substituting in (1):   x + (2x + 6) = -3        (y is eliminated)
                       3x + 9 = 0
                       x = -3
From (2):              y = 2 · (-3) + 6
                       y = 0                    (y is calculated)

The check must be carried out for both initial equations:
1. (1) -3 + 0 = -3 true            2. -3 ∈ R true and 0 ∈ R true
   (2) 0 - 2 · (-3) = 6 true
(-3, 0) is the only solution of this system of equations. The solution set is S = {(-3, 0)}.

In the method of equating the two equations are solved with respect to the same variable, and the 
expressions so obtained are equated; hence the method is based on the transitivity of the equivalence 
of expressions. 


Example 2: (1) x - 2y = 4,      hence x = 4 + 2y
           (2) 2x + 5y = 35,    hence x = (35 - 5y)/2

Equating:  4 + 2y = (35 - 5y)/2      (here x is eliminated)
           y = 3
From (1):  x - 2 · 3 = 4
           x = 10
The solution set is S = {(10, 3)}.


The addition method. By multiplying each equation by a suitable number it can always be achieved 
that the coefficients of one of the two variables in the two equations are opposite numbers. By 
adding the two equations one of the variables is eliminated. 


Example 3:
(1) 12x − 8y = 4 | · 3, giving (1′) 36x − 24y = 12
(2) 18x − 15y = 3 | · (−2), giving (2′) −36x + 30y = −6
Adding (1′) and (2′): 0 · x + 6y = 6, hence y = 1.
From (1): 12x − 8 · 1 = 4, hence x = 1.
The solution set is S = {(1, 1)}.
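The addition method can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `addition_method` and the tuple layout `(a, b, c)` for an equation ax + by = c are choices made here, not notation from the text, and the sketch assumes that the system has exactly one solution.

```python
from fractions import Fraction


def addition_method(eq1, eq2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by the addition method.

    Each equation is a tuple (a, b, c). Exact rational arithmetic is used.
    Raises ValueError if the system has no unique solution.
    """
    a1, b1, c1 = (Fraction(v) for v in eq1)
    a2, b2, c2 = (Fraction(v) for v in eq2)
    # Multiply (1) by a2 and (2) by -a1: the x-coefficients become opposite
    # numbers, so adding the two equations eliminates x.
    y_coeff = a2 * b1 - a1 * b2
    rhs = a2 * c1 - a1 * c2
    if y_coeff == 0:
        raise ValueError("the system has no unique solution")
    y = rhs / y_coeff
    # Back-substitute into an equation that really contains x.
    if a1 != 0:
        x = (c1 - b1 * y) / a1
    else:
        x = (c2 - b2 * y) / a2
    return x, y
```

Applied to Example 3, `addition_method((12, -8, 4), (18, -15, 3))` reproduces the pair (1, 1). The multipliers a2 and −a1 are one possible choice of "suitable numbers"; the hand calculation above uses the smaller multipliers 3 and −2.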


It is a matter of experience to recognize which of the methods is the most suitable in an individual 
case. 

To obtain a survey of the solutions of a system of two linear equations with two variables x and y,

(1) a₁x + b₁y = c₁
(2) a₂x + b₂y = c₂,  x ∈ R, y ∈ R,

one distinguishes between two principal cases:

I. For a₁ = a₂ = b₁ = b₂ = 0, in the case c₁ = c₂ = 0 an arbitrary pair of real numbers is a solution of the system. If, however, one of the numbers
c₁ or c₂ is different from zero, then there is no solution.

II. If at least one of the coefficients a₁, a₂, b₁, b₂ is different from zero, then there are three
possible cases.
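The survey of cases can be made concrete with a small classifier. The sketch below is an assumption-laden illustration: it uses a determinant test (Cramer's rule) instead of the elimination methods described in the text, and the name `classify_system` and the returned labels are invented for this example.

```python
from fractions import Fraction


def classify_system(a1, b1, c1, a2, b2, c2):
    """Classify a1*x + b1*y = c1, a2*x + b2*y = c2.

    Returns ('unique', (x, y)), ('none', None) or ('infinite', None).
    """
    a1, b1, c1, a2, b2, c2 = (Fraction(v) for v in (a1, b1, c1, a2, b2, c2))
    det = a1 * b2 - a2 * b1
    if det != 0:
        # Exactly one solution (Cramer's rule).
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        return "unique", (x, y)
    if a1 == a2 == b1 == b2 == 0:
        # Case I: all coefficients vanish.
        return ("infinite", None) if c1 == c2 == 0 else ("none", None)
    # Case II with proportional left-hand sides: the system is consistent
    # only if the right-hand sides are proportional in the same way.
    consistent = (a1 * c2 - a2 * c1 == 0) and (b1 * c2 - b2 * c1 == 0)
    return ("infinite", None) if consistent else ("none", None)
```

For instance, `classify_system(1, 1, -3, -2, 1, 6)` reports the unique solution (−3, 0) of Example 1, while the coefficient sets 3v − 2n = 3, 6v − 4n = 6 and 4a + 3b = 7, 4a + 3b = 8 are reported as having infinitely many and no solutions, respectively.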


90 4. Algebraic equations 


[Table: equations, solution set, and graphical illustration for each of the three cases]

Example 4:
(1) 4y(10x − 3) − 5x(8y + 7) + 165 = 0
(2) 9x(4y − 7) + 3y(5 − 12x) = −114

Multiplying out and ordering:
(1) −35x − 12y = −165 | · 5
(2) −63x + 15y = −114 | · 4

Addition method:
(1) −175x − 60y = −825
(2) −252x + 60y = −456
Adding: −427x = −1281 | : (−427), hence x = 3.
From (1): −35 · 3 − 12y = −165, hence y = 5.
The check confirms that the calculation is correct. Here S = {(3, 5)}.


Example 5: The equations are given in fractional form. It is to be observed that the denominators
must be different from zero. The solution set is S = {(3, 2)}.

(1) (x + y + 1)/(x + y − 1) = …, leading to (1′) x + y = 5
(2) …

Example 6: The variables are x and y, where a and b are real parameters. The
solution set is S = {(a + b, a − b)}.

(1) x + y = 2a
(2) x − y = 2b
Adding gives 2x = 2a + 2b, subtracting gives 2y = 2a − 2b; hence x = a + b, y = a − b.

Example 7: The second equation is a multiple of the first; then every ordered pair satisfying the first equation satisfies
also the second. There are infinitely many solutions.
(1) v − 2n/3 = 1, giving (1′) 3v − 2n = 3
(2) 6v − 6 = 4n, giving (2′) 6v − 4n = 6

Example 8: The equations are inconsistent. There is no solution. S = ∅.
(1) 4a + 3b = 7, giving (1′) 4a + 3b = 7
(2) 4(a − 2) = −3b, giving (2′) 4a + 3b = 8


Applied problems leading to a system of linear equations. 


Example 9: Distribution problem. A water container can be filled from a hot and from a cold
tap. If the hot tap is left on for three minutes and the cold tap for one minute, then 50 quarts have
flowed in. But if the hot tap is left on for one minute and the cold tap for two minutes, then 40
quarts have flowed in. How many quarts of water flow through each tap in one minute? To introduce
the variables one assumes that the hot tap yields x quarts/min and the cold tap y
quarts/min; so one obtains the system of equations

(1) 3x + y = 50
(2) x + 2y = 40

and then the solution: the hot tap yields 12 quarts per minute, and the cold tap
14 quarts per minute.


4.2. Linear equations 91 


Example 10: Mixing problem. To prevent the water in the cylinder block and cooling system
of a car from freezing one adds at the beginning of winter an antifreeze of density 1.135
to the water of density 1 in the radiator. If the mixture has a density of 1.027, one obtains frost
protection up to −10°C (14°F). How many quarts of antifreeze and how many quarts of water
are to be mixed in order to obtain 100 quarts of the mixture?

To introduce the variables one assumes that the amount of antifreeze is x quarts and that of water
is y quarts. Then one obtains the system of equations

(1) x + y = 100
(2) 1.135x + y = 1.027 · 100

and from it the solution: one mixes 20 quarts of antifreeze with 80 quarts of
water to obtain the desired mixture.
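The mixing problem reduces to the system x + y = total, d_a · x + d_b · y = d_mix · total, which can be solved once and for all. The following sketch assumes, as the example does, that volumes simply add up; the function name `mixing_amounts` and its parameter names are hypothetical.

```python
from fractions import Fraction


def mixing_amounts(total, density_a, density_b, density_mix):
    """Return the amounts (x, y) of components a and b such that
    x + y = total and density_a*x + density_b*y = density_mix*total.

    Densities may be given as strings such as "1.135" to keep them exact.
    """
    da = Fraction(density_a)
    db = Fraction(density_b)
    dm = Fraction(density_mix)
    if da == db:
        raise ValueError("the two densities must differ")
    # Substitute y = total - x into the second equation and solve for x.
    x = Fraction(total) * (dm - db) / (da - db)
    return x, Fraction(total) - x
```

With the data of Example 10, `mixing_amounts(100, "1.135", "1", "1.027")` yields 20 quarts of antifreeze and 80 quarts of water.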


Graphical solution of linear equations and systems of equations 


In solving equations graphically one sets up a one-to-one correspondence between the solutions
and certain sets of points. By representing these point sets in a coordinate system one obtains
approximate solutions for the equations. The coordinate system to be used here is rectangular
Cartesian.


Graphical solution of one linear equation with one variable. To solve the equation ax + b = 0,
a ≠ 0, graphically one goes over to the function represented by the equation y = ax + b, a ≠ 0.
Its graph is a straight line (see Chapter 5.). The zero of the function, that is, the abscissa of the
point of intersection of the line with the x-axis, is the solution of the equation.


Graphical solution of systems of two linear equations with two variables. The solution sets of the
two equations are represented graphically, and their intersection is determined. For this purpose
one interprets the given equations as functional equations and draws the graphs of these functions.
In general they are straight lines. The coordinates of all points of the first line, and only they, satisfy
the first equation, and those of the second line, and only they, the second equation. From the drawing
one determines all points lying both on the first and second line, that is, in their intersection. To
the coordinates of each of these points there corresponds one-to-one a solution of the system of
equations. Depending on the relative position of the lines one obtains exactly one, none, or infinitely
many common points, that is, the system of equations is uniquely soluble, insoluble, or
not uniquely soluble.


4.2-1 Graphical solution of the equation 2x − 6 = 0
4.2-2 Graphical solution of the system of equations 4x − y = 2, x − 2y = −3
4.2-3 Systems of linear equations: a) with no solutions, b) with infinitely many solutions

Example 2: To solve the system of equations

(1) 4x − y = 2
(2) x − 2y = −3

one represents the functions with these equations graphically. The point of
intersection P = (1, 2) yields (1, 2) as the only solution of the system
of equations (Fig.).

Example 3: In the graphical solution of the system of equations … one is led to two coincident lines.
Consequently, the coordinates of every point of this line are a solution of the system (Fig.).

The graphical representation of the two equations … gives
two parallel lines, that is, no point of intersection; the system has no solution, S = ∅.




4.3. Quadratic equations 


In a quadratic equation or equation of the second degree with one variable this variable occurs
at least once to the second power and not at all to a higher power; for example, 2x² + 5x = 16 − x
is a quadratic equation in x and a² = a²/2 + 6 is a quadratic equation in a. The fractional equation
3/(u − 2) + 8/(u + 3) = 2, after multiplying both sides by the least common denominator
(u − 2)(u + 3) and rearranging, leads to a quadratic equation, namely 2u² − 9u − 5 = 0. If
several variables occur in an equation and if the sum of the exponents for at least one term is two,
but never higher, the equation is also called quadratic; for example, x² + y² = 4 and x · y = const
are quadratic equations with two variables.

The following treatment concerns quadratic equations with one variable. Their general form is

Ax² + Bx + C = 0;  A, B, C ∈ R, A ≠ 0.

Here x is the variable, and A, B, C are real parameters. The term Ax² is called the quadratic term, Bx
the linear term, and C the absolute term. A ≠ 0 is necessary, because otherwise the equation is linear.

If one divides the general form Ax² + Bx + C = 0 on both sides by A ≠ 0 one obtains the
equivalent equation x² + (B/A) x + C/A = 0. The abbreviation B/A = p and C/A = q leads to
the normal form

x² + px + q = 0.

It is characterized by the fact that the coefficient of the quadratic term is +1. If all the terms really
occur, that is, p ≠ 0 and q ≠ 0, one speaks of the mixed quadratic equation in normal form.

Special cases:
I. p = 0; q = 0: x² = 0, pure quadratic equation without absolute term
II. p = 0; q ≠ 0: x² + q = 0, pure quadratic equation
III. p ≠ 0; q = 0: x² + px = 0, mixed quadratic equation without absolute term


Numerical solution of quadratic equations 

I. Solution of a pure quadratic equation without absolute term. The equation x² = 0 or x · x = 0
can only have the solution x₁ = x₂ = 0, that is, the solution set S = {0}, since for x ≠ 0 also
x² > 0, and vice versa.

II. Solution of a pure quadratic equation. For q > 0 the equation x² + q = 0 with q ∈ R cannot
have a solution with x ∈ R, because in this case the expression on the left-hand side always satisfies
x² + q > 0.

For q < 0, hence (−q) > 0, the expression x² + q can be written by means of the binomial
formula a² − b² = (a − b)(a + b) as a product of two linear expressions in x:
x² + q = x² − (√(−q))² = (x − √(−q))(x + √(−q)). Consequently, the equation (x − √(−q))(x + √(−q)) = 0
is equivalent to the given one. Since the product E₁ · E₂ of two expressions E₁ and E₂ is zero if
and only if E₁ = 0 or E₂ = 0, it follows from the equivalent equation that x − √(−q) = 0 or
x + √(−q) = 0. So the solution of the pure quadratic equation is reduced to that of two linear
equations. From the first equation one obtains x₁ = √(−q), and from the second x₂ = −√(−q).
Hence the given equation has two solutions x₁ and x₂; this is expressed in the combined solution
formula x₁,₂ = ±√(−q). The solution set S is the union of the solution sets of the two linear equations,
that is, S = {√(−q), −√(−q)}.

Check. 1. (±√(−q))² + q = 0; −q + q = 0; 0 = 0 true.
2. ±√(−q) is real provided that q < 0.

But if one chooses for the domain of variability the set C of complex numbers, then for q > 0
there exist two imaginary solutions which differ by sign only and which can be obtained formally
in the same way, by splitting the expression x² + q into (x + √(−q))(x − √(−q)).




Examples:
1. x² − 4 = 0; x ∈ R
x₁,₂ = ±√4
x₁,₂ = ±2; S = {−2, +2}.
Check: 1. (±2)² − 4 = 0; 4 − 4 = 0 true.
2. +2 ∈ R true; −2 ∈ R true.

2. x² + 144 = 0
x₁,₂ = ±√(−144)
For x ∈ R there is no solution, because √(−144) ∉ R; here S = ∅.
For x ∈ C one has x₁,₂ = ±12i and S = {−12i, +12i}.
Check: 1. (±12i)² + 144 = 0; −144 + 144 = 0 true.
2. ±12i ∈ C true.

III. Solution of a mixed quadratic equation without absolute term. Taking x outside a bracket
transforms the equation x² + px = 0 into the equivalent equation x(x + p) = 0. From this it
follows that x = 0 or x + p = 0. The first of these two linear equations has the only solution
x₁ = 0, the second x₂ = −p. Hence the mixed quadratic equation without absolute term always
has two real solutions of which one is zero; the solution set is S = {0, −p}.

Check: 0² + p · 0 = 0 true for all p ∈ R;
(−p)² + p(−p) = 0 true for all p ∈ R.

Example: 7x² − 2x = 0
x(7x − 2) = 0
x(x − 2/7) = 0
S = {0, 2/7}
Check:
for x₁: 7 · 0² − 2 · 0 = 0 true and 0 ∈ R true;
for x₂: 7 · (2/7)² − 2 · 2/7 = 0; 4/7 − 4/7 = 0 true and 2/7 ∈ R true.


IV. Solution of a mixed quadratic equation x² + px + q = 0. The idea of the solution is to make
the expression x² + px into a perfect square by adding a suitable term, and so to reduce the equation
to a pure quadratic one; here the quadratic supplement is the square (p/2)² of half the coefficient
of the linear term px in the normal form. To make an equivalent transformation of the given equation,
one has to add (p/2)² − (p/2)². For example, for the equation x² + 2x − 5 = 0 one has p = 2
and (p/2)² = 1. By addition of 1 − 1 this equation goes over into x² + 2x + 1 − 5 − 1 = 0 or
(x + 1)² − 6 = 0, that is, into a pure quadratic equation in (x + 1), and from its solutions those
of the given equation can be obtained. From (x + 1)₁,₂ = ±√6 it follows that x₁,₂ = −1 ± √6.
To achieve that in the following solution method the pure quadratic equation is always soluble,
the domain of variability is taken for the time being to be the set C of complex numbers.

Solution method: x² + px + q = 0 | + (p/2)² − (p/2)²
quadratic supplement: x² + px + (p/2)² − (p/2)² + q = 0,
pure quadratic equation: (x + p/2)² − [(p/2)² − q] = 0,
its solution: (x + p/2)₁,₂ = ±√[(p/2)² − q],
solution formula: x₁,₂ = −p/2 ± √[(p/2)² − q].

Check: 1. Substituting x₁,₂ into x² + px + q gives 0, true.
2. −p/2 ± √[(p/2)² − q] ∈ C true.

The solution formula also contains the solutions for the special cases, as one can verify by
substituting p = 0 or q = 0 or both. It is applicable whenever the equation is given in its normal form.
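The solution formula x₁,₂ = −p/2 ± √[(p/2)² − q] translates directly into code. The sketch below works over the complex numbers, as the derivation does; the function names `to_normal_form` and `solve_normal_form` are chosen for this illustration only, and floating-point arithmetic makes the results approximate.

```python
import cmath


def to_normal_form(A, B, C):
    """Divide A*x**2 + B*x + C = 0 by A (A != 0) and return (p, q)."""
    if A == 0:
        raise ValueError("A must be different from zero")
    return B / A, C / A


def solve_normal_form(p, q):
    """Return both solutions of x**2 + p*x + q = 0 over the complex numbers."""
    # cmath.sqrt also accepts a negative radicand and then returns an
    # imaginary number, so no case distinction is needed here.
    root = cmath.sqrt((p / 2) ** 2 - q)
    return -p / 2 + root, -p / 2 - root
```

For x² + 2x − 5 = 0 the call `solve_normal_form(2, -5)` returns values close to −1 + √6 and −1 − √6; for the pure quadratic equation x² + 144 = 0 the call `solve_normal_form(0, 144)` returns values close to 12i and −12i.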


Discriminant. Evidently the nature of the solutions of the quadratic equation is determined by the
radicand D = (p/2)² − q of the root in the solution formula. It is called the discriminant. If p and q
are real parameters and if one returns to the set R of real numbers as domain of variability, then
three cases are to be distinguished: I. D > 0 with two distinct solutions, II. D = 0 with two equal solutions, and
III. D < 0 with no real solution.

If one chooses as domain of variability the set C of complex numbers, then two conjugate complex
solutions occur in the case D < 0. Choosing as domain of variability a subset of the set of real
numbers, the solution set may be different.
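The three cases of the discriminant can be illustrated by a function that returns the real solution set directly. This is a sketch only: the name `real_solutions` is hypothetical, the discriminant is evaluated exactly with rational arithmetic so that the case D = 0 is recognized reliably, and the solutions themselves are returned as floating-point numbers.

```python
from fractions import Fraction
import math


def real_solutions(p, q):
    """Return the sorted list of distinct real solutions of x**2 + p*x + q = 0."""
    p, q = Fraction(p), Fraction(q)
    D = (p / 2) ** 2 - q  # discriminant, computed exactly
    if D < 0:
        return []  # case III: no real solution
    if D == 0:
        return [float(-p / 2)]  # case II: two coincident solutions
    root = math.sqrt(D)  # case I: two distinct solutions
    return [float(-p / 2) - root, float(-p / 2) + root]
```

A restriction of the domain of variability can then be expressed as a filter on the result; for example, keeping only positive whole numbers in `real_solutions(4, -5)` leaves the single value 1.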


Example 1:
x² + 4x − 5 = 0; x ∈ R
x₁,₂ = −2 ± √(2² + 5)
x₁,₂ = −2 ± √9
x₁ = 1
x₂ = −5
S = {−5, 1}
Check:
for x₁: 1² + 4 · 1 − 5 = 0 true; 1 ∈ R true;
for x₂: (−5)² + 4(−5) − 5 = 0; 25 − 20 − 5 = 0 true; −5 ∈ R true.

There are two distinct real solutions. But if one chooses, for example, x ∈ N, then S = {1}, because −5 ∉ N.

Example 2:
2x² − 16x + 36 = 0; x ∈ R
x² − 8x + 18 = 0
x₁,₂ = 4 ± √(16 − 18)
x₁,₂ = 4 ± √(−2)
Check (for x ∈ C):
2(4 + i√2)² − 16(4 + i√2) + 36 = 0
2(16 + 8i√2 − 2) − 64 − 16i√2 + 36 = 0
32 + 16i√2 − 4 − 64 − 16i√2 + 36 = 0
0 = 0 true

S = ∅; there are no real solutions, because √(−2) ∉ R. But if one chooses x ∈ C, then there are
two distinct complex solutions, S = {4 + i√2, 4 − i√2}.


Example 3: The equation x² − 14x + 49 = 0 with x ∈ R has two coincident real solutions:
S = {7}, as the check shows.


Example 4: The fractional equation … for x ∈ R is transformed
equivalently after multiplication by the least common denominator 10 · (x + 1) · (x − 2) for
x ≠ −1 and x ≠ 2 into the quadratic equation 9x² − 39x + 12 = 0. The solution set of the
fractional equation is S = {1/3, 4}.


Example 5:
√(x + 2 + √(2x + 7)) = 4 | squaring
x + 2 + √(2x + 7) = 16
√(2x + 7) = 14 − x | squaring
2x + 7 = 196 − 28x + x²
x² − 30x + 189 = 0,
x₁ = 21,
x₂ = 9.
Check:
for x₁: √(21 + 2 + √(2 · 21 + 7)) = 4, that is, √30 = 4, false;
for x₂: √(9 + 2 + √(2 · 9 + 7)) = 4, that is, 4 = 4, true.

Here squaring was an inequivalent transformation. As the check has shown, only x₂ = 9 is a
solution of the initial equation; hence S = {9}.
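Because squaring is not an equivalent transformation, every candidate obtained from the quadratic equation has to be tested in the initial equation. The sketch below assumes that the initial equation of Example 5 reads √(x + 2 + √(2x + 7)) = 4, which is the reading consistent with the intermediate steps; the function names are hypothetical.

```python
import math


def left_hand_side(x):
    """Left-hand side sqrt(x + 2 + sqrt(2*x + 7)) of the initial equation."""
    return math.sqrt(x + 2 + math.sqrt(2 * x + 7))


def keep_true_solutions(candidates, right_hand_side=4.0):
    """Keep only those candidates that satisfy the initial equation."""
    return [x for x in candidates
            if math.isclose(left_hand_side(x), right_hand_side)]
```

Calling `keep_true_solutions([21, 9])` discards the candidate 21, for which the left-hand side equals √30, and keeps 9.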


Not every equation involving square roots leads to a quadratic equation. It is always possible
to get rid of all roots with integral exponents; an example is the equation ∛((x + 7)²) − ∛(x + 7) = 6
which for x ∈ C has the solution set S = {−15, 20}.


Applied problems leading to quadratic equations. 


Example 1: Echo soundings. To measure the depth of the sea bed one uses echo soundings.
The source of sound is situated at A, the receiver at B (Fig.). The width of the ship is 52.5 ft. Sound
is propagated in water with a velocity of 4956 ft/s. During the time measurement the ship is
considered to be at rest. What is the depth of water for a time difference of 0.1 s?

Let x ft. be the depth of water. The distance passed by the sound to the sea bed is (4956 · 0.1)/2 ft = 247.8 ft.
By the theorem of Pythagoras one obtains

x² = (247.8)² − (52.5/2)²,
x² ≈ 60715.8,
x₁,₂ ≈ ±√60715.8 ≈ ±246.4.
The depth of water is approximately 246 feet. The negative value has no physical significance.
4.3-1 Echo soundings 4.3-2 Depth of a well

Example 2: Depth of a well. To determine the depth of a well one can drop a stone into it and
measure the time from the beginning of the fall to the moment when one hears the stone hitting
the water in the well. Suppose that this time is 4 seconds (Fig.). The velocity of sound is taken to
be v = 1092.9 ft./s, and the acceleration due to gravity to be g = 32 ft./s². What is the depth of the
water level below the rim of the well?

Let x s be the time up to the stone hitting the water; then it has covered a distance of 16x² ft.
For the return passage sound has taken (4 − x) s, and during this time it has covered
s = v(4 − x) = (4 − x) · 1092.9 ft. Since the two distances are equal, one obtains the quadratic equation

16x² = (4 − x) · 1092.9 or x² + 68.3x − 273.2 = 0.
One finds that the depth of the well is about 230 feet, because only the positive solution x₁ ≈ 3.79 s
of the quadratic equation has physical significance.
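The calculation for the well can be retraced numerically. The following sketch sets up (g/2)x² = v(T − x), brings it to normal form and keeps the positive solution; the function name `well_depth` and its default arguments (taken from the example) are choices made for this illustration.

```python
import math


def well_depth(total_time, v_sound=1092.9, g=32.0):
    """Return (fall_time, depth) for a stone dropped into a well.

    total_time is the time in seconds from release until the splash is
    heard; v_sound is in ft/s and g in ft/s**2.
    """
    # (g/2)*x**2 = v*(T - x)  is equivalent to  x**2 + p*x + q = 0 with:
    p = 2 * v_sound / g
    q = -2 * v_sound * total_time / g
    # Only the positive solution of the quadratic equation is meaningful.
    fall_time = -p / 2 + math.sqrt((p / 2) ** 2 - q)
    return fall_time, g / 2 * fall_time ** 2
```

For a total time of 4 s the function gives a fall time of roughly 3.79 s and a depth of roughly 230 ft, in agreement with the values stated in the example.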

Example 3: Hardness testing. In testing the hardness of a material by the pressure method
developed by BRINELL the depth of impression h of a small steel ball of known diameter d = 2r in the
material to be tested is calculated from the diameter δ = 2ρ of the circular impression (Fig.).
What is the depth of impression h when the diameter of the sphere is d = 2r = 10 mm and the
diameter of the impression is δ = 2ρ = 6 mm?

Let the depth of impression be h (in mm). By the theorem of Pythagoras one obtains
r² = (r − h)² + ρ² or h² − 2rh + ρ² = 0. Of the two solutions h₁,₂ = r ± √(r² − ρ²) only
h₂ = r − √(r² − ρ²) can be used, because for depths h > r the spherical impression always has
the radius ρ = r and the method is not applicable. Substituting for r and ρ the given quantities,
one obtains h = 1 mm.


4.3-3 The Brinell hardness test 4.3-4 Section of a hollow sphere 


Example 4: Stereometric problem. A hollow steel sphere has the mass M = 160.72 lb. The
thickness of its wall is w = 2.36 in. (Fig.). What is its inner radius r and outer radius R if the
density is ρ = 0.28 pounds per cubic inch?

If the inner radius r has the length x inches then the outer radius is R = (x + w). The mass
of the hollow sphere is M = (4π/3)(R³ − r³)ρ, that is, in this case M = (4π/3)[(x + w)³ − x³]ρ.
This leads to the quadratic equation x² + wx + w²/3 − M/(4πρw) = 0 with the solutions
x₁,₂ = −w/2 ± √[M/(4πρw) − w²/12].

Here only x₁ = −w/2 + √[M/(4πρw) − w²/12] is a solution of the problem. The required radii
are R = 5.52 in. and r = 3.16 in.


Graphical solution of quadratic equations 
Standard parabola in parallel displacement. The zeros of the quadratic function y = x² + px + q
or y = (x + p/2)² + (q − p²/4) yield the solutions of the quadratic equation x² + px + q = 0.
The graph of this function is a standard parabola that has been subject to a parallel displacement
in the direction of the x-axis by −p/2 and in the direction of the y-axis by −D = +(q − p²/4),
whose vertex V(x_v, y_v) therefore has the coordinates x_v = −p/2; y_v = q − p²/4.

According to the position of the vertex the standard parabola cuts the x-axis in two points (y_v < 0),
or touches it (y_v = 0), or has no point in common with it (y_v > 0); consequently the quadratic
equation has two distinct, two coincident, or no real solutions.
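The connection between the position of the vertex and the number of real solutions can be written down in a few lines. The function names `vertex` and `count_real_solutions` in this sketch are hypothetical; rational arithmetic is used so that the touching case y_v = 0 is detected exactly.

```python
from fractions import Fraction


def vertex(p, q):
    """Vertex (x_v, y_v) of the displaced standard parabola y = x**2 + p*x + q."""
    p, q = Fraction(p), Fraction(q)
    return -p / 2, q - p * p / 4


def count_real_solutions(p, q):
    """Number of distinct real solutions of x**2 + p*x + q = 0, read off
    from the sign of the ordinate y_v of the vertex."""
    y_v = vertex(p, q)[1]
    if y_v < 0:
        return 2  # the parabola cuts the x-axis in two points
    if y_v == 0:
        return 1  # the parabola touches the x-axis
    return 0      # no point in common with the x-axis
```

For y = x² − x − 2 the vertex is (1/2, −9/4) and the count is 2; for y = x² − 2x + 1 the vertex is (1, 0) and the count is 1; for y = x² + x + 2 the vertex is (−1/2, 7/4) and the count is 0.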

Intersection of a parabola and a straight line. The given equation x² + px + q = 0 in the form
x² = −px − q is interpreted as a condition for the functions with the equations y = x² and
y = −px − q to yield the same ordinates y for certain abscissae x. Geometrically this means to
determine the points of intersection of the graphs of these functions. The abscissae of the points of
intersection then give the solutions of the equation. For y = x² one obtains as graph the standard
parabola, for y = −px − q a straight line. According as this straight line is a secant or a tangent
to the parabola or has no point in common with it, one obtains two, one, or no real solutions of
the equation.


Example 1: x² − x − 2 = 0.

Displaced parabola: One goes over to the function with the equation y = x² − x − 2. The graph is the
displaced standard parabola with the vertex V(1/2, −2 1/4). Here y_v < 0. The parabola
intersects the x-axis at x₁ = −1 and x₂ = 2 (Fig.).

Fixed parabola: One transforms the equation into x² = x + 2. The abscissae of the points of intersection of
the standard parabola with the equation y = x² and the straight line with the equation y = x + 2
are x₁ = −1 and x₂ = 2 (Fig.).

The equation x² − x − 2 = 0 has two distinct real solutions. Its solution set is S = {−1, 2}.


4.3-5 Roots of a quadratic equation as abscissae of the points of intersection of the x-axis with a standard parabola subject to parallel displacement

4.3-6 Graphical solution of a quadratic equation with a fixed standard parabola


Example 2: x² − 2x + 1 = 0.

Displaced parabola: One goes over to the function with the equation y = x² − 2x + 1 whose graph is the displaced
standard parabola with the vertex V(1, 0). Here y_v = 0. The parabola touches the x-axis
at x₁ = x₂ = 1 (Fig.).

Fixed parabola: One transforms the equation into x² = 2x − 1. The standard parabola with the equation y = x²
and the straight line with the equation y = 2x − 1 touch each other. The abscissa of the point of
contact is x₁ = x₂ = 1 (Fig.).

The equation x² − 2x + 1 = 0 has two coincident real solutions. The solution set is S = {1}.

Example 3: x² + x + 2 = 0.

Displaced parabola: The function with the equation y = x² + x + 2 has as its graph the displaced standard parabola
with the vertex V(−1/2, 1 3/4). Here y_v > 0. The parabola does not intersect the x-axis (Fig.).

Fixed parabola: The transformed equation is x² = −x − 2. The straight line with the equation y = −x − 2
does not intersect the standard parabola with the equation y = x² (Fig.).

The equation x² + x + 2 = 0 has no real solutions. The solution set is S = ∅.




Historical remarks 


Practical needs, in particular, problems of mensuration (theorem of Pythagoras) led at an early
stage to quadratic equations. Many such problems dating from Babylonian mathematics have come
down to us in cuneiform tablets. Even systems of quadratic equations with several variables occur
there. A problem dating from about 2000 B.C. in modern notation is: x² − 29x + 210 = 0;
of slightly later date is, for example, the system x² + y² = 1000, y = 2x/3 − 10.


The Greek mathematicians treated algebraic problems in geometric form, that is, by construction. 
Since a square root can always be constructed by ruler and compass, the Greek mathematicians 
were in a position to treat all types of quadratic equations having real solutions. The classical 
account of these methods is in Book X of the ‘Elements’ of Euclid (about 300 B. C.), which in 
its contents goes back to THEAITETUS (410?-368 B. C.). The Hellenistic engineer and mathematician 
HERON of Alexandria (about 100 A. D.) took up the Babylonian and ancient Egyptian tradition 
of numerical treatment of quadratic equations, using approximate methods of extracting square 
roots. Traces of this approach can already be found in the writings of ARCHIMEDES (287?-212 B. C.).
The discovery that roots occur in pairs is due to the Hindu mathematicians, above all BHASKARA 
(born 1114 A. D.). Their methods found their way into Europe, with scholars writing in Arabic 
as intermediaries, who themselves made further progress. 


4.4. Equations of degree three and four 


In general, the higher the degree of an algebraic equation, the more difficult is its solution. There- 
fore, quite a number of graphical and approximate methods of solution have been developed for 
the practical task of finding numerical solutions, which make it possible to calculate solutions to 
an arbitrary number of decimal places. 


The cubic equation 


In the general form Ax³ + Bx² + Cx + D = 0 of the cubic equation or equation of the third degree x is the variable for
which the set C of complex numbers is laid down as fundamental domain. A, B, C and D are real
parameters. Ax³ is called the cubic, Bx² the quadratic, Cx the linear, and D the absolute term. On
dividing both sides by A ≠ 0 and setting B/A = r, C/A = s, D/A = t one obtains the equivalent
normal form x³ + rx² + sx + t = 0.


Solubility. Special cases. In the domain of complex numbers every cubic equation has three 
solutions, some of which may be coincident. Since every polynomial of odd degree has at least one 
real zero, one of the solutions is always real. The other two are either also real or conjugate complex. 
If x, is a real root, then (see Chapter 5.) the cubic function can be split into a product of the linear 
factor (x — x,) and a polynomial of degree two. Since a product vanishes if and only if one of the 
factors does, the two other solutions of the cubic equations are the solutions of the resulting quadratic 
equation. 

By the theorem of Vieta the product x₁ · x₂ · x₃ of the three solutions is equal to the negative of
the absolute term (−t). Therefore, if it is known that the given equation has integer solutions,
then a real solution x₁ can be found as a factor of (−t) by trial and error; for example, the cubic
equation x³ − 5x² − 8x + 12 = 0 has the solution x₁ = +1 and can be split into the product
(x − 1)(x² − 4x − 12) = 0. Since the quadratic equation has the solutions −2 and +6, the solution
set of the given cubic equation is S = {−2, +1, +6}.
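The trial-and-error search among the factors of the absolute term, followed by splitting off the linear factor, can be sketched as follows. The sketch assumes integer coefficients r, s, t with t ≠ 0; the function names `integer_roots` and `split_off_linear_factor` are invented for this illustration.

```python
def integer_roots(r, s, t):
    """Integer solutions of x**3 + r*x**2 + s*x + t = 0, found by trying
    every positive and negative divisor of the absolute term t."""
    if t == 0:
        raise ValueError("t must be different from zero")
    found = []
    for d in range(1, abs(t) + 1):
        if t % d == 0:
            for x in (d, -d):
                if x ** 3 + r * x ** 2 + s * x + t == 0:
                    found.append(x)
    return sorted(found)


def split_off_linear_factor(r, s, t, x1):
    """Return (b, c) with x**3 + r*x**2 + s*x + t = (x - x1)*(x**2 + b*x + c).

    Uses Horner's scheme; raises ValueError if x1 is not a solution.
    """
    b = r + x1
    c = s + b * x1
    if t + c * x1 != 0:
        raise ValueError("x1 is not a solution of the cubic equation")
    return b, c
```

For x³ − 5x² − 8x + 12 = 0 the first function finds −2, 1 and 6, and `split_off_linear_factor(-5, -8, 12, 1)` returns the coefficients of x² − 4x − 12.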

The cubic equation x³ + rx² + sx = 0, in which the absolute term t is zero, reduces by factorization
to the equivalent equation x(x² + rx + s) = 0. Apart from the real solution x₁ = 0 the other two
solutions of the given cubic equation are the solutions of the quadratic equation x² + rx + s = 0.

The pure cubic equation x³ + t = 0 arises for r = 0, s = 0. It has the three solutions x₁ = ∛(−t),
x₂ = ω₂ ∛(−t) and x₃ = ω₃ ∛(−t), where ω₂ = (−1 + i√3)/2 and ω₃ = (−1 − i√3)/2 are the
complex cube roots of unity.

If in addition t = 0, that is, x³ = 0, then only x₁ = x₂ = x₃ = 0 can be a solution, because for
x ≠ 0 one also has x³ ≠ 0, and vice versa.




Cardano's formula. This formula to calculate the roots of the cubic equation is obtained in two
steps. First, the normal form x³ + rx² + sx + t = 0 is brought by the substitution x = y − (r/3)
to the reduced form in which there is no quadratic term:

y³ + py + q = 0.

Here the abbreviations p = s − r²/3, q = 2r³/27 − sr/3 + t are used; for example, the reduction
of x³ − 9x² + 33x − 65 = 0 leads to y³ + 6y − 20 = 0.
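The reduction step is a plain substitution and is easily checked by machine. In the sketch below the function name `reduce_cubic` is hypothetical; exact rational arithmetic is used so that the coefficients p and q come out without rounding.

```python
from fractions import Fraction


def reduce_cubic(r, s, t):
    """Return (p, q) of the reduced form y**3 + p*y + q = 0 obtained from
    x**3 + r*x**2 + s*x + t = 0 by the substitution x = y - r/3."""
    r, s, t = Fraction(r), Fraction(s), Fraction(t)
    p = s - r ** 2 / 3
    q = 2 * r ** 3 / 27 - s * r / 3 + t
    return p, q
```

For x³ − 9x² + 33x − 65 = 0 the call `reduce_cubic(-9, 33, -65)` gives p = 6 and q = −20. The reduced equation y³ + 6y − 20 = 0 has the solution y = 2, and x = y − r/3 = 5 indeed satisfies the original equation.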

Next, the required solution y is put into two parts u and v, which will be determined separately.
One sets tentatively y = u + v and obtains (u + v)³ + p(u + v) + q = 0 or u³ + v³ + q
+ (u + v)(3uv + p) = 0. One now has an equation in the two variables u and v. Therefore, one
is free to use an additional condition on the connection between u and v. One chooses it so that
the factor 3uv + p, and hence the last summand, vanishes: 3uv + p = 0. This yields a system of
equations for the variables u and v.

u³ + v³ = −q | squaring: u⁶ + 2u³v³ + v⁶ = q²
uv = −p/3 | four times the third power: 4u³v³ = −4(p/3)³
Subtracting: (u³ − v³)² = q² + 4(p/3)³
u³ − v³ = ±√(q² + 4(p/3)³)

The system of equations obtained, u³ + v³ = −q and u³ − v³ = ±√(q² + 4(p/3)³), yields

u³ = −q/2 ± √[(q/2)² + (p/3)³] and v³ = −q/2 ∓ √[(q/2)² + (p/3)³].


By interchanging the upper signs with the lower signs in the roots, u³ goes over into v³; but when
u and v are interchanged, the equations u³ + v³ + q = 0 and uv = −p/3 remain unchanged.
Therefore it is sufficient to consider only one of the pairs of signs, say the upper. Every cube root
of a complex number has three values; apart from one solution x₁ there are the other solutions
ω₂x₁ and ω₃x₁ in which ω₂ and ω₃ are the complex cube roots of unity. Consequently, for u and v
one obtains the values

u₁ = ∛{−q/2 + √((q/2)² + (p/3)³)}, u₂ = u₁ω₂, u₃ = u₁ω₃,
v₁ = ∛{−q/2 − √((q/2)² + (p/3)³)}, v₂ = v₁ω₂, v₃ = v₁ω₃.

For y = uᵢ + vₖ one would obtain 9 solutions (i = 1, 2, 3; k = 1, 2, 3) of the cubic equation. But
the number of solutions reduces to the following three: y₁ = u₁ + v₁, y₂ = u₂ + v₃, y₃ = u₃ + v₂,
because the additional condition uᵢvₖ = −p/3 is satisfied only for u₁v₁, u₂v₃ and u₃v₂, since
ω₂ω₃ = (−1/2 + (i/2)√3)(−1/2 − (i/2)√3) = 1/4 + 3/4 = 1. Under the assumption that the
radicand of the square root is non-negative, (q/2)² + (p/3)³ ≥ 0, the solution y₁ is real, while y₂ and
y₃ are conjugate complex, as the calculation shows:

y₂ = u₁ω₂ + v₁ω₃ = −(u₁ + v₁)/2 + [(u₁ − v₁)/2] · i√3,
y₃ = u₁ω₃ + v₁ω₂ = −(u₁ + v₁)/2 − [(u₁ − v₁)/2] · i√3.


Example: y³ − 15y − 126 = 0. The equation is in reduced form. Here p = −15 and q = −126.
Substitution into Cardano's formula gives

y₁ = ∛[63 + √(63² − 5³)] + ∛[63 − √(63² − 5³)] = ∛125 + ∛1 = 5 + 1 = 6,
y₂ = −[(5 + 1)/2] + [(5 − 1)/2] · i√3 = −3 + 2i√3,
y₃ = −[(5 + 1)/2] − [(5 − 1)/2] · i√3 = −3 − 2i√3.

Hence S = {6, −3 + 2i√3, −3 − 2i√3}.
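Cardano's formula for the case of a non-negative radicand can be sketched numerically as follows. The helper `real_cube_root` and the function `cardano` are names chosen for this illustration; the sketch uses floating-point arithmetic, so the results are approximations, and it deliberately refuses the casus irreducibilis.

```python
import math


def real_cube_root(a):
    """Real cube root of a real number, also for negative arguments."""
    return math.copysign(abs(a) ** (1.0 / 3.0), a)


def cardano(p, q):
    """Solutions (y1, y2, y3) of y**3 + p*y + q = 0, assuming that the
    radicand (q/2)**2 + (p/3)**3 of the square root is non-negative."""
    radicand = (q / 2) ** 2 + (p / 3) ** 3
    if radicand < 0:
        raise ValueError("casus irreducibilis: radicand is negative")
    root = math.sqrt(radicand)
    u1 = real_cube_root(-q / 2 + root)
    v1 = real_cube_root(-q / 2 - root)
    y1 = u1 + v1                              # the real solution
    real_part = -(u1 + v1) / 2
    imag_part = (u1 - v1) / 2 * math.sqrt(3)
    return y1, complex(real_part, imag_part), complex(real_part, -imag_part)
```

For p = −15, q = −126 the function returns values close to 6, −3 + 2i√3 and −3 − 2i√3, as in the worked example.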


Casus irreducibilis, trigonometric solution. Apparently the solution of the cubic equation becomes
particularly difficult when the radicand (q/2)² + (p/3)³ of the square root is negative. Then one
has to extract the cube root of complex numbers. On the other hand, a cubic equation always has
at least one real solution. For a long time the mathematicians of the 15th and 16th centuries did
not succeed in producing this real solution and called this case, which went beyond their means,
'not reducible', casus irreducibilis. VIETA succeeded in obtaining the solution, around 1600,
by means of trigonometry. In fact, it turned out that in this apparently so complicated case all
the three solutions are real.




Since (q/2)² + (p/3)³ < 0, one must have p < 0; setting p = −p′ one has p′ positive, and the
reduced equation y³ + py + q = 0 goes over into y³ − p′y + q = 0, where p′³/27 − q²/4 > 0.
The radicand of the cube root of u₁ or v₁ is then:

−q/2 ± √(−p′³/27 + q²/4) = −q/2 ± √(−(p′³/27 − q²/4)) = −q/2 ± i√(p′³/27 − q²/4).

This complex value can be written in trigonometric form: −q/2 ± i√(p′³/27 − q²/4)
= r(cos φ ± i sin φ), where

r = √(p′³/27), cos φ = −q/2 : √(p′³/27), sin φ = √(p′³/27 − q²/4) : √(p′³/27).

By de Moivre's theorem one obtains for u₁ or v₁: ∛r (cos φ/3 ± i sin φ/3),
and so: y₁ = u₁ + v₁ = ∛r [cos φ/3 + i sin φ/3 + cos φ/3 − i sin φ/3] = 2 ∛r · cos φ/3.
Since on account of the periodicity of the cosine function the angle can also have the value φ + 360°
or φ + 720°, the other two solutions are:

y₂ = 2 ∛r · cos (φ/3 + 120°) and y₃ = 2 ∛r · cos (φ/3 + 240°).
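The trigonometric solution lends itself to a short numerical sketch. The function name `trigonometric_solutions` is hypothetical, angles are handled in radians instead of degrees, and the results are floating-point approximations.

```python
import math


def trigonometric_solutions(p, q):
    """The three real solutions of y**3 + p*y + q = 0 in the casus
    irreducibilis, that is, for (q/2)**2 + (p/3)**3 < 0."""
    if (q / 2) ** 2 + (p / 3) ** 3 >= 0:
        raise ValueError("not the casus irreducibilis")
    r = math.sqrt(-(p ** 3) / 27)
    phi = math.acos(-q / (2 * r))
    factor = 2 * r ** (1.0 / 3.0)
    # The angles phi, phi + 360 degrees and phi + 720 degrees, each divided by three.
    return [factor * math.cos((phi + 2 * math.pi * k) / 3) for k in range(3)]
```

For y³ − 981y − 11340 = 0 the function returns values close to 36, −21 and −15.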


Cubic equation: Ax³ + Bx² + Cx + D = 0; x ∈ C; A, B, C, D ∈ R; A ≠ 0
Normal form: x³ + rx² + sx + t = 0; r = B/A; s = C/A; t = D/A
Reduced form: y³ + py + q = 0; x = y − r/3; p = s − r²/3; q = 2r³/27 − rs/3 + t

Cardano's formula:
(q/2)² + (p/3)³ > 0: one real solution and two conjugate complex solutions;
(q/2)² + (p/3)³ = 0: three real solutions of which two coincide.
u₁ = ∛[−q/2 + √(q²/4 + p³/27)], v₁ = ∛[−q/2 − √(q²/4 + p³/27)]
y₁ = u₁ + v₁; y₂,₃ = −(u₁ + v₁)/2 ± [(u₁ − v₁)/2] · i√3

Casus irreducibilis:
(q/2)² + (p/3)³ < 0: three distinct real solutions.
r = √(−p³/27); cos φ = −(q/2) : √(−p³/27)
y₁ = 2 ∛r cos (φ/3); y₂ = 2 ∛r cos (φ/3 + 120°); y₃ = 2 ∛r cos (φ/3 + 240°)


Example 1: In the equation y³ − 981y − 11340 = 0 the conditions of the casus irreducibilis
are satisfied, and one obtains r = √(327³), cos φ = 5670/√(327³). The logarithmic calculation yields
the (approximate) value φ = 16° 30′, hence φ/3 = 5° 30′.

By logarithmic evaluation of the formulae for y₁, y₂, y₃ one finds
y₁ ≈ 36, y₂ ≈ −21, y₃ ≈ −15. The check shows that equality holds.
Therefore S = {−21, −15, 36}.

Example 2: The axial cross-section of a normed glass funnel is an
equilateral triangle (Fig.). What is the width d of the funnel if its
volume is V = 765 cm³? Since the width d is one side of the axial
cross-section, the height of the funnel is h = (d/2)√3 and the radius
of the base r = d/2. From the volume formula for the cone V =
(1/3) πr² · h it follows that
765 cm³ = (1/3) π · (d/2)² · (d/2)√3 or d³ = 765 · 24/(π√3) cm³.

Of the three values for d here only the real value is meaningful. One
obtains for the width of the funnel d ≈ 15 cm.

4.4-1 Normed glass funnel


Graphical solution of a cubic equation. From the cubic equation Ax³ + Bx² + Cx + D = 0
one goes over to the function of the third degree with the equation y = Ax³ + Bx² + Cx + D.
The graph of this function intersects the x-axis in points whose abscissae give the solutions of the
cubic equation. One obtains approximate solutions, which can be improved to any required degree,
for example, by Newton's method. As a rule one is content to find one solution x₁ graphically
and then to divide the given cubic polynomial by the linear factor (x − x₁), so that one obtains
a quadratic equation which can easily be solved.
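The procedure just described (improve one approximate solution, divide by the linear factor, solve the remaining quadratic equation) can be sketched as follows. The function names and the starting value are assumptions of this illustration; Newton's method is not guaranteed to converge from every starting value, and the sketch fails with a division by zero if the derivative vanishes at an iterate.

```python
import math


def newton(f, df, x0, tol=1e-12, max_iter=100):
    """Improve the approximate zero x0 of f by Newton's method."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")


def solve_cubic_from_guess(A, B, C, D, x0):
    """Real solutions of A*x**3 + B*x**2 + C*x + D = 0, starting from a
    graphically obtained approximation x0 of one solution."""
    def f(x):
        return ((A * x + B) * x + C) * x + D

    def df(x):
        return (3 * A * x + 2 * B) * x + C

    x1 = newton(f, df, x0)
    # Division by (x - x1) with Horner's scheme: quotient A*x**2 + b*x + c.
    b = B + A * x1
    c = C + b * x1
    disc = b * b - 4 * A * c
    if disc < 0:
        return [x1]  # the other two solutions are not real
    s = math.sqrt(disc)
    return sorted([x1, (-b + s) / (2 * A), (-b - s) / (2 * A)])
```

For 8x³ − 20x² − 2x + 5 = 0 and the rough starting value 3, the refined zero is close to 5/2, and the remaining quadratic factor supplies values close to −1/2 and +1/2.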




Example: From the equation $8x^3 - 20x^2 - 2x + 5 = 0$
one goes over to the function with the equation
$y = 8x^3 - 20x^2 - 2x + 5$ (Fig.). Its graph can be drawn
by means of the table

x | −1 | 0 | 1 | 2 | 3
y | −21 | 5 | −9 | −15 | 35

in such a good approximation that one would expect zeros
to lie at $x_1 = -1/2$, $x_2 = +1/2$, $x_3 = +5/2$. Here the choice
of different units on the y-axis and on the x-axis has no
influence on the position of the zeros. Incidentally, the table
indicates even before the graph is drawn, by the change of signs
of the ordinates, that the zeros must lie between −1 and 0,
between 0 and 1, and between 2 and 3.

By substituting the values into the equation one finds that the
expected zeros are, in fact, solutions. But it would be sufficient
to check for $x_2 = 1/2$, say, and after division of polynomials
$(8x^3 - 20x^2 - 2x + 5) : (x - 1/2) = 8x^2 - 16x - 10$
to solve the quadratic equation $8x^2 - 16x - 10 = 0$ whose
solutions are $x_{1,3} = 1 \mp 3/2$. This is a verification of the
solutions obtained graphically. Here $S = \{-1/2, +1/2, +5/2\}$.

Fig.: Graphical solution of the cubic equation
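The procedure described above (improve one graphically estimated zero by Newton's method, split off the linear factor, solve the remaining quadratic) can be sketched as follows. The starting value 0.4 stands for a value read off the graph and is an assumption of this sketch, as are the helper names `newton` and `deflate`.

```python
import math

def f(x):
    return 8 * x**3 - 20 * x**2 - 2 * x + 5

def df(x):                                    # derivative of f
    return 24 * x**2 - 40 * x - 2

def newton(x, steps=20):
    """Improve an approximate zero of f by Newton's method."""
    for _ in range(steps):
        x -= f(x) / df(x)
    return x

def deflate(coeffs, root):
    """Synthetic (Horner) division of a polynomial by (x - root)."""
    out = [coeffs[0]]
    for c in coeffs[1:]:
        out.append(c + root * out[-1])
    return out[:-1], out[-1]                  # quotient coefficients, remainder

x2 = newton(0.4)                              # zero near +1/2
(a, b, c), remainder = deflate([8, -20, -2, 5], x2)
d = math.sqrt(b * b - 4 * a * c)              # remaining quadratic a*x^2 + b*x + c = 0
x1, x3 = (-b - d) / (2 * a), (-b + d) / (2 * a)
print(x1, x2, x3)
```

The quotient coefficients come out close to 8, −16, −10, in agreement with the polynomial division carried out in the text.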


Historical remarks. Simple cubic equations already occur in the ancient Greek, Hindu and Arabic 
mathematics. Since the Greek mathematicians treated algebraic problems by the methods of geometry, 
they were facing fundamental difficulties in the treatment of cubic equations. 

The Hellenistic technologist and mathematician HERON of Alexandria (about 100 A. D.) took 
an important step forward in the treatment of cubic equations. By reverting to older Babylonian 
and Egyptian approximation methods for the numerical extraction of roots he succeeded in solving 
pure cubic equations. The real progress in the numerical treatment and the commencing algebraiza- 
tion of the computational steps is due to Hindu and above all to Arabic mathematicians. They 
could solve numerically all types of quadratic and the simplest types of cubic equations, but they 
did not succeed in solving the general equation. The European mathematicians followed immedi- 
ately in the footsteps of the Arabs, also in the numerical treatment of equations. But Luca PACIOLI 
(1445-1514), who has very great merits in the development of algebra, thought it impossible to 
find a formula for the algebraic solution of the general equation of the third degree. 

This was achieved around 1500 by Master Scipione del FERRO of Bologna (about 1465-1526), 
but remained unpublished. Quite independently, Niccolo TARTAGLIA (about 1500-1557), mathematician
and ballistic engineer, had found the formula, which today is named after CARDANO,
and had achieved considerable fame by applying it brilliantly in public problem solving contests 
which were customary at the time. The ambitious Professor Geronimo CARDANO (1501-1576) of 
Venice, who did not succeed in finding the formula for the solution, obtained it from TARTAGLIA 
in 1539 after years of intense pressure, but he had to swear a solemn oath to TARTAGLIA to treat 
it as a kind of professional secret. However, CARDANO broke his promise and included the result 
in his ‘Artis magnae sive de regulis algebraicis’ (that is, of the Great Art of the Rules of Algebra) 
of 1545. And since the formula appeared in print for the first time under CARDANO’s name, it became 
known as Cardano’s formula. Even Tartaglia’s protest, which led to a violent quarrel, was of no 
avail. Incidentally, also the well-known method of universal suspension, for example, of a ship’s 
compass, bears Cardano’s name wrongly: it was in use long before him. 


The quartic equations 


There is also a general solution formula for the general equation of the fourth degree. However, 
this is much more complicated than that of the cubic equation and is therefore hardly used in the 
numerical determination of the solutions. Hence only a sketch of the method, without details of 
the calculation is given here. By the substitution $x = y - a/4$ one obtains from the normal form
$x^4 + ax^3 + bx^2 + cx + d = 0$ the
reduced equation $y^4 + py^2 + qy + r = 0$ with new coefficients p, q, r.

Its four solutions $y_1$, $y_2$, $y_3$, $y_4$ with

$2y_1 = \sqrt{z_1} + \sqrt{z_2} + \sqrt{z_3}$;  $2y_2 = \sqrt{z_1} - \sqrt{z_2} - \sqrt{z_3}$;
$2y_3 = -\sqrt{z_1} + \sqrt{z_2} - \sqrt{z_3}$;  $2y_4 = -\sqrt{z_1} - \sqrt{z_2} + \sqrt{z_3}$

can be obtained from the three solutions $z_1$, $z_2$, $z_3$ of the cubic resolvent of the given quartic
equation:

$z^3 + 2pz^2 + (p^2 - 4r)z - q^2 = 0$.

An additional condition is that the product of the three solutions $z_1 z_2 z_3 = q^2$ must always be
positive. For the solutions of the reduced quartic equation in the domain of complex numbers the
following three cases can arise:

solutions $z_1$, $z_2$, $z_3$ of the cubic resolvent | solutions $y_1$, $y_2$, $y_3$, $y_4$ of the quartic equation
all real and positive | four real values
one positive, two negative | two pairs of conjugate complex values
one real, two conjugate complex | two real values and one pair of conjugate complex values
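How the four solutions are assembled from the square roots of the resolvent solutions can be sketched as follows. The sketch assumes that $z_1$, $z_2$, $z_3$ are already known; the function name and the numerical example are hypothetical. It also uses the sign rule $\sqrt{z_1}\sqrt{z_2}\sqrt{z_3} = -q$ to fix the third square root, which is a standard supplement not stated in this form in the text.

```python
import cmath

def quartic_from_resolvent(z1, z2, z3, q):
    """Solutions of y^4 + p*y^2 + q*y + r = 0 from the resolvent solutions.

    Assumption of this sketch: the signs of the square roots are chosen
    so that s1 * s2 * s3 = -q.
    """
    s1, s2 = cmath.sqrt(z1), cmath.sqrt(z2)
    s3 = -q / (s1 * s2) if q != 0 else cmath.sqrt(z3)
    return [( s1 + s2 + s3) / 2,
            ( s1 - s2 - s3) / 2,
            (-s1 + s2 - s3) / 2,
            (-s1 - s2 + s3) / 2]

# Hypothetical example: y^4 - 25y^2 + 60y - 36 = 0 has the resolvent
# z^3 - 50z^2 + 769z - 3600 = 0 with the solutions 9, 16, 25.
ys = quartic_from_resolvent(9, 16, 25, q=60)
print(sorted(y.real for y in ys))
```

All three resolvent solutions are real and positive here, and the four values obtained are real, in line with the first row of the table.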


The biquadratic equation. A special case of the quartic equation occurs frequently and is easy
to treat. This is the biquadratic equation $x^4 + px^2 + q = 0$. It is distinguished by the fact that
the variable x only occurs to even powers. The equation can therefore be regarded as a quadratic
in $x^2$: hence its name. For $y = x^2$ one obtains $y^2 + py + q = 0$. One solves the quadratic equation
for y. By a subsequent solution of the equation $x^2 = y$ one obtains the solutions of the biquadratic
equation.

Example: $x^4 - 29x^2 + 100 = 0$.
$y_1 = 25$, $y_2 = 4$; $x_1 = +5$, $x_2 = -5$, $x_3 = 2$, $x_4 = -2$;
$S = \{-5, -2, +2, +5\}$.
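The substitution $y = x^2$ translates directly into a short computation. The following sketch (the function name is chosen for illustration) works in the domain of complex numbers, so that negative or complex values of y cause no difficulty.

```python
import cmath

def solve_biquadratic(p, q):
    """All four complex solutions of x^4 + p*x^2 + q = 0 via y = x^2."""
    d = cmath.sqrt(p * p - 4 * q)
    ys = [(-p + d) / 2, (-p - d) / 2]         # solutions of y^2 + p*y + q = 0
    xs = []
    for y in ys:
        r = cmath.sqrt(y)                     # x^2 = y has the solutions +r and -r
        xs += [r, -r]
    return xs

xs = solve_biquadratic(-29, 100)              # the example above
print(sorted(x.real for x in xs))
```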

Historical remark. The formula for the solution of the general quartic equation was found by 
L. FERRARI (1522-1565), a pupil and collaborator of CARDANO. The formula was taken by CARDANO 
into his ‘Ars Magna’. 


4.5. General theorems 


Fundamental theorem of algebra. The conjecture that in the domain of complex numbers an
equation of degree n always has n solutions was made already by GIRARD (1595-1632). Attempts
to prove this were made later by DESCARTES, Jean D'ALEMBERT and others, but it was only GAUSS
who succeeded in 1799 in giving a rigorous proof without gaps, which formed the main topic of
his dissertation; later GAUSS found other independent proofs for the fact that an algebraic equation
always has a solution.


Fundamental theorem of algebra: Every equation of degree n
$x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-1}x + a_n = 0$,
in which the $a_i$ (i = 1, 2, ..., n) are real or complex numbers, has at least one solution in the domain
of complex numbers.


Factorization. If a solution of the equation $x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-1}x + a_n = 0$, which
is guaranteed by the fundamental theorem, is denoted by $x_1$ and if one subtracts the equation
$x_1^n + a_1x_1^{n-1} + a_2x_1^{n-2} + \cdots + a_{n-1}x_1 + a_n = 0$, which is obtained from the given equation by
substituting $x = x_1$, one obtains $(x^n - x_1^n) + a_1(x^{n-1} - x_1^{n-1}) + \cdots + a_{n-1}(x - x_1) = 0$. Every
term contains the factor $(x - x_1)$, hence it follows by factorization that

$(x - x_1)[x^{n-1} + \cdots + a_{n-1}] = 0$.
The expression in the square bracket on the left-hand side is a polynomial of degree n − 1. By
the fundamental theorem this also has a solution. If it is denoted by $x_2$, then the factor $(x - x_2)$
can be split off. One obtains $(x - x_1)(x - x_2)[x^{n-2} + \cdots + a_{n-2}] = 0$. If this method is continued,
one obtains finally the product representation

$x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n = (x - x_1)(x - x_2) \cdots (x - x_n)$.


Number of solutions. An important theorem follows immediately from the product representation:
an equation of degree n in one variable always has exactly n solutions. These need not all be distinct
from each other.

If a solution occurs 2, 3, ..., k times, one speaks of a 2-, 3-, ..., k-fold solution or root. If the
coefficients of the equation are real and if the equation has a complex solution a + ib, then the
conjugate complex number a − ib is also a solution of the equation.




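Vieta's root theorem. If one multiplies out the right-hand side of the product representation and
orders by equal powers of x, a comparison of coefficients leads to Vieta's root theorem:

$x_1 + x_2 + \cdots + x_n = -a_1$
$x_1x_2 + x_1x_3 + \cdots + x_{n-1}x_n = a_2$
...
$x_1 x_2 \cdots x_n = (-1)^n a_n$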


In addition the following theorem holds: 


If an equation of degree n with integer coefficients in its normal form has an integral solution, 
this is a divisor of the absolute term. 
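The theorem reduces the search for integral solutions to finitely many trials. A minimal sketch, assuming integer coefficients listed from the highest power down and an absolute term different from zero (the function name is chosen for illustration):

```python
def integer_roots(coeffs):
    """Integral solutions of a polynomial equation with integer coefficients.

    Only the positive and negative divisors of the absolute term are tried.
    """
    a_n = abs(coeffs[-1])
    divisors = [d for d in range(1, a_n + 1) if a_n % d == 0]
    candidates = [s * d for d in divisors for s in (1, -1)]

    def value(x):                             # Horner evaluation
        v = 0
        for c in coeffs:
            v = v * x + c
        return v

    return sorted(x for x in candidates if value(x) == 0)

print(integer_roots([1, 0, -29, 0, 100]))     # x^4 - 29x^2 + 100 = 0
print(integer_roots([1, 0, -981, -11340]))    # y^3 - 981y - 11340 = 0
```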


For quadratic and cubic equations Vieta’s root theorem takes the following form: 


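Quadratic equation $x^2 + a_1x + a_2 = 0$: $x_1 + x_2 = -a_1$, $x_1x_2 = a_2$.
Cubic equation $x^3 + a_1x^2 + a_2x + a_3 = 0$: $x_1 + x_2 + x_3 = -a_1$, $x_1x_2 + x_1x_3 + x_2x_3 = a_2$, $x_1x_2x_3 = -a_3$.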


Solubility by radicals. The fundamental theorem of algebra guarantees the existence of the roots
of the equation $x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-1}x + a_n = 0$ for all degrees. For n = 2, 3, 4 a
general formula for this solution can be written down. For n = 3 it consists in a succession of roots,
one contained in the other; the solution is of the type $\sqrt[3]{a + \sqrt{b}}$; for n = 4 it is of the type
$\sqrt{a + \sqrt[3]{b + \sqrt{c + \sqrt{d}}}}$. By a radical one understands an expression formed by superimposing
roots whose exponents are positive integers. Using this notion one can say:


Algebraic equations of up to the fourth degree are soluble by radicals.


Evidently there is an inexhaustible variety of radicals, and one might think that somehow by
a combination of superimposed roots the solutions of an equation of degree 5 could be obtained.
But this is not the case. On the contrary: it is impossible for n > 4 to solve the general algebraic
equation of degree n by radicals.


Historical remarks. After the solution formulae for the cubic and quartic equation had been 
found during the Renaissance, the mathematicians of the 17th and 18th centuries searched with 
great tenacity for corresponding solution formulae for equations of degree 5 and higher. In fact, 
some mathematicians, among them von TSCHIRNHAUS (1651-1708), believed that they had proved 
the possibility of a solution by radicals. 

But gradually it was recognized that a solution of the general equation of higher degrees by radicals 
might be impossible; this was an opinion expressed by LAGRANGE and by Gauss. After an attempted 
proof (1799) by RUFFINI, which contained gaps, Niels Henrik ABEL (1802-1829), a brilliant mathe- 
matical genius who died young from tuberculosis, succeeded in proving that the general equation 
of degree 5, and hence also equations of higher degrees, are not soluble by radicals. One of the 
reasons why it was so difficult to get an understanding of the solubility situations of equations 
of higher degree was the fact that special equations of higher degree can very well be soluble by 
radicals. A precise and complete survey of all equations of all degrees that are soluble by radicals is 
given in Galois' theory. Evariste GALOIS (1811-1832) started out from the results obtained by GAUSS
on the problem of cyclotomy (division of the circle). He was a man of genius and an ardent repu- 
blican. Like PUSHKIN in Tsarist Russia, he was mortally wounded in a duel in which his opponent 
was possibly an agent provocateur of the reactionary monarchist police. 

In Galois theory a group is assigned to every equation; its structure gives information on whether 
the equation is soluble by radicals (see Chapter 16.). 


4.6. Systems of non-linear equations 


Certain types of systems of non-linear equations occur rather frequently, for example, in coordinate 
geometry or in connection with systems of ordinary differential equations. Here a few cases are 
selected from the multitude of systems of non-linear equations. A systematic treatment is not pos- 
sible. In what follows the fundamental domain for all variables is R. 


One linear and one quadratic equation. By the method of substitution the system is easy to solve. 
It occurs, for example, when the points of intersection of a conic with a straight line are to be deter- 
mined. 




Example: Substitution of the linear equation into the quadratic equation leads to

$y^2 - y - 2 = 0$
$y_1 = 2$, $x_1 = -3$;  $y_2 = -1$, $x_2 = 0$

The check confirms that these values are solutions. $S = \{(-3, 2), (0, -1)\}$.


Two quadratic equations. This problem occurs when two conics are intersected. If there are no
mixed quadratic terms and if in the two equations the corresponding coefficients of the pure quadratic
terms are equal apart from a constant factor $k \neq 0$, then multiplication by 1/k and subtraction
removes the quadratic terms, and a linear equation results, with the help
of which one variable can be eliminated by substituting into one of the quadratic equations.


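Example 1:

$x^2 + y^2 - 18x - 18y + 112 = 0$                →  $x^2 + y^2 - 18x - 18y + 112 = 0$
$x^2/2 + y^2/2 - 11x + 5y - 52 = 0$  | $\cdot 2$  →  $x^2 + y^2 - 22x + 10y - 104 = 0$
Subtraction: $4x - 28y + 216 = 0$, that is, $x = 7y - 54$

Substitution into the first equation: $y^2 - 18y + 80 = 0$
$y_1 = 10$, $y_2 = 8$; hence $x_1 = 16$, $x_2 = 2$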


The check confirms that S = {(2, 8), (16, 10)} is the solution set of the system. 
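The elimination used in Example 1 can be sketched for two equations of the form $x^2 + y^2 + Dx + Ey + F = 0$. The function name and the coefficient triples are conventions of this sketch; it also assumes that the coefficient of x in the resulting linear equation is not zero.

```python
import math

def intersect_circles(c1, c2):
    """Each curve is given as (D, E, F) for x^2 + y^2 + D*x + E*y + F = 0."""
    (D1, E1, F1), (D2, E2, F2) = c1, c2
    # Subtracting the equations removes x^2 and y^2: A*x + B*y + C = 0.
    A, B, C = D1 - D2, E1 - E2, F1 - F2
    # Assumes A != 0, so that x = m*y + k can be substituted.
    m, k = -B / A, -C / A
    # Substitution into the first equation gives a*y^2 + b*y + c = 0.
    a = m * m + 1
    b = 2 * m * k + D1 * m + E1
    c = k * k + D1 * k + F1
    disc = b * b - 4 * a * c
    if disc < 0:
        return []                             # no real points of intersection
    ys = [(-b + s * math.sqrt(disc)) / (2 * a) for s in (-1, 1)]
    return [(m * y + k, y) for y in ys]

# Example 1 after multiplying the second equation by 2
print(intersect_circles((-18, -18, 112), (-22, 10, -104)))
```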
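Example 2: (1) $x^2 + y^2 = a$, (2) $x \cdot y = b$.

Method I.
$(x + y)^2 = a + 2b$  | (1) + 2·(2)
$(x - y)^2 = a - 2b$  | (1) − 2·(2)
$x + y = \pm\sqrt{a + 2b}$
$x - y = \pm\sqrt{a - 2b}$
$x_{1,2} = \pm(1/2)[\sqrt{a + 2b} + \sqrt{a - 2b}]$
$y_{1,2} = \pm(1/2)[\sqrt{a + 2b} - \sqrt{a - 2b}]$
$x_{1,2}^2 = (1/4)[a + 2b + a - 2b + 2\sqrt{a^2 - 4b^2}] = a/2 + (1/2)\sqrt{a^2 - 4b^2}$
$y_{1,2}^2 = (1/4)[a + 2b + a - 2b - 2\sqrt{a^2 - 4b^2}] = a/2 - (1/2)\sqrt{a^2 - 4b^2}$
(The remaining sign combinations interchange x and y.)

Method II.
(2a) $x = b/y$
(1a) $b^2/y^2 + y^2 = a$
$y^4 - ay^2 + b^2 = 0$  (biquadratic equation)
$y^2 = a/2 \pm (1/2)\sqrt{a^2 - 4b^2}$
$x^2 = a/2 \mp (1/2)\sqrt{a^2 - 4b^2}$

One sees that the two methods lead to the same result.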


Three quadratic equations in three variables. A special system of equations of this kind arises in
the problem in coordinate geometry of finding the equation of a circle through three points, say
$P_1 = (-8, 12)$, $P_2 = (-4, 4)$, $P_3 = (9, -5)$. Required are the coordinates of the centre C(a, b)
and the radius r of the circle. One obtains the system

$(-8 - a)^2 + (12 - b)^2 = r^2$,
$(-4 - a)^2 + (4 - b)^2 = r^2$,
$(9 - a)^2 + (-5 - b)^2 = r^2$
for the variables a, b and r. Evaluating the squares and subtracting say, the second equation from 
the first, and the third from the first, two linear equations in the variables a and b are obtained, which 
lead to the values a = 16 and b = 19. Substituting the calculated values for a and b into one of the 
original equations one obtains a pure quadratic equation for r. Its positive solution is the required 
radius, in the present case r = 25. 
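The subtraction step described above leaves two linear equations in a and b, which can be solved, for example, by determinants. A minimal sketch (the function name is chosen for illustration; three collinear points would make the determinant zero):

```python
import math

def circle_through(p1, p2, p3):
    """Centre (a, b) and radius r of the circle through three points."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Subtracting the 2nd and the 3rd equation from the 1st cancels a^2, b^2, r^2
    # and leaves  A1*a + B1*b = C1  and  A2*a + B2*b = C2.
    A1, B1 = 2 * (x2 - x1), 2 * (y2 - y1)
    C1 = x2**2 + y2**2 - x1**2 - y1**2
    A2, B2 = 2 * (x3 - x1), 2 * (y3 - y1)
    C2 = x3**2 + y3**2 - x1**2 - y1**2
    det = A1 * B2 - A2 * B1                   # zero if the points are collinear
    a = (C1 * B2 - C2 * B1) / det
    b = (A1 * C2 - A2 * C1) / det
    r = math.hypot(x1 - a, y1 - b)            # positive solution for r
    return a, b, r

print(circle_through((-8, 12), (-4, 4), (9, -5)))
```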


4.7. Algebraic inequalities 


The notion of an inequality, like that of an equation, is defined by means of the concept of an
expression. If two expressions $E_1$ and $E_2$ are connected by one of the relation symbols > 'greater than',
≥ 'greater than or equal to', < 'less than', ≤ 'less than or equal to', or ≠ 'unequal to', then
there arises one of the inequalities $E_1 > E_2$, $E_1 \geq E_2$, $E_1 < E_2$, $E_1 \leq E_2$, or $E_1 \neq E_2$; for
example, $3x < 5$, $a > 9$, $2 < 8$, $x + y > 6$, $1/2 \neq 1/3$ are inequalities. The only inequalities to
be treated in what follows are of the forms $E_1 > E_2$ and $E_1 < E_2$.

Just as for equations, so one distinguishes among inequalities between those without variables,
which are propositions on inequality that can be true or false, and those that are predicates on
inequality; for example, $2 < 8$ and $1/2 > 1/3$ are propositions, while $a^2 < 9$ and $x + y > 6$ are
predicates.

Solution set and solution of an inequality. Every number from the domain of definition which on
substitution for the variable makes an inequality with one variable into a true proposition is called
a solution of the inequality. Here the domain of definition of an inequality is defined by analogy
to that of an equation. If the inequality contains two, three, ..., n variables, then a solution is an
ordered pair, triple, ..., n-tuple of numbers. The solution set S is the set of all solutions of an
inequality relative to its domain of definition. For example, the inequality $x < 4$ for the set N of
natural numbers has the solutions 0, 1, 2, 3, that is, $S = \{0, 1, 2, 3\}$; but for $x \in Z$ one has
$S = \{..., -3, -2, -1, 0, 1, 2, 3\}$. For the inequality $x + y < 2$, $x \in N$, $y \in N$, the solution set is
$S = \{(0, 0), (1, 0), (0, 1)\}$; for $x \in Z$, $y \in Z$ the solution set of this inequality consists of infinitely
many solutions, namely all ordered pairs of integers for which $x + y < 2$, for example, (−5, 1)
or (1, −4).
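For small domains the solution sets can be found by simply testing every element. The following sketch does this; since N and Z are infinite, it works with finite windows of these sets, which is an assumption of the sketch and suffices for the inequalities considered here only because their natural-number solutions are small.

```python
from itertools import product

N = range(0, 50)      # finite window of the natural numbers (0 included)
Z = range(-50, 50)    # finite window of the integers

S1 = {x for x in N if x < 4}                                # x < 4 over N
S2 = {(x, y) for x, y in product(N, repeat=2) if x + y < 2}  # x + y < 2 over N
S3 = {(x, y) for x, y in product(Z, repeat=2) if x + y < 2}  # same inequality over Z

print(sorted(S1))
print(sorted(S2))
print((-5, 1) in S3, (1, -4) in S3)
```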


Consistent, inconsistent, and universally valid inequalities. One speaks of a consistent, respectively,
inconsistent inequality according as the inequality with variables has, or does not have, solutions
relative to its domain of definition.

Consistent inequalities | Inconsistent inequalities
$x < 0$ for $x \in Z$: $S = \{..., -3, -2, -1\}$ | $x < 0$ for $x \in N$: $S = \emptyset$
$a^2 > 0$ for $a \in N$: $S = \{1, 2, 3, 4, 5, ...\}$ | $a^2 < 0$ for $a \in N$: $S = \emptyset$
$2x > 3x$ for $x \in R$: $S = \{x \mid x \in R \text{ and } x < 0\}$ | $2x > 3x$ for $x \in N$: $S = \emptyset$
$y + 3 < y + 4$ for $y \in R$: $S = R$ | $y + 3 < y + 3$ for $y \in R$: $S = \emptyset$

Here the inequality $y + 3 < y + 4$ for $y \in R$ is not only consistent, but even universally valid,
because all $y \in R$ are solutions. A consistent inequality with n variables is called universally valid
if all ordered n-tuples of numbers from the domain of definition are solutions of the inequality;
for example, the so-called triangle inequality $|a + b| \leq |a| + |b|$ is satisfied for all pairs of real
numbers, hence it is universally valid for $a \in R$, $b \in R$.

Equivalent inequalities. Two inequalities with variables are said to be equivalent if they have
the same domain of definition and the same solution sets; otherwise the inequalities are called
inequivalent; for example, $x + 4 < 7$ and $x < 3$ are equivalent relative to the set N of natural
numbers, for the solution set of both inequalities is $S = \{0, 1, 2\}$. Similarly $-2a > 4$ and $a < -2$
for $a \in Z$ are equivalent inequalities, because they both have $S_1 = S_2 = \{..., -6, -5, -4, -3\} = S$.
However, the inequalities $y \geq 0$ and $y > -2$ are equivalent over N, but not over Z. Transformations
carrying an inequality into an equivalent one are called equivalent transformations. They are based
on the fundamental laws of arithmetic, especially on the monotonicity properties of real numbers.


Propositions on equivalent transformations of inequalities with variables. The following
inequalities are equivalent to $E_1 < E_2$:
1. $E_3 < E_4$, where $E_1$ and $E_3$ as well as $E_2$ and $E_4$ are equivalent expressions;
2. $E_2 > E_1$;
3. $E_1 + E_3 < E_2 + E_3$, provided that the expression $E_3$ is defined in the entire fundamental
domain of variability;
4. $E_1 \cdot E_3 < E_2 \cdot E_3$ and $E_1/E_3 < E_2/E_3$, provided that the expression $E_3$ is defined and positive
in the entire fundamental domain of variability;
5. $E_1 \cdot E_3 > E_2 \cdot E_3$ and $E_1/E_3 > E_2/E_3$, provided that the expression $E_3$ is defined and negative
in the entire domain of variability.


Solving inequalities. The task of solving an inequality is that of determining all solutions relative
to given fundamental domains of variability, in other words, of finding their solution sets. As for
equations, so for solving inequalities the point is to carry out a sequence of suitable equivalent
transformations and to arrive finally at an inequality so simple that its solution set can be read off.
Frequently, especially in estimates, one uses the transitivity of the relations < or >, which makes it
possible to deduce from $E_1 < E_2$ and $E_2 < E_3$ that $E_1 < E_3$. For a linear inequality with one
variable there is a method of solution by utilizing the transformation theorems.

$2x + 2 + 3x < 3x - 8 + 4$; $x \in R$
(Proposition 1)  $5x + 2 < 3x - 4$  | $-3x - 2$
(Proposition 3)  $5x + 2 - 3x - 2 < 3x - 4 - 3x - 2$
(Proposition 1)  $2x < -6$  | $: 2$
(Proposition 4)  $2x/2 < -6/2$
(Proposition 1)  $x < -3$

The solution set consists of all real numbers that are less than −3:
$S = \{x \mid x \in R \text{ and } x < -3\}$. This solution set can be illustrated graphically on the number
line (Fig.).


Check: For inequalities, unlike for equations, it is not possible, in general, to check the
correctness of the calculations by substituting all solutions for the variables. But it is advisable to make
tests for individual elements of the solution set, in the example above, for $-5 \in R$, say:

$2(-5) + 2 + 3(-5) < 3(-5) - 8 + 4$
$-10 + 2 - 15 < -15 - 8 + 4$
$-23 < -19$

The test can also be carried out completely by writing all elements of S in the form $x = -3 - h$ ($h > 0$),
substituting in the given inequality, and checking whether the resulting inequality proposition is true
for all real $h > 0$.

4.7-1 Graphical representation of the solution set $S = \{x \mid x \in R \text{ and } x < -3\}$
4.7-2 Graphical representation of the solution set $S = \{a \mid a \in N \text{ and } a > 2\}$
Example 1:
$25 - 3a + 2a - 25 < 23 - 2a + 2a - 25$
$-a < -2$  | $\cdot (-1)$
$a > 2$
The solution set consists of all natural numbers greater than 2 (Fig.).

Example 2: $y + x < 4$; $x \in N$, $y \in N$  | $-x$
The solution set is $S = \{(0, 0), (0, 1), ..., (1, 2), (2, 0), (2, 1), (3, 0)\}$ (Fig.).


If the domain of variability for x and y is chosen to be R, then
the coordinate pairs of all points of the half-plane below the
straight line with the equation $y = -x + 4$ are solutions of
the inequality.

4.7-3 Graphical representation of the solution set of the inequality
$x + y < 4$ for $x \in N$, $y \in N$ and for $x \in R$, $y \in R$

4.7-4 Graphical representation of the solution set of the inequality
$x^2 - 4 > 0$; $x \in R$


Example 3: $x^2 - 4 > 0$; $x \in R$,
$(x - 2)(x + 2) > 0$.
A product is positive if and only if both factors have the same sign. This leads to two cases:

First case:
$x - 2 > 0$ and $x + 2 > 0$
$x > 2$ and $x > -2$
$x > 2$

Second case:
$x - 2 < 0$ and $x + 2 < 0$
$x < 2$ and $x < -2$
$x < -2$

The solution set therefore consists of all real numbers that are greater than 2 or less than −2
(Fig.).


Example 4: $x^2 - 4 < 0$; $x \in R$,
$(x - 2)(x + 2) < 0$.
A product of two factors is negative if and only if the factors have opposite sign. This leads to
two cases:

First case:
$x - 2 < 0$ and $x + 2 > 0$
$x < 2$ and $x > -2$
$-2 < x < 2$

Second case:
$x - 2 > 0$ and $x + 2 < 0$
$x > 2$ and $x < -2$
inconsistent

The solution set therefore consists of all real numbers in the interval between −2 and +2:
$S = \{x \mid x \in R \text{ and } -2 < x < 2\}$.


Example 5: $(x + 2)/(x - 1) > 4$; $x \in R$. The domain of definition of the inequality is the set
of all real numbers $x \neq 1$. In fractional inequalities one has to make case distinctions:

First case:
$(x + 2)/(x - 1) > 4$ and $x - 1 > 0$
$x + 2 > 4(x - 1)$ and $x > 1$
$x + 2 > 4x - 4$ and $x > 1$
$6 > 3x$ and $x > 1$
$x < 2$ and $x > 1$
$S_1 = \{x \mid x \in R \text{ and } 1 < x < 2\}$

Second case:
$(x + 2)/(x - 1) > 4$ and $x - 1 < 0$
$x + 2 < 4(x - 1)$ and $x < 1$
$x + 2 < 4x - 4$ and $x < 1$
$6 < 3x$ and $x < 1$
$x > 2$ and $x < 1$
inconsistent, $S_2 = \emptyset$

The solution set of the given inequality is $S = S_1 \cup S_2 = S_1$; it consists of all real numbers x
in the interval $1 < x < 2$.
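A result obtained by case distinction can be tested on a sample of values, in the spirit of the checks recommended above. The sketch below uses exact fractions on a grid with step 1/10 between −5 and 5; the grid and the function name `holds` are assumptions of this sketch, and such a test supports the result without proving it.

```python
from fractions import Fraction

def holds(x):
    """True if (x + 2)/(x - 1) > 4; x = 1 lies outside the domain of definition."""
    return x != 1 and Fraction(x + 2) / Fraction(x - 1) > 4

grid = [Fraction(k, 10) for k in range(-50, 51)]   # -5, -4.9, ..., 5
solutions = [x for x in grid if holds(x)]
print(min(solutions), max(solutions))
```

On this grid exactly the values strictly between 1 and 2 satisfy the inequality.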


Example 6: $|a + 5| < 2$; $a \in Z$. By definition of the absolute value: $|a + 5| = a + 5$ for
$a + 5 \geq 0$, or $|a + 5| = -(a + 5)$ for $a + 5 < 0$. One has therefore to distinguish between
two cases:

First case:
$a + 5 < 2$ and $a + 5 \geq 0$
$a < -3$ and $a \geq -5$
$S_1 = \{-5, -4\}$

Second case:
$-(a + 5) < 2$ and $a + 5 < 0$
$-a - 5 < 2$ and $a < -5$
$-a < 7$  | $\cdot (-1)$  and $a < -5$
$a > -7$ and $a < -5$
$S_2 = \{-6\}$

This leads to the solution set for the initial inequality $S = S_1 \cup S_2 = \{-6, -5, -4\}$, which in
this case can easily be checked by substitution.


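Example 7: It is required to compute the maximal error of the quotient a/b from the true values
a and b of certain physical quantities, the measured values α and β of these quantities, and the
measuring errors $\varepsilon_1$ and $\varepsilon_2$ for a and b, respectively. Let $|\beta| > \varepsilon_2$; by assumption $|a - \alpha| \leq \varepsilon_1$
and $|b - \beta| \leq \varepsilon_2$. Then:
$a/b - \alpha/\beta = (a\beta - b\alpha)/(b\beta) = [\beta(a - \alpha) - \alpha(b - \beta)]/(b\beta)$
$|a/b - \alpha/\beta| = |\beta(a - \alpha) - \alpha(b - \beta)|/|b\beta| \leq [|\beta|\,|a - \alpha| + |\alpha|\,|b - \beta|]/(|b|\,|\beta|)$
$\leq [|\beta|\varepsilon_1 + |\alpha|\varepsilon_2]/(|b|\,|\beta|)$.

Since $|\beta| > \varepsilon_2$, one has $|b| \geq |\beta| - \varepsilon_2 > 0$, hence
$|a/b - \alpha/\beta| \leq [|\beta|\varepsilon_1 + |\alpha|\varepsilon_2]/(|\beta|(|\beta| - \varepsilon_2))$

is the maximal error of the quotient.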
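The estimate can be tried out numerically. The sketch below bounds the denominator from below by $|\beta| - \varepsilon_2$ (valid because $|b - \beta| \leq \varepsilon_2$ and $|\beta| > \varepsilon_2$) and compares the bound with the actual deviation for a few hypothetical measured values; the function name and the numbers are illustrative only.

```python
def quotient_error_bound(alpha, beta, e1, e2):
    """Upper bound for |a/b - alpha/beta| if |a - alpha| <= e1 and |b - beta| <= e2.

    Uses the lower estimate |b| >= |beta| - e2, which requires |beta| > e2.
    """
    if abs(beta) <= e2:
        raise ValueError("requires |beta| > e2")
    return (abs(beta) * e1 + abs(alpha) * e2) / (abs(beta) * (abs(beta) - e2))

alpha, beta, e1, e2 = 10.0, 2.0, 0.1, 0.1     # hypothetical measurement
bound = quotient_error_bound(alpha, beta, e1, e2)
# largest deviation over the corner and centre values of the admissible a and b
worst = max(abs(a / b - alpha / beta)
            for a in (alpha - e1, alpha, alpha + e1)
            for b in (beta - e2, beta, beta + e2))
print(bound, worst)
```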


Special inequalities. Throughout the fundamental domain of variability is the set of real numbers. 




5. Functions 

5.1. Basic concepts
Concept of a function
Representation of functions
Special types of function
Inverse of a function

5.2. Polynomial and rational functions
The concept of a rational function
Linear functions
Quadratic functions
Cubic functions
Power functions with positive exponents
Polynomial functions
Factorization of polynomials
[...]
The behaviour of polynomial functions at infinity
Power functions with negative exponents
General form of rational functions
Zeros and poles of rational functions
The behaviour of rational functions at infinity
Decomposition into partial fractions

5.3. Non-rational functions
Root functions
Exponential functions
Logarithmic functions
Trigonometric and circular functions
Hyperbolic functions
The inverse functions of the hyperbolic functions

5.4. Functions with more than one independent variable
General definition
Real functions with two independent variables
Real functions with n independent variables


5.1. Basic concepts 


Concept of a function 


In accordance with a definition, which EULER had already given in 1749, a function is often 
explained as a variable quantity that is dependent upon another variable quantity. For many purposes 
such a definition of the concept of a function suffices. But in the course of the further development 
of mathematics it turned out to be necessary and useful to give a more general and abstract content 
to the concept of a function. The essence of the concept is not the dependence of quantities, by 
which one usually understands numbers that can be compared in a ‘less than or greater than’ 
relationship, but the fact of the correspondence itself, on the basis of which certain objects are 
regarded as being assigned to certain other objects. The concept of a function is reduced to set- 
theoretical definitions. 


Correspondences. Every metal bar alters its length when heated. Suppose, for example, that a
copper bar has a length of $l_0 = 200$ units u of length at 0°C, say centimetres or inches, then its
length l at a temperature t°C is given by $l = 200(1 + 0.000016\,t)$. By this formula each value of t
between 0°C and 100°C is made to correspond to a certain length l between 200u and 200.32u.
Similarly, to each quantity of a merchandise there corresponds a certain sum of money as its selling
price, and to each page number in this book, a number stating how many letters occur on the page
concerned.
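The correspondence between temperature and length can be written as a small function. This is a sketch only; the names `bar_length`, `l0` and `alpha` are chosen here, with the values taken from the example.

```python
def bar_length(t, l0=200.0, alpha=0.000016):
    """Length of the copper bar (in units u) at temperature t in degrees C."""
    if not 0 <= t <= 100:
        raise ValueError("the formula is used here only for 0 <= t <= 100")
    return l0 * (1 + alpha * t)

for t in (0, 50, 100):
    print(t, bar_length(t))
```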

Correspondences exist not only between numbers, but more generally between elements a in a
set A and elements b in a set B; for example, each seat for a performance in a theatre corresponds
to an entrance ticket or to a particular visitor. Thus, the
correspondence is determined by a relation F defined on
$A \times B$ (see Chapter 14.) with domain of definition
$D(F) \subseteq A$ and range $R(F) \subseteq B$. If with respect to this
relation F one and only one element b of its range R(F)
corresponds to each element a of its domain D(F), then
the relation is said to be single-valued and one speaks
of a function or mapping from the set A into the set B
(Fig.). The element b of the range corresponding to
the original element a of the domain is called the image
of a. Consequently the function F is a set of ordered
pairs (a, b) whose first element belongs to the domain
of definition D(F) and whose second element belongs
to the range R(F). For a mapping of A into B one
has D(F) = A; that is, every element $a \in A$ occurs as
an original element, and for a mapping of A onto B, in addition, every element $b \in B$ occurs as
an image.

5.1-1 Graph of a function

The element y that is assigned to the element x by the function f is often denoted by f(x) and the
correspondence is then written $x \to y = f(x)$, or more briefly $y = f(x)$. The element x is called
the argument and the corresponding element y the function value f(x) at the point x. The domain
of definition (or just domain) of the function $x \to y = f(x)$ is denoted by X and the range by Y.
If f is a function from A into B, then clearly $X \subseteq A$ and $Y \subseteq B$.


A function f is a mapping from a set A into a set B, that is, a non-empty set of ordered pairs
$(x, y) \in f$ with $x \in X \subseteq A$, $y \in Y \subseteq B$ and with the property that to each $x \in X$ there corresponds
exactly one $y \in Y$.
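The definition can be tested directly on a finite set of ordered pairs. The following sketch checks single-valuedness and, for given sets A and B, whether the pairs form a mapping of A onto B; the function names and the sample pairs are illustrative assumptions.

```python
def is_function(pairs):
    """True if the non-empty set of pairs assigns exactly one y to each x."""
    if not pairs:
        return False
    image = {}
    for x, y in pairs:
        if x in image and image[x] != y:
            return False                      # x would have two different images
        image[x] = y
    return True

def is_mapping_onto(pairs, A, B):
    """True if pairs is a function with domain A whose range is all of B."""
    domain = {x for x, _ in pairs}
    rng = {y for _, y in pairs}
    return is_function(pairs) and domain == set(A) and rng == set(B)

f = {(1, "a"), (2, "a"), (3, "b")}            # two arguments may share an image
g = {(1, "a"), (1, "b")}                      # not single-valued
print(is_function(f), is_function(g))
print(is_mapping_onto(f, {1, 2, 3}, {"a", "b"}))
```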


Representation of functions 


To describe a function one must give its domain of definition and its range and the rule for the 
correspondence. 


Graph. In the graph of a function the domain and the range are represented
diagrammatically and the correspondence is indicated
by arrows (Fig.). Only one directed line goes
out from each element of the domain, but one
or more of these lines may lead to any one
element of the range.


5.1-2 Table of values of a function


Table of values. The rule for the correspondence can also be set down in a table of values (Fig.) 
rather than by means of a graph. The elements of the domain are entered in the top line of the 
table and under each one is the corresponding element of the range. A table of values can give only 
finitely many ordered pairs; it is not sufficient for the complete description of an arbitrary func- 
tion F. 


Explanation in words. If the domain and the range of a function are not finite or are so extensive 
that it is no longer possible to represent the graph or the table of values on a sheet of paper, then it 
is sufficient to give an exact description of the domain and the range, together with a rule by which 
for every element of the domain the corresponding element of the range can be found. A function 
can be defined entirely without the use of mathematical symbols, by means of a sentence in everyday 
language; for example, a function is defined if to every first division match in the football league 
there corresponds the quotient of the number of entrance tickets sold and the number of inhabitants 
of the place where the match is played. This function can give a certain indication of the interest 
shown by the public in individual games. Many examples can be found of rules of correspondences 
that are formulated entirely or partly in words. 


Example 1: To each real number x there corresponds either the value 0 or the value 1, according
as x is irrational or rational. For example, $\sqrt{2} \to 0$; $3/4 \to 1$.

Example 2: $g(x) = [x]$, where x denotes a real number and [x] denotes the greatest integer
that is less than, or equal to, x.


Diagram. A diagram likewise represents a function if one chooses a set of numbers of the horizontal 
axis as domain of definition and a set of numbers of the vertical axis as range, and assigns to the 
argument x of the domain precisely that value of y for which the point with the coordinates x, y 
is a point of the diagram. However, not every arbitrarily drawn curve in a coordinate system can be 
regarded as the representation of a function. The correspondence given by means of the curve must 
be single-valued. This is the case if the curve of the diagram is cut by each line parallel to the vertical 
axis in at most one point. 


Formula. The most frequently used method of representing a function in mathematics is the
formula. In this the elements of the domain and range are now only numbers, or at least mathematical
objects for which suitable rules of calculation can be given; for example, (1) $y = 7x + 2$;
(2) $y = \sqrt{x - 9}$; (3) $y = \sin x$. If no particular information is given about the domain of definition,
one usually regards those numbers as belonging to it to which a definite value can be ascribed by
means of the formula. In the cases (1) and (3) these are all real numbers, and in case (2) all real
numbers greater than or equal to 9. The range is then given by: (1) $-\infty < y < +\infty$;
(2) $0 \leq y < +\infty$; (3) $-1 \leq y \leq +1$.

Restriction of the domain of definition. The domain of definition can, however, be arbitrarily 
restricted, for example, (1)* y = 7x + 2 (for −3 ≤ x ≤ 5) or (1)** y = 7x + 2 (for −8 ≤ x ≤ 0),
and so on. The range is then given by (1)* −19 ≤ y ≤ 37, and (1)** −54 ≤ y ≤ 2. Here it is
essential that, according to the definition of the concept of a function, (1), (1)* and (1)** represent
entirely different functions. Because two sets are equal precisely if they have the same elements,
two functions f₁ and f₂ are likewise equal precisely if each pair of elements (x, y) that belongs to
f₁, (x, y) ∈ f₁, also belongs to f₂, (x, y) ∈ f₂, and vice versa. This is not the case for the functions
(1), (1)* and (1)**.
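
The view of a function as a set of pairs can be illustrated by a small sketch. It is an assumption of this sketch that the two restricted functions are sampled only at integer arguments; the intervals in the text contain all real numbers in between.

```python
def graph(rule, domain):
    """Return the function as a set of pairs (x, y) with x taken from the domain."""
    return {(x, rule(x)) for x in domain}

rule = lambda x: 7 * x + 2

# Integer sample points only (an assumption made for illustration).
f_star = graph(rule, range(-3, 6))          # x = -3, ..., 5
f_double_star = graph(rule, range(-8, 1))   # x = -8, ..., 0

# Same rule, different domains: the sets of pairs differ, so the functions differ.
print(f_star == f_double_star)
print(min(y for _, y in f_star), max(y for _, y in f_star))
print(min(y for _, y in f_double_star), max(y for _, y in f_double_star))
```

The smallest and greatest values printed are those taken at the end-points of the two intervals.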


Example 1: If P is the sign for the price, p for the price of 100 grams of a certain merchandise
and m the symbol for its mass in grams, then P = p · m/100 is the connection between the mass
and price of the material. Substituting 0.72 for p, and 100, 200, 300, ... for m in the formula, one
obtains the values 0.72, 1.44, 2.16, ... for P.

Example 2: In the formula l = l₀(1 + 0.000 016t), for the length of a copper bar when heated,
l and t are symbols with another meaning; l stands for numbers in the domain of lengths, t for
numbers in the domain of temperatures. The formula is valid if t assumes values between 0°C and
100°C.

Example 3: The calculation of the area of a square is made according to the equation A = a².
Here a is a symbol for the number of units of length of the side and A for the number of units
of area.


Abstracting from the special content of individual examples, one arrives at the following state 
of affairs: 

1. Variables are introduced for the elements of the domain of definition and of the range. In 
the examples above these are the symbols m, t, a and P, l, A. In mathematics one often uses the
symbols x or y as variables in functions, and the symbols f, g, φ etc. to denote functions.

2. The rule for the correspondence is defined with the help of the variables by means of an equation. 
The element (y) of the range corresponding to an element (x) of the domain is obtained by first 
substituting for the variable x in the equation and then calculating y. For example, if the function 
is defined by the equation y = −2x² + 4x − √x, with domain of definition 0 ≤ x < +∞, then
the value corresponding to x = 9 is obtained by substitution: y = −2·9² + 4·9 − √9 = −129.
In this way the number —129 corresponds to the number 9 according to the given function. The 
value corresponding to every number of the domain of definition can be found in the same way. 
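
The substitution can be repeated mechanically for any argument. A minimal sketch, in which the function name f and the error message are illustrative choices:

```python
import math

def f(x):
    """y = -2x^2 + 4x - sqrt(x), defined for 0 <= x < +infinity."""
    if x < 0:
        raise ValueError("x lies outside the domain of definition")
    return -2 * x**2 + 4 * x - math.sqrt(x)

print(f(9))   # -129.0, as found by substitution in the text
print(f(4))   # -18.0
```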

The symbol for the elements of the domain of definition is called the independent variable and 
that for the elements of the range of a function is called the dependent variable. An equation by 
which the rule for the correspondence defined by a function is given is called the equation of the 
function. 


Graphical representation. From the equation of the function one often arrives by means of a table 
of values at an intuitive representation of the function concerned. With the help of a plane co- 
ordinate system (see Chapter 13.) a point P of the plane is constructed to correspond to each 
number pair (x, y) and the totality of points P is called the graph of the function. According to the 
nature of the domain of definition and of the equation of the function, one obtains a sequence of 
isolated points, individual portions of curves or a connected function curve. 


Example: If x is the independent variable in the domain of definition −1 ≤ x ≤ +2, then
the function with the equation y = x/2 has the range −1/2 ≤ y ≤ +1. For individual values
of x the accompanying table of values is obtained.

[Table of values of y = x/2 for selected x between −1 and +2]

Individual points of the function curve in the domain −1 ≤ x ≤ 2 can first be drawn. If one
calculates the values of the function for further arguments, one obtains an ever more dense sequence
of points, which all lie on the same straight line (Fig.).

It is customary to display the values of the independent variable on the horizontal axis of a
rectangular Cartesian coordinate system and those of the dependent variable on the vertical axis.

5.1-3 Graph of the function y = x/2

Explicit form. The form y = A(x) of the equation of a function, in which A(x) is an arbitrary 
expression that contains, besides the variable x, only numbers or elements of the basic number 
domain, is called an explicit form. 




Implicit form. In contrast, an implicit form is characterized by the fact that both variables occur
on at least one side of the equation, for example (1) 4x − 2y = 6; (2) xy = 1; (3) y = sin x · sin y + x²;
(4) x² + y² = 16; (5) x² + xy + y² = √(xy). If the equation of a function is presented in explicit
form, then as a rule one regards the variable that is isolated on one side of the equation as dependent
and the other as independent; it is immaterial whether they are denoted by x, y; u, v; s, t or in any
other way. With an implicit form it is not always so obvious. When x and y are used, it is usual
to regard y as the dependent variable, but it is often necessary to explain one's convention, especially
if other variables are used. It is, however, also possible to regard both variables in an implicit
equation as being of equal standing. It is important to note that an equation given in implicit form
cannot always be rearranged in an explicit form. In examples (1) and (2) this is easily done; one
obtains (1) y = 2x − 3 and (2) y = 1/x. Examples (3) and (5), however, defy all attempts to do this.
In both examples neither y nor x can be isolated (see Chapter 4.). Another fact is shown clearly
by the example (4). It is well known that x² + y² = 16 is the equation of the circle of radius 4
about the origin of the coordinate system. In this case for each value of x with −4 < x < 4 there
are two values of y that satisfy the equation. Regarding y as the dependent variable, a correspondence
is defined that is not single-valued. For this reason (4) is not the equation of a function. The
explicit form y = +√(16 − x²), on the other hand, does represent a function. But its image consists of
only the upper semicircle. The equation of the function belonging to the lower semicircle is
y = −√(16 − x²). Sometimes both functions are combined in the form y = ±√(16 − x²). It
would be wrong, however, to regard this way of writing it as the equation of a function that is
many-valued; functions are single-valued correspondences, by definition.
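
The splitting of the circle equation (4) into two single-valued branches can be sketched as follows; the names upper and lower are introduced here only for illustration.

```python
import math

def upper(x):
    """Upper semicircle: y = +sqrt(16 - x^2), defined for -4 <= x <= 4."""
    return math.sqrt(16 - x * x)

def lower(x):
    """Lower semicircle: y = -sqrt(16 - x^2), defined for -4 <= x <= 4."""
    return -math.sqrt(16 - x * x)

x = 2.0
for branch in (upper, lower):
    y = branch(x)
    # Both values of y satisfy the implicit equation x^2 + y^2 = 16.
    print(branch.__name__, y, math.isclose(x * x + y * y, 16))
```

Each branch on its own is a function; only the two together describe the whole circle.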


Parametric representation. This is concerned in the first instance with two explicit function
equations, each of which determines a function. The domain of definition in both cases is the same.
Thus, in general form one has t → x = f₁(t) and t → y = f₂(t). If one now assigns to each x₀ = f₁(t₀)
the value y₀ = f₂(t₀), one obtains a mapping of the range of f₁ onto the range of f₂, which need not,
of course, be single-valued.


Example 1: Let x = 2t and y = t/2 with −∞ < t < +∞. Then the table of values for both
functions has three lines: t, x = 2t and y = t/2.

The first and second lines refer to the function x = 2t and the first and third lines to the function
y = t/2. The values of x and y belonging to the same values of t determine a new correspondence,
which is displayed in the second and third lines of the table of values. This new correspondence
is clearly single-valued and is described by the new function equation y = x/4, as one can
see at once from the table of values. From x = 2t it follows that t = x/2. Substituting the expression
x/2 for t in y = t/2, one obtains y = x/4; the parameter t has been eliminated.
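
A table of this kind is easily produced by calculation. In the following sketch the parameter values t = −2, ..., 2 are an assumption; they are not taken from the printed table.

```python
from fractions import Fraction

ts = [Fraction(t) for t in range(-2, 3)]   # sample parameter values (assumed)
rows = [(t, 2 * t, t / 2) for t in ts]     # lines t, x = 2t, y = t/2

for t, x, y in rows:
    print(f"t = {t!s:>3}   x = {x!s:>3}   y = {y!s:>4}")

# Eliminating the parameter: every pair (x, y) of the table satisfies y = x/4.
print(all(y == x / 4 for _, x, y in rows))
```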

Example 2: Let x = t² and y = t/2 with domain of definition −∞ < t < +∞. A table of
values can be set up as in Example 1.

However, in this case the correspondence x → y is no longer single-valued: to each value of x > 0
there correspond two values of y. It can be made single-valued by restricting the original domain
of definition, say to 0 ≤ t < +∞. The correspondence x → y is then again a function with the
equation y = √x / 2.

Example 3: Let x = cos t and y = 2t (domain of definition −∞ < t < +∞). The function
x = cos t is known to be periodic. When arbitrary values are chosen for t, the same values for
x between −1 and +1 (−1 ≤ x ≤ +1) are repeated over and over again. On the other hand,
for y = 2t the range is given by −∞ < y < +∞. If one now considers the correspondence
x → y, it is clear that to one value of x there belong infinitely many values of y. One special value
suffices to make this clear. One obtains x = 1 for t = 0, t = ±2π, t = ±4π, etc. Thus, the values
y = 0, y = ±4π, y = ±8π etc. belong to x = 1. A single-valued correspondence is again achieved
only by restricting the original domain of definition, say to 0 ≤ t ≤ π. The function then defined
has the equation y = 2 arccos x with domain of definition −1 ≤ x ≤ 1 and range 0 ≤ y ≤ 2π.


If a function x → y = f(x) is represented by two separate functions of the form x = f₁(t) and
y = f₂(t), the variable t is called a parameter. By means of such a parametric representation a given
implicit relation between x and y can often be represented by two explicit functions; for example,
x² + y² = 1 by x = cos t, y = sin t with 0 ≤ t < 2π. To achieve uniqueness, the domain of
definition for t must be suitably restricted.


Composite functions. If the element a corresponds under the mapping G to the element b, and under a
further mapping F the element b corresponds to the element c, then by successive application of the
two mappings F and G, one obtains a mapping under which the element a corresponds to the element c.
The mapping defined in this way is called the product (or compositum) of the two mappings F and G;
thus, (a, c) ∈ F · G if and only if there exists an element b such that (a, b) ∈ G and (b, c) ∈ F.
Clearly the element b must belong both to the domain of definition X_F of F and to the range Y_G
of G (Fig.). From this it follows that F · G can be formed only if X_F ∩ Y_G ≠ ∅. Furthermore, in
carrying out the successive mappings the order is important, because, in general, F · G ≠ G · F.
If X_F, X_G, X_(F·G) denote the domains of definition and Y_F, Y_G, Y_(F·G) the ranges of F, G, F · G,
respectively, then F · G can be formed precisely when X_F ∩ Y_G ≠ ∅; X_(F·G) ⊆ X_G; Y_(F·G) ⊆ Y_F.
Stated more precisely, X_(F·G) contains just those elements of X_G whose function values
with respect to G lie in X_F ∩ Y_G, and Y_(F·G) contains just those elements of Y_F whose arguments
with respect to F lie in X_F ∩ Y_G. The product f · g of two functions f and g with the function
equations y = f(x) and y = g(x) is often written as y = f[g(x)] and called the compositum of the two
functions g and f, in this order. In this connection g is often called the inner function and f the
outer function of the composite function f · g.

5.1-4 The composite F · G of the mappings G and F; the domain of definition X_(F·G) (yellow) is the
complete original with respect to G of the set X_F ∩ Y_G (green); the range Y_(F·G) (grey) is the
image of X_F ∩ Y_G with respect to F

Example: The domains and ranges of the functions g(x) = x² − 2 and f(x) = √x are
X_g = (−∞, +∞), Y_g = [−2, ∞) and X_f = [0, ∞), Y_f = [0, ∞). The composite function
f · g has the equation f[g(x)] = √(x² − 2) and its domain of definition X_(f·g) consists precisely
of those elements of X_g whose function values with respect to g lie in X_f ∩ Y_g = [0, ∞). But these
are all x with the property x² ≥ 2, that is, the set of all real numbers with the exception of the
open interval from −√2 to +√2. The composite function g · f has the equation g[f(x)] = (√x)² − 2
= x − 2 with domain of definition X_(g·f) = [0, ∞).
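
The dependence of the composite on the order of the two functions, and the restriction of its domain, can be seen in a small sketch. The helper compose and the way the domain of f is enforced by raising an error are illustrative assumptions.

```python
import math

def g(x):
    """Inner function g(x) = x^2 - 2, defined for all real x."""
    return x * x - 2

def f(x):
    """Outer function f(x) = sqrt(x), defined only for x >= 0."""
    if x < 0:
        raise ValueError("argument outside the domain of f")
    return math.sqrt(x)

def compose(outer, inner):
    """Return the composite x -> outer(inner(x))."""
    return lambda x: outer(inner(x))

f_g = compose(f, g)   # x -> sqrt(x^2 - 2), needs x^2 >= 2
g_f = compose(g, f)   # x -> (sqrt x)^2 - 2 = x - 2, needs x >= 0

print(f_g(3))         # sqrt(7)
print(g_f(9))         # 7.0
try:
    f_g(1)            # 1^2 - 2 < 0, so 1 does not belong to the domain of f . g
except ValueError as err:
    print("f_g(1):", err)
```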


Special types of function 


In what follows the only functions to be considered are those whose domain of definition and 
range are contained in the set of real numbers. They are usually called real functions. According 
to certain general properties special real functions are collected together in groups, for example, 
monotonic, bounded, even, odd, or periodic functions. 


Monotonic functions. A function x → y = f(x) is said to be monotonic increasing in an interval
a ≤ x ≤ b if for the greater x₂ of two arbitrary values x₁ and x₂ in the interval the function value
f(x₂) also is always the greater; if x₁ < x₂, then f(x₁) < f(x₂).

Example 1: The function y = 2^x with domain of definition −∞ < x < +∞ is a monotonic
increasing function in the whole of its domain.

Example 2: The function y = sin x with domain of definition −∞ < x < +∞ is monotonic
increasing only in the intervals −5π/2 ≤ x ≤ −3π/2; −π/2 ≤ x ≤ π/2; 3π/2 ≤ x ≤ 5π/2; and so
on, but considered as a whole it does not represent a monotonic increasing function.

A function is said to be monotonic decreasing in an interval a ≤ x ≤ b if f(x₁) > f(x₂) whenever
a ≤ x₁ < x₂ ≤ b.

Example 1: The function y = 1/x decreases monotonically for −∞ < x < 0 and for
0 < x < +∞ and is not defined for the value x = 0.

Example 2: The function y = x² is monotonic decreasing for −∞ < x ≤ 0. For x ≥ 0 the
function is monotonic increasing.

Example 3: The function y = −3x + 5 is monotonic decreasing in the whole of its domain
of definition.

Sometimes a function is also called monotonic in an interval if x₁ < x₂ implies that always
f(x₁) ≤ f(x₂) [or that always f(x₁) ≥ f(x₂)]. More accurately such functions are called non-decreasing
(or non-increasing), and in contrast the functions already considered are strictly monotonic
(increasing or decreasing).
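
The difference between strictly monotonic and non-decreasing functions can be tested on sample points. Such a test is only a sketch: a finite sample can refute monotonicity but cannot prove it for a whole interval. The function names are chosen for illustration.

```python
import math

def is_strictly_increasing(f, xs):
    """Check f(x1) < f(x2) for consecutive sample points x1 < x2."""
    ys = [f(x) for x in sorted(xs)]
    return all(a < b for a, b in zip(ys, ys[1:]))

def is_non_decreasing(f, xs):
    """Check f(x1) <= f(x2) for consecutive sample points x1 < x2."""
    ys = [f(x) for x in sorted(xs)]
    return all(a <= b for a, b in zip(ys, ys[1:]))

whole = range(-5, 6)
halves = [k / 2 for k in range(-4, 5)]

print(is_strictly_increasing(lambda x: 2.0 ** x, whole))     # y = 2^x
print(is_strictly_increasing(lambda x: x * x, whole))        # y = x^2 on both sides of 0
print(is_strictly_increasing(lambda x: x * x, range(0, 6)))  # y = x^2 for x >= 0
print(is_non_decreasing(math.floor, halves), is_strictly_increasing(math.floor, halves))
```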


Bounded functions. A function x → y = f(x) is said to be bounded in an (open or closed) interval
if there exists a number B > 0 with the property that |f(x)| ≤ B for every value of x in the interval.
In particular, if |f(x)| ≤ B for every value of x in the domain of definition, then x → y = f(x) is
said to be a bounded function.


Example 1: The function y = x² is bounded in every closed interval. For example, in the interval
0 ≤ x ≤ a, |f(x)| ≤ B = a². However, it is not a bounded function, because for the domain
of definition −∞ < x < +∞ no number B can be found that is not exceeded by any value of
the function.

Example 2: The function y = x⁻² is bounded for every interval of the form a ≤ x < +∞,
with a > 0. It is not bounded for 0 < x ≤ b.

Example 3: The function y = √(100 − x²) is bounded in the whole domain of definition
−10 ≤ x ≤ +10, because √(100 − x²) ≤ 10 always holds (Fig.).


Example 4: The function y = x²/(x² + 1) is bounded in the whole domain of definition, as can
be seen by writing it in the form y = 1 − 1/(x² + 1). Here |1 − 1/(x² + 1)| < 1 for every value
of x.


The graphical representation of a bounded function is characterized by the fact that two lines
parallel to the x-axis can always be found so that the graph of the function lies entirely between
them.


5.1-5 Graph of the function y = √(100 − x²)

5.1-6 Graphical representation of the even function y = |x| and of an odd function

Even and odd functions. A function x → y = f(x) is said to be even if f(−x) = f(x) for every
value of x in the domain of definition. A function x → y = f(x) is said to be odd if f(−x) =
−f(x) for every value of x in the domain of definition.

even functions: y = (x² − 1)/(x² + 1); y = a · x^(2n); y = cos x
odd functions: y = x; y = −1/x; y = x/2; y = a · x^(2n+1); y = sin x
(a ≠ 0; n = 0, ±1, ±2, ...)

The graph of an even function is symmetric about the y-axis. The graph of an odd function is
symmetric about the origin (0, 0). It goes into itself under a rotation through 180° about this point
(Fig.).
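
The two defining equations f(−x) = f(x) and f(−x) = −f(x) can be checked numerically on sample arguments. This is a sketch only; the sample points and the tolerance are assumptions, and a finite sample cannot replace a proof.

```python
import math

def is_even(f, xs, tol=1e-12):
    """Check f(-x) = f(x) on the sample points xs."""
    return all(math.isclose(f(-x), f(x), abs_tol=tol) for x in xs)

def is_odd(f, xs, tol=1e-12):
    """Check f(-x) = -f(x) on the sample points xs."""
    return all(math.isclose(f(-x), -f(x), abs_tol=tol) for x in xs)

xs = [0.5, 1.0, 2.0, 3.5]

print(is_even(math.cos, xs), is_odd(math.cos, xs))
print(is_even(math.sin, xs), is_odd(math.sin, xs))
print(is_even(lambda x: (x * x - 1) / (x * x + 1), xs))
print(is_even(lambda x: x + 1, xs), is_odd(lambda x: x + 1, xs))   # neither even nor odd
```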


Periodic functions. A non-constant function x → y = f(x) is said to be periodic if there exists a
number a > 0 such that f(x) = f(x + a) for every possible value of x. It then also follows that
f(x) = f(x + 2a) and f(x) = f(x − a), in general, f(x) = f(x + na) for every integer n, as long as
the values (x + na) belong to the domain of definition of the function. Each such number a is called
a period, and the smallest positive number k for which f(x) = f(x + k) is called the primitive
period of the periodic function. The graphical representation of a periodic function is a graph that
goes into itself when translated in the direction of the x-axis through a distance equal to an
integral multiple of a period (Fig.). The best-known periodic functions are the trigonometric
functions. From these further periodic functions can be constructed; for example, the functions
y = b sin (ax) with b ≠ 0 and a ≠ 0 have the period 2π/|a|. Combined functions such as
y = b₁ sin (a₁x) + b₂ sin (a₂x) are periodic, provided that the ratio of a₁ to a₂ is rational, that
is, if a₁ : a₂ = m/n, where m and n are relatively prime integers. Taking a₁ and a₂ positive, the
period of the first function is 2π/a₁ and that of the second is 2π/a₂, and their ratio is
(2π/a₁) : (2π/a₂) = a₂/a₁ = n/m. Thus, m periods of the first function correspond exactly to
n periods of the second function. Consequently the sum function has the period
m · (2π/a₁) = n · (2π/a₂).

5.1-7 Graph of a periodic function (k = 2)


Example: The periods of the individual functions of the sum function
y = sin (2x) + 2 sin (3x/2) are π and 4π/3 and their ratio is π/(4π/3) = 3/4. The given function
therefore has the period 4π (Fig.).
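
The period of the sum function in this example can be checked numerically. In the following sketch the sample points and the tolerance are assumptions; the test confirms the period 4π on the sample and shows that neither of the two individual periods is a period of the sum.

```python
import math

def y(x):
    """The sum function y = sin(2x) + 2 sin(3x/2)."""
    return math.sin(2 * x) + 2 * math.sin(3 * x / 2)

def has_period(f, period, xs, tol=1e-9):
    """Check f(x + period) = f(x) on the sample points xs."""
    return all(math.isclose(f(x + period), f(x), abs_tol=tol) for x in xs)

xs = [k / 10 for k in range(-50, 51)]

print(has_period(y, 4 * math.pi, xs))       # common period of both summands
print(has_period(y, math.pi, xs))           # period of sin(2x) alone
print(has_period(y, 4 * math.pi / 3, xs))   # period of 2 sin(3x/2) alone
```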


Inverse of a function 


Invertible functions. The single-valued correspondence determined by a function between the elements
of the domain and the elements of the range, conversely also assigns to each element of the range
one or more elements of the domain. Functions for which each element of the range occurs only once
as the image of an element of the domain have a special significance, because the inverse of the
correspondence is also single-valued. To each element r of the range there belongs only one
element d of the domain. In this case the range of the given function f can be regarded as the
domain of a new function φ. If the given function f determines the correspondence d → r = f(d),
then for the new function one has r → d = φ(r). In other words, (r, d) ∈ φ if and only if (d, r) ∈ f.
Functions for which in this sense the correspondence between the domain X and the range Y can
be inverted are called invertible functions (Fig.). These are one-to-one correspondences of X onto
Y. Monotonic functions belong to the class of invertible functions: a strictly monotonic function is
always invertible. On the other hand, an invertible function need not necessarily be monotonic; for
example, the domain and range may not be ordered sets, so that the concept of monotonicity
is not defined. Again, a non-monotonic function can also be invertible, for instance if the domain
and the range consist of only finitely many elements. An example of this is the function given
by the following table of values:

x    1   2   3   4   5   6   7   8   9   10
y    0   2   4   6   8   1   3   5   7   9

5.1-8 Graphs of the functions y = sin (2x), y = 2 sin (3x/2) and y = sin (2x) + 2 sin (3x/2)

5.1-9 Graph of a non-invertible function (left) and an invertible function (right)


Inverse function. If one regards the range Y of an invertible function f as domain of definition of
a new function φ, whose range is the domain X of f, and if one reverses the single-valued
correspondence between the sets X and Y given by the function f, then one obtains the inverse
function φ of the given function f. The inverse function is itself invertible. By considering
d → r = f(d) and r → d = φ(r) it is easy to see that the inverse function of the inverse function
of a given function f is f itself. Thus, one is justified in calling f and φ mutually inverse functions.


Example 1:
Function f
domain   1   2   3   4   5
range    a   b   c   d   e


If y = f(x) is the equation of an invertible function, then the same equation naturally also
describes the inverse function, only y must then be the independent and x the dependent variable.
It is agreed, however, that in a function equation of this form x shall always denote the independent
and y the dependent variable and, whenever possible, the explicit form of the function equation
shall be given. One therefore rearranges the equation as follows:




1. In the given function equation y = f(x), y is regarded as the independent and x as the dependent 
variable. 

2. Denoting the independent variable by x and the dependent one by y, x = f(y) is an implicit 
form of the equation of the inverse function. 

3. If this equation can be solved for y, one obtains y = φ(x) as its explicit form.


Example 2: From the function equation y = x/2 of a given invertible function one obtains 
x = y/2 after interchanging the variables. Solving for y gives y = 2x. 


The function y = x/2 with the domain of definition −1 ≤ x ≤ 2 and the range −1/2 ≤ y ≤ 1 has the
inverse function y = 2x with the domain of definition −1/2 ≤ x ≤ 1 and the range −1 ≤ y ≤ 2.

Example 3: From the given invertible function y = 3x + sin x, interchange of the variables 
gives the function equation x = 3y + sin y, which cannot be solved explicitly for y. Thus, the 
inverse function must be given in the implicit form 3y + sin y — x = 0. 
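
Although the inverse function of Example 3 cannot be written in explicit form, its values can be calculated numerically. The following sketch uses bisection, which is one possible method and is not taken from the text; the name inverse_f and the number of steps are assumptions.

```python
import math

def f(x):
    """The given invertible function y = 3x + sin x (strictly increasing)."""
    return 3 * x + math.sin(x)

def inverse_f(target, steps=60):
    """Solve 3y + sin y = target for y by repeated halving of an interval."""
    # Since |sin y| <= 1, the solution lies between (target - 1)/3 and (target + 1)/3.
    lo, hi = (target - 1) / 3, (target + 1) / 3
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

y = inverse_f(2.0)
print(y, f(y))   # f(y) reproduces the argument 2.0 up to rounding
```

Because the function is strictly increasing, each halving step keeps the solution inside the interval.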


Graph of the inverse function. Because of the uniqueness of the mapping represented by a function,
every line parallel to the y-axis cuts the graph in at most one point. If the function f(x) has an
inverse function φ(x) and is therefore one-to-one, then each line parallel to the x-axis also cuts the
graph of the function in at most one point. This curve represents both the correspondence x → y and
the correspondence y → x. Because of the interchange of the variables in the inverse function each
particular number pair (a, b) of the function f becomes a number pair (b, a) of the function φ.
The points corresponding to these number pairs (a, b) and (b, a) are mirror images of one another
in the angle bisector of the first and third quadrants of the Cartesian coordinate system. Consequently
the graph of the inverse function φ(x) is obtained by taking the mirror image in this angle bisector
of the graph of the given function f(x) (Fig.).


5.1-10 The function curve of the inverse function

5.1-11 Graph of y = arcsin x; principal value y = Arcsin x is drawn in black

Inverses of functions in particular intervals. In the discussion of monotonic functions it has
already been shown that non-monotonic functions may be monotonic in certain intervals of the
domain of definition. In these intervals they are also invertible.


Example 1: The function y = x² is monotonic and invertible in the interval 0 ≤ x < +∞. In this
interval its inverse function is y = √x. Naturally it is also monotonic and invertible in the
interval −∞ < x ≤ 0. Here the inverse function is y = −√x.


Example 2: The domain of definition of y = sin x can be split up into intervals in which the
given function is monotonic. The inverse function is denoted by arcsin x, but in each case the range
must be stated, because otherwise it is not clear in which interval of monotonicity the inverse
has been formed. For example, if y = sin x is inverted in the interval 3π/2 ≤ x ≤ 5π/2, then the
inverse function should be denoted by y = arcsin x (3π/2 ≤ y ≤ 5π/2). If the range is not specified,
then arcsin x is always understood to mean the principal value, which lies in the interval
[−π/2, +π/2] and is denoted by Arcsin x (Fig.).
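
The inverse of y = sin x in another interval of increase can be obtained from the principal value by adding a multiple of 2π. A minimal sketch; the function name arcsin_branch and the index k are introduced here for illustration, and only the intervals in which the sine increases are covered.

```python
import math

def arcsin_branch(x, k=0):
    """Inverse of sin on [-pi/2 + 2k*pi, +pi/2 + 2k*pi]; k = 0 gives Arcsin x."""
    return math.asin(x) + 2 * k * math.pi

y = arcsin_branch(0.5, k=1)    # value in the interval from 3*pi/2 to 5*pi/2
print(y, math.sin(y))          # the sine of y is 0.5 again, up to rounding
print(arcsin_branch(0.5))      # principal value Arcsin 0.5 = pi/6
```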

Example 3: Also for the other trigonometric functions intervals can be chosen in which they
are monotonic, so that in them circular functions are defined as their inverses. The function
y = cos x, for example, decreases monotonically in the interval 0 ≤ x ≤ +π from y = +1
to y = −1 and in doing so assumes all values of its range exactly once. Hence in this interval
an inverse function exists. It is denoted by y = arccos x. Its domain of definition is −1 ≤ x ≤ +1
and its range is π ≥ y ≥ 0. If the function y = cos x is inverted in another interval in which it
is monotonic, say in the interval π ≤ x ≤ 2π, then y = arccos x has the range π ≤ y ≤ 2π.
In order to specify which inverse function is intended, the range must be given in each case.
If it is not specified, then arccos x is to be understood as the principal value, which is characterized
by 0 ≤ arccos x ≤ π and is denoted by Arccos x.

A similar result holds for the function y = arctan x in the interval −π/2 < Arctan x < +π/2
and for y = arccot x in the interval 0 < Arccot x < +π (see Chapter 10.).


5.2. Polynomial and rational functions 


The concept of a rational function 


An expression of the form a_n x^n + a_(n−1) x^(n−1) + ... + a₁x + a₀, where n is a natural number,
the coefficients a_i are arbitrary real numbers, and a_n ≠ 0, is called a polynomial of degree n.

A rational function is a function of the form p/q, where p and q are polynomial functions and at
least one coefficient of q is not 0.

Examples of rational functions. 1. y = 8x − 3. 2. y = (4x² + 1)/... 3. y = √10 · x ... 4. y = 1/x².

Examples of non-rational functions. 1. y = √(x²). 2. y = cos² x.
3. y = x − x³/3! + x⁵/5! − x⁷/7! + ... = Σ (−1)^n x^(2n+1)/(2n + 1)!, summed over n = 0, 1, 2, ...

For polynomial functions the domain R of all real numbers can be chosen as domain of definition. 
If no restriction is indicated on account of special conditions, R is always regarded as the domain 
of definition. The same holds for rational functions, except that those values for which the denomina- 
tor vanishes must be excluded. It should also be pointed out that rational functions are continuous 
and differentiable arbitrarily often in the whole of their domain of definition. 

In the following, first the polynomial functions and then the rational functions are considered. 
Before establishing general properties, certain special types of such functions that occur particularly 
often are examined first. 


Linear functions 


The functions y = mx. The tables of values of the functions y = x, y = x/2, and y = —4x/3 
give number pairs (x, y) from which one obtains points of the graphs of these functions in a Cartesian 
coordinate system (Fig.). 


Because the pair of values (0, 0) always occurs, the curve always passes through the origin of
coordinates. The curves are straight lines, because from y = mx the coordinates of arbitrarily
chosen points P₁, P₂, ..., P always satisfy y₁/x₁ = y₂/x₂ = ... = y/x = m, where for each function
m is a constant (Fig.). If P₁ₓ, P₂ₓ, ..., Pₓ are the projections of the points P₁, P₂, ..., P onto
the x-axis, then the triangles OP₁P₁ₓ, OP₂P₂ₓ, ..., OPPₓ are similar. Since the points P₁ₓ, P₂ₓ, ..., Pₓ
lie on a straight line, the points P₁, P₂, ..., P must also lie on a straight line. Because m is a
constant, the corresponding coordinates of different points are proportional, y₁/x₁ = y₂/x₂. The
magnitude of y is directly proportional to the magnitude of x; the constant m is the factor of
proportionality. If the pay L for a job is proportional to the working time t in hours, the connection
between the two is represented by the linear function L = mt. The constant of proportionality
represents the rate per hour.

5.2-1 Graphs of the functions ...

5.2-2 The graph of y = mx is a straight line

From the graph of the linear function y = mx and the table of values of the special functions
y = x, y = x/2 and y = −4x/3 it can be seen that the function is monotonic; for positive m it is
monotonic increasing and for negative m it is monotonic decreasing. With reference to roads and
railways the constant is called the gradient (Fig.). In mathematics the gradient is defined as the
ratio of the difference in height BC to the horizontal distance AB (Fig.). It is given as a ratio
or as a percentage. For example, 1 : 50, 3/150, 1/50, 2% and 0.02 all have the same meaning.

5.2-3 Sign of a steep hill

5.2-4 Gradient

5.2-5 The function y = mx + c


The functions y = mx + c. If for every value of x on the straight line y = mx a fixed quantity c
is added to or subtracted from the ordinate y, this signifies a translation of the line y = mx, which
one recognizes most easily as the intercept c on the y-axis (Fig.). Thus, the graph of the function
y = mx + c is a straight line with gradient m and intercept c on the y-axis (see Chapter 13.,
Cartesian normal form).

In drawing the line one need not actually carry out the translation. The graph of the line is obtained
by first marking off the intercept c on the y-axis, thinking of a parallel to the x-axis through its
end-point and constructing the gradient with respect to it (Fig.).

5.2-6 Graphs of further functions y = mx + c

Implicit representation of the linear function. The graphical representation of Ax + By + C = 0
in a Cartesian coordinate system is always a straight line (see Chapter 13.), provided that
A and B are not both equal to zero. Moreover, Ax + By + C = 0 can be expressed as a linear
function in explicit form only if B ≠ 0. The rearrangement in explicit form then gives
y = −(A/B)x − (C/B) or y = mx + c, with m = −(A/B) and c = −(C/B). For A = 0 and B ≠ 0 the
result is a constant function, whose graph is a line parallel to the x-axis. For A ≠ 0 and B = 0
the equation does not represent a function at all. The graphical representation of the equation
Ax + C = 0 is a line parallel to the y-axis.
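
The case distinction for Ax + By + C = 0 can be written down as a short sketch. The function name explicit_form and the convention of returning None when B = 0 are assumptions made for illustration.

```python
def explicit_form(A, B, C):
    """Return (m, c) with y = m*x + c for A*x + B*y + C = 0, or None if B == 0."""
    if A == 0 and B == 0:
        raise ValueError("A and B must not both be equal to zero")
    if B == 0:
        return None              # line parallel to the y-axis: not a function of x
    return (-A / B, -C / B)

print(explicit_form(4, -2, -6))  # 4x - 2y - 6 = 0, that is y = 2x - 3
print(explicit_form(0, 2, -4))   # constant function y = 2
print(explicit_form(3, 0, 1))    # None
```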


Quadratic functions 


The function y = x². The function equation y = x² leads to a
curve known as the standard parabola (Fig.).


Table of values for y = x²

x   −3   −2   −1    0    1    2    3
y    9    4    1    0    1    4    9


Intermediate values to those of the table of values are given by a table of squares, which is
nothing more than a skilfully arranged table of values of the function y = x².

5.2-7 Standard parabola as graph of the function y = x²


Properties. Because x² ≥ 0 for every value of x, the curve never falls below the x-axis; thus,
to the domain of definition −∞ < x < +∞ there corresponds the range 0 ≤ y < +∞. The
standard parabola is symmetric about the y-axis (axially symmetric). The zero point, which is
symmetric with itself, is called the vertex. The curvature of the standard parabola, in contrast to the
straight line, shows up in calculations by the fact that y changes by ever greater amounts as |x|
increases uniformly. In the table the difference sequences Δx and Δy are introduced, as well as the
sequence of differences for Δy, which is denoted by Δ²y. It shows that Δy increases for constant
Δx and only the second difference sequence Δ²y is constant.


Δx   ...     1     1     1     1     1     1     1     1    ...
x    ...  −2    −1     0    +1    +2    +3    +4    +5    +6  ...
y    ...  +4    +1     0    +1    +4    +9   +16   +25   +36  ...
Δy   ...    −3    −1    +1    +3    +5    +7    +9   +11    ...
Δ²y  ...        2     2     2     2     2     2     2       ...

For an intuitive understanding of curvature one can imagine that a motor car is travelling along 
the curve in the direction of increasing values of x. If the steering wheel must be turned to the left 
in order to remain on the curve, then the curve is said to have a positive curvature, and if to the 
right, then it has a negative curvature. Thus, the standard parabola has positive curvature throughout. 


The functions y = x² + px + q. By completing the square, that is, by introducing the square
of half the coefficient p of the linear term px, the given function can be expressed in the form
y = (x − a)² + b: y = x² + px + q = x² + px + (p/2)² − (p/2)² + q = (x + p/2)² + (q − p²/4).
Writing a = −p/2, b = (q − p²/4), one obtains, in fact, y = (x − a)² + b or (y − b) = (x − a)²,
that is, η = ξ², where y − b = η and x − a = ξ. This means that the graph of the function in the
ξ, η-coordinate system is again the standard parabola η = ξ². But the ξ, η-system is transformed
into the x, y-system by the linear transformation x − a = ξ, y − b = η, corresponding to a
translation (Fig.). In the x, y-system the vertex V of the standard parabola η = ξ² has the
coordinates (a, b); expressed in terms of the coefficients p and q of the given quadratic function
y = x² + px + q, the coordinates of the vertex are (−p/2, q − p²/4) (Fig.).

5.2-8 Graphs of the functions y = x² + b

5.2-9 Graphs of the functions y = (x − a)²


Example: Completing the square in the equation of the
function y = x² + 6x + 11 gives y = (x² + 6x + 9) − 9 + 11,
or y = (x + 3)² + 2. Thus, one can read off at once that the
graph is a translated standard parabola with vertex V(−3, 2).


The general quadratic function y = Ax² + Bx + C. In this
equation it is assumed that A ≠ 0, otherwise the function is not
quadratic at all. Thus, the factor A can be taken out:

y = A[x² + (B/A) x + (C/A)] = A · Y. The graph of the quadratic function

Y = x² + (B/A) x + (C/A) = x² + px + q,

Y = (x + p/2)² + (q − p²/4) = [x + B/(2A)]² + [C/A − B²/(4A²)],

where p = B/A, q = C/A, is known. The values of Y are given
as the sum of the ordinate values [x + B/(2A)]² of the standard
parabola, whose vertex has been translated by an amount
−p/2 = −B/(2A) in the direction of the +x-axis, and the quantity
b = q − p²/4 = [C/A − B²/(4A²)] of the translation in the direction
of the +y-axis. But the equation y = A · Y states that each
of these values of Y is to be multiplied by the number A. For
A > 1, all the ordinates of the standard parabola and also the
segment [C/A − B²/(4A²)] are stretched in the ratio A : 1, while
for A between 0 and 1 they are contracted in the same ratio
(Fig.). If A takes negative values, this stretching (|A| > 1) or
contraction (|A| < 1) is followed by a reflection in the x-axis.

5.2-10 Graph of y = (x − a)² + b obtained by translating the standard parabola


Example 1: The graph of the function y = −x² is the mirror
image in the x-axis of the standard parabola.

Example 2: The graph of the function y = x²/4 is the standard
parabola contracted in the ratio (1/4) : 1 = 1 : 4.

Example 3: The quadratic function y = 3x² − 4x − 1/6,
by taking out the factor 3 and completing the square, can be written in the form:


y = 3[x² − (4/3) x + (2/3)² − (4/9 + 1/18)] = 3[(x − 2/3)² − 1/2].


Hence the graph of the function is a standard parabola stretched in the ratio 3 : 1, whose vertex V
has the coordinates (2/3, −3/2).
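Completing the square for y = Ax² + Bx + C can be written as a short routine. The sketch below is illustrative only: the function name `vertex_form` is an assumption, and exact fractions are used so that the vertex coordinates come out as in the worked examples.

```python
from fractions import Fraction

def vertex_form(A, B, C):
    """Rewrite A*x^2 + B*x + C as A*(x - a)^2 + b and return (A, a, b)."""
    A, B, C = Fraction(A), Fraction(B), Fraction(C)
    if A == 0:
        raise ValueError("A must be non-zero, otherwise the function is not quadratic")
    a = -B / (2 * A)             # abscissa of the vertex
    b = C - B * B / (4 * A)      # ordinate of the vertex, A * (C/A - B^2/(4A^2))
    return A, a, b

# y = x^2 + 6x + 11 has the vertex (-3, 2)
print(vertex_form(1, 6, 11))
# y = 3x^2 - 4x - 1/6 has the vertex (2/3, -3/2)
print(vertex_form(3, -4, Fraction(-1, 6)))
```

The returned factor A tells whether the standard parabola is stretched (|A| > 1), contracted (|A| < 1) or reflected (A < 0).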


Cubic functions


The function y = x³. In a table of cubes one has available
an extensive and clear table of values of the function y = x³;
with its help one obtains the graph of the function, the cubical
parabola or parabola of degree three (Fig.).

Properties. The function increases monotonically in the
whole of its domain of definition −∞ < x < +∞; it is an
odd function, and its graph is therefore symmetrical about
the origin of coordinates. For |x| > 2/3 the cubical parabola
is steeper than the quadratic parabola; its third difference
sequence is constant:


5.2-11 Graphs of the functions y = x²/2 and y = 2x²


Δx   ...     1     1     1     1     1     1     1     1   ...
x    ...  −4    −3    −2    −1     0     1     2     3     4  ...
y    ... −64   −27    −8    −1     0     1     8    27    64  ...
Δy   ...    37    19     7     1     1     7    19    37   ...
Δ²y  ...      −18   −12    −6     0     6    12    18   ...
Δ³y  ...          6     6     6     6     6     6   ...


The curvature of the cubical parabola is negative for x < 0 
and positive for x > 0, and it changes its sign at the origin. 
Such points are called points of inflection. Thus, the cubical 
parabola has a point of inflection at the origin. 


5.2-12 Cubical parabola with the function equation y = x³

5.2-13 Graph of the function y = x³ − 3x² − x + 3


Other cubic functions. In order to investigate the graphs and
the properties of other cubic functions, one often considers them
in relation to the function y = x³ represented in the same coordinate
system, whose graph is therefore also called the comparison
cubic or the standard cubical parabola. The graph of the function
y = −x³, for example, is the mirror image in the x-axis of
the standard cubical parabola. To the function y = kx³ with the
stretching factor k > 0 there belongs a cubical parabola obtained
from the standard cubical parabola by stretching for k > 1
or contracting for k < 1. Finally, the graph of the function
y = (x − a)³ + b is obtained from the standard cubical parabola
by translation parallel to the axes of coordinates, with
the new centre of symmetry Z = (a, b).




The general cubic function y = Ax³ + Bx² + Cx + D always has three zeros of which, under
certain conditions between the coefficients, two can be conjugate complex. In the differential calculus
it is shown, in addition, that when it has three real zeros, the function has two extrema, one (local)
maximum and one (local) minimum. The example shows that the graph of such a function cannot
be obtained by simple transformations from the standard cubical parabola y = x³ (Fig.).


Example: y = x³ − 3x² − x + 3.
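A table of values for this cubic function can be computed with Horner's scheme. The sketch below is an illustration only; the function name `horner` and the sample points x = −2, ..., 4 are assumptions and are not taken from the original table.

```python
def horner(coeffs, x):
    """Evaluate a polynomial given by the coefficients [a_n, ..., a_1, a_0] at x."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

cubic = [1, -3, -1, 3]              # y = x^3 - 3x^2 - x + 3
for x in range(-2, 5):              # sample points chosen for illustration
    print(x, horner(cubic, x))
```

At these sample points the function vanishes for x = −1, x = 1 and x = 3, so this cubic has three real zeros.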


Power functions with positive exponents 


Concept of a power function. A function y = xⁿ, in which n is an integer, is called a power function.
If n is positive, the function is a polynomial, but if n is negative, n = −ν (ν > 0, an integer), then
the function can be expressed in the form y = 1/x^ν and is a rational function.

The polynomial functions y = xⁿ are even if their exponent n = 2m is even; they decrease
monotonically for −∞ < x < 0 and increase monotonically for 0 < x < +∞. For odd exponents
n = 2m + 1 the functions y = xⁿ are odd; they increase monotonically everywhere.


Even polynomial power functions y = x²ᵐ. The curves represented by these functions are
symmetrical with respect to the y-axis, and their curvature is everywhere positive (Fig.). Each of them
contains the origin (0, 0) and the points Q(−1, +1) and P(+1, +1). In the neighbourhood of the
vertex (0, 0) the tangents are the flatter the larger m is, whereas in a particular neighbourhood of
the points Q and P they are the steeper, the larger m is. To every point (x₁, y₁) on y = x^(2m₁) a point
(x₂, y₂) on y = x^(2m₂) (m₂ > m₁) can be determined by means of the differential calculus, so that the
tangents at the two points are parallel. These curves are called parabolas of order 2m (Fig.).


5.2-14 Graphs of the functions y = x²ᵐ for m = 0, 1, 2, ...; y = x⁰ is not defined for x = 0

5.2-15 Portions of the curves of the functions y = x²ᵐ as parabolas of order 2m


Odd polynomial power functions y = x²ᵐ⁺¹. The curves are centrally
symmetric about the origin. Except for the angle bisector of the
quadrants I and III (y = x) they have negative curvature for all negative
values of their domain of definition (−∞ < x < 0) and positive
curvature for positive values (0 < x < +∞), and so they have a point
of inflection at the origin. Each of these parabolas of order 2m + 1
contains the points (+1, +1) and (−1, −1), and in the neighbourhood
of these points their tangents are the steeper, the greater m is;
but in the neighbourhood of their common point of inflection (0, 0)
the tangents are the flatter, the greater m is (Fig.).


5.2-16 Graphs of the functions y = x²ᵐ⁺¹ for m = 0, 1, 2




Polynomial functions 


The expression aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₁x + a₀, where aₙ ≠ 0, has been defined above to be
a polynomial of degree n. A polynomial is a special kind of rational function.


Example:


y = 2(x² − 1)² + (x + 2)(x³ − 2) − 2x + x² − 1
  = 2x⁴ − 4x² + 2 + x⁴ − 2x + 2x³ − 4 − 2x + x² − 1, or y = 3x⁴ + 2x³ − 3x² − 4x − 3


is a polynomial of degree 4 with the coefficients a₄ = 3, a₃ = 2, a₂ = −3, a₁ = −4, a₀ = −3.
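The expansion of y = 2(x² − 1)² + (x + 2)(x³ − 2) − 2x + x² − 1 can be checked by working with coefficient lists. The sketch below is illustrative; the helper names `poly_add` and `poly_mul` and the convention of storing coefficients lowest degree first are assumptions of this example.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists [a_0, a_1, ..., a_n]."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists [a_0, a_1, ..., a_n]."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

square = poly_mul([-1, 0, 1], [-1, 0, 1])      # (x^2 - 1)^2
term1 = [2 * c for c in square]                # 2(x^2 - 1)^2
term2 = poly_mul([2, 1], [-2, 0, 0, 1])        # (x + 2)(x^3 - 2)
term3 = [-1, -2, 1]                            # -2x + x^2 - 1
y = poly_add(poly_add(term1, term2), term3)

print(y)   # [-3, -4, -3, 2, 3], i.e. a_0, a_1, a_2, a_3, a_4
```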


Uniqueness of polynomial representation. The assumption that two different polynomials can 
represent the same rational function leads to a contradiction. Because the two polynomials 


y₁ = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₁x + a₀ and y₂ = bₘxᵐ + bₘ₋₁xᵐ⁻¹ + ··· + b₁x + b₀


are assumed to be different, it must be that either n ≠ m, or if n = m, then aᵢ ≠ bᵢ for at least one
pair of coefficients. Their difference

(aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₁x + a₀) − (bₘxᵐ + bₘ₋₁xᵐ⁻¹ + ··· + b₁x + b₀)

can be arranged according to powers of x; it is a polynomial with at least one non-zero coefficient,
whose degree does not exceed the greater of the two numbers m and n. It represents a polynomial
function, which has only finitely many zeros, not more than its degree. But since y₁ and y₂ are
assumed to be the same function, their difference must be identically zero, that is, zero for all values
of x. This contradiction leads to the conclusion that both polynomials have the same degree, m = n,
and that their corresponding coefficients are equal, aᵢ = bᵢ, because only then is the difference of
the two polynomials identically equal to zero.

In this sense one speaks of the uniqueness of the representation of a polynomial function. The 
conclusion about the equality of corresponding coefficients is often used to determine the coef- 
ficients of a polynomial by equating coefficients, for example, in decomposition into partial fractions 
and in solving differential equations. 


Factorization of polynomials 


A polynomial P(x) of degree n ≥ 1 is called reducible if it can be expressed as a product of
polynomials of lower degree. If such a representation is not possible, the polynomial is called irreducible.
Polynomials of degree zero are constants; they are excluded from this classification, since they
are neither reducible nor irreducible. Polynomials of the first degree are then always irreducible.

If a polynomial P(x) of degree n is reducible, that is, if it can be split up as a product
P(x) = p₁(x) p₂(x), then the polynomials p₁(x) and p₂(x) must be of degree at least equal to 1 and
necessarily smaller than n. If p₁(x) or p₂(x) is reducible, the process can be repeated; after at most
n steps the polynomial is factorized into a product P(x) = g(x) h(x) k(x) ... With the help of the
theorem (not proved here) that an irreducible polynomial dividing a product of two or more
polynomials must divide at least one of them, it can be shown that the factorization of a reducible
polynomial is unique apart from the order and up to constant factors. If P(x) = g₁(x) h₁(x) k₁(x) ...
and P(x) = g(x) h(x) k(x) ... are two factorizations into irreducible factors, then g₁(x) must divide
one of the polynomials g(x), h(x), k(x), ... But because these are themselves irreducible, it must be
equal to one of them, except for a constant factor c₁. Without loss of generality one can assume
that g(x) is this polynomial. Then g₁(x) = c₁g(x). Dividing P(x) by g(x), one obtains:


c₁h₁(x) k₁(x) ... = h(x) k(x) ...


By the same argument it follows that h₁(x) = c₂h(x), say, and c₁c₂k₁(x) ... = k(x) ... Thus, except
for constant numerical factors the two factorizations agree overall.

The question as to whether a polynomial is reducible depends essentially, of course, on which
number system its coefficients and those of the irreducible factors belong to. For example, x² − 1/4
is irreducible over the domain of integers, but reducible over that of the rational numbers; x² − 2
is rationally irreducible, but factorizes into (x − √2)(x + √2) over the real numbers; and x² + 4
is irreducible over the domain of real numbers, whereas it factorizes into (x + 2i)(x − 2i) over
the domain of complex numbers. If arbitrary complex numbers are allowed for the coefficients
of the polynomial, then the fundamental theorem of algebra shows that each polynomial function
of degree n can be factorized into n linear factors (x − αₖ), k = 1, 2, ..., n. The numbers αₖ are the
zeros of the function. In the case of polynomials with real coefficients, if one of these numbers
α = a + bi is complex, then the conjugate complex number ᾱ = a − bi also occurs as a zero.
The product of the corresponding linear factors is then (x − α)(x − ᾱ) = (x − a − bi)(x − a + bi)
= (x − a)² + b² = x² − 2ax + (a² + b²), that is, a quadratic polynomial with real coefficients.
If all pairs of conjugate complex linear factors are collected together in this way, then over the field
of real numbers the irreducible factors are either real linear factors or quadratic polynomials with
real coefficients. When the coefficients are restricted to the field of real numbers, it follows that all
irreducible polynomials are of degree at most 2. If it is further required that the coefficients of the
given polynomial and its factors shall be rational, this theorem no longer holds. For example, the
factorization (for real numbers) x⁴ − 5 = (x² + √5)(x² − √5) is no longer allowed.


Zeros 


A number α is called a zero of a function x → y = f(x) if the number 0 is assigned to the number
α by the function, that is, if α → f(α) = 0. Thus, for a polynomial f, f(α) = aₙαⁿ + aₙ₋₁αⁿ⁻¹ + ···
+ a₁α + a₀ = 0.

In the graph of a function a real zero appears as an intersection or point of contact of the curve 
with the x-axis. 


If α is a zero of a polynomial f(x), then f(x) is divisible by (x − α); thus, there exists a polynomial
g(x) such that f(x) = (x − α) g(x).


In any case the polynomial f(x) can be divided by (x − α), where α is arbitrary. By this division
one obtains a quotient function g(x) of lower degree than f(x), and the remainder r must be of lower
degree than (x − α), and must therefore be a constant: f(x) = (x − α) g(x) + r. If α is a zero,
then for x = α this equation becomes 0 = 0 · g(α) + r. Thus, the remainder r must be zero, the
function f(x) is divisible by the linear factor (x − α) without remainder, and f(x) = (x − α) g(x).
A generalization of this theorem can be proved by the method of induction:

If α₁, α₂, α₃, ..., αₖ are zeros of a polynomial f(x), then the product (x − α₁)(x − α₂) ··· (x − αₖ)
is a factor of f(x), that is, f(x) can be expressed in the form f(x) = (x − α₁)(x − α₂) ··· (x − αₖ) g(x).


Example: The polynomial f(x) = x³ − 5x² + 7x − 3 has the zero x = 3. Division by (x − 3)
gives the quotient x² − 2x + 1, so that the polynomial can be expressed in the form
f(x) = (x − 3)(x² − 2x + 1) = (x − 3)(x − 1)².
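Division of a polynomial by a linear factor (x − α) can be carried out with Horner's scheme (synthetic division). The following sketch is illustrative; the function name `divide_by_linear` is an assumption of this example.

```python
def divide_by_linear(coeffs, alpha):
    """Divide the polynomial [a_n, ..., a_0] by (x - alpha).

    Returns (quotient coefficients, remainder); the remainder equals f(alpha).
    """
    partial = []
    acc = 0
    for c in coeffs:
        acc = acc * alpha + c       # one step of Horner's scheme
        partial.append(acc)
    return partial[:-1], partial[-1]

quotient, remainder = divide_by_linear([1, -5, 7, -3], 3)
print(quotient, remainder)   # [1, -2, 1] 0
```

The remainder 0 confirms that x = 3 is a zero of x³ − 5x² + 7x − 3, and the quotient is x² − 2x + 1.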


A polynomial f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₁x + a₀ has at most n distinct zeros.


Proof by induction: 1. For n = 1, that is, for the polynomial a₁x + a₀ with a₁ ≠ 0 (since otherwise
the polynomial is not of the first degree), the theorem holds, because this polynomial has the single
zero x = −a₀/a₁.

2. Let f(x) be a polynomial of degree n + 1 and α a zero of this polynomial. By the previous
theorem, f(x) can be expressed in the form f(x) = (x − α) g(x), where g(x) is only of degree n.
The product (x − α) g(x) can be zero only if at least one factor is zero. The first factor is zero for
x = α, and by the inductive hypothesis the second factor is zero for at most n further values of x.
Thus, the product, and hence f(x), is equal to zero for at most n + 1 different values of x. The
theorem is now proved.


Multiplicity of a zero. It can happen that a polynomial with a zero α is divisible not only by
(x − α), but also by (x − α)², (x − α)³ or a still higher power of (x − α). If f(x) is divisible by
(x − α)ᵏ but not by (x − α)ᵏ⁺¹, then α is called a k-fold zero or a zero of multiplicity k (k ≥ 1, an
integer) of f(x).


Example: The polynomial x⁴ − 9x³ + 27x² − 31x + 12 has a double zero for x = 1, that is,
it is divisible by (x − 1)² but not by (x − 1)³.


f(x) = (x − 1)² (x² − 7x + 12) = (x − 1)² (x − 3)(x − 4).
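The multiplicity of a zero can be found by dividing repeatedly by (x − α) until a non-zero remainder appears. The sketch below is illustrative; the names `divide_by_linear` and `multiplicity` are assumptions, and integer or exact rational coefficients are assumed so that the test for a zero remainder is exact.

```python
def divide_by_linear(coeffs, alpha):
    """Divide [a_n, ..., a_0] by (x - alpha); return (quotient, remainder)."""
    partial = []
    acc = 0
    for c in coeffs:
        acc = acc * alpha + c
        partial.append(acc)
    return partial[:-1], partial[-1]

def multiplicity(coeffs, alpha):
    """Number of times (x - alpha) divides the polynomial without remainder."""
    count = 0
    current = list(coeffs)
    while len(current) > 1:
        quotient, remainder = divide_by_linear(current, alpha)
        if remainder != 0:
            break
        count += 1
        current = quotient
    return count

f = [1, -9, 27, -31, 12]     # x^4 - 9x^3 + 27x^2 - 31x + 12
print(multiplicity(f, 1), multiplicity(f, 3), multiplicity(f, 2))   # 2 1 0
```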


5.2-17 Function curves in the neighbourhood of a simple zero

5.2-18 Function curves in the neighbourhood of a multiple zero, in c of odd order, in d and e of even order




Simple and multiple zeros. Different multiplicities of a zero give rise to different behaviour of the 
graph of a function in its neighbourhood. At a simple zero (of multiplicity 1) the curve always has 
a slope different from zero (positive or negative), while at a multiple zero the slope is zero, so that 
the tangent to the curve at such a point coincides with the x-axis (Fig.). 

Zeros of even and odd order. The behaviour of the curve also differs according to whether the 
multiplicity, or order, of a zero is even or odd. Let α be a zero of f(x) of multiplicity k. Then f(x)
can be factorized in the form f(x) = (x − α)ᵏ g(x), where g(x), because of its continuity, is different
from zero in a neighbourhood of α, and consequently does not change sign in this neighbourhood.
Thus, there exists an ε > 0, such that g(x) ≠ 0 for each x with |x − α| < ε. But the linear factor
(x − α) changes its sign as x passes from x < α to x > α (Fig.). The polynomial f(x) = (x − α)ᵏ g(x)
therefore changes its sign in this passage if and only if k is odd (Fig.). For even k, f(x) keeps the
same sign.


Zeros and factorization. Every polynomial function can be expressed as a product of irreducible
factors in the form


f(x) = c (x − α₁)^r₁ (x − α₂)^r₂ ··· (x − αₖ)^rₖ (x² + a₁x + b₁)^s₁ (x² + a₂x + b₂)^s₂ ··· (x² + aₗx + bₗ)^sₗ.


In this expression c is a polynomial of degree zero, that is, a constant different from zero, αᵢ, aⱼ
and bⱼ denote real numbers, and rᵢ and sⱼ are natural numbers. The exponents satisfy the condition
n = Σ_{i=1}^{k} rᵢ + 2 Σ_{j=1}^{l} sⱼ. It is immediately obvious that the αᵢ are zeros of the function.
Moreover, there are no other real zeros. For each value α of x other than α₁, α₂, ..., αₖ, each linear
factor (x − αᵢ), i = 1, 2, ..., k, is different from zero. If one of the quadratic polynomials
x² + aⱼx + bⱼ were equal to zero for x = α, then it would be reducible, contradicting the assumption.
An irreducible quadratic polynomial is different from zero for all real values of x, because it has only
two conjugate complex zeros. As a consequence of these arguments one obtains the theorem:


The number of real zeros of a polynomial, counted according to their multiplicity, is even precisely 
when the degree of the polynomial is even, and odd when it is odd. In particular: A polynomial of 
odd degree has at least one real zero. 


Sturm’s theorem. By means of an approximation method, for example, that of NEWTON, every
root of each polynomial function can be calculated with arbitrary accuracy if a value of x in the
neighbourhood of the zero is known. DESCARTES, NEWTON and FOURIER had already tried to find
criteria to decide whether a root of a polynomial lies in a given interval of the domain of
definition. By suitable choice of the interval, values of x in the neighbourhood of the zero can be found.

Descartes’ rule of signs. DESCARTES considered the signs of the coefficients of the polynomial
f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₁x + a₀. Here it can be assumed that neither aₙ nor a₀ is zero.
Other coefficients that are zero are not included in the sequence aₙ, aₙ₋₁, ..., a₁, a₀. If two
neighbouring coefficients have different signs, one then speaks of a sign change.


Descartes found that the number of positive zeros of a polynomial is equal to the number of its
sign changes, or is less than it by a positive even number. The number of negative zeros is obtained
similarly from the number of sign changes in the sequence of coefficients of the polynomial f(−x).


Example: The polynomial f(x) = x⁵ − x⁴ + 2x³ + x² − 3x + 2 has four, two, or no positive
zeros, because four sign changes occur in the sequence of coefficients 1; −1; 2; 1; −3; 2. If one
forms f(−x) = −x⁵ − x⁴ − 2x³ + x² + 3x + 2, one sees that f(x) must have exactly one negative
zero, because one sign change occurs in the sequence of coefficients of f(−x).
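Counting sign changes for Descartes' rule is easily mechanized. The sketch below is illustrative; the function names `sign_changes` and `reflect` are assumptions of this example, and coefficients are listed from the highest power downwards.

```python
def sign_changes(coeffs):
    """Count sign changes in a coefficient sequence, ignoring zero coefficients."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def reflect(coeffs):
    """Coefficients of f(-x) for f given as [a_n, ..., a_1, a_0]."""
    n = len(coeffs) - 1
    return [c if (n - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]

f = [1, -1, 2, 1, -3, 2]            # x^5 - x^4 + 2x^3 + x^2 - 3x + 2
print(sign_changes(f))              # 4: four, two or no positive zeros
print(reflect(f))                   # [-1, -1, -2, 1, 3, 2]
print(sign_changes(reflect(f)))     # 1: exactly one negative zero
```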


The exact number of zeros is given by a theorem due to STURM. He begins by factorizing
the polynomial


f(x) = c (x − α₁)^r₁ (x − α₂)^r₂ ··· (x − αₖ)^rₖ (x² + a₁x + b₁)^s₁ (x² + a₂x + b₂)^s₂ ··· (x² + aₗx + bₗ)^sₗ.

If multiple irreducible factors occur in this, it is enough to consider the polynomial

φ(x) = (x − α₁) ··· (x − αₖ) · (x² + a₁x + b₁) ··· (x² + aₗx + bₗ),
which contains each of these factors, but each one only once; φ(x) then has the same zeros as f(x),
but only simple zeros.

The derivative φ′(x) arises as a sum by differentiating the product using the product rule. In each
term of the sum one of the irreducible factors is differentiated, so that each term lacks exactly
one of the factors into which φ(x) can be split up. The sum is not divisible by any one of these
factors: φ(x) and φ′(x) have no common factor except for a constant. If one divides φ(x) by φ′(x),
one obtains a polynomial q₁(x) and a remainder −φ₂(x), which is a polynomial of lower degree
than φ′(x): φ(x) = q₁(x) φ′(x) − φ₂(x). Dividing φ′(x) by φ₂(x) therefore gives a new remainder
term −φ₃(x), for which φ′(x) = q₂(x) φ₂(x) − φ₃(x). This procedure
must terminate after finitely many steps; in a simplified notation one
obtains the following scheme:

φ    = q₁φ′ − φ₂
φ′   = q₂φ₂ − φ₃
φ₂   = q₃φ₃ − φ₄
...
φᵣ₋₂ = qᵣ₋₁φᵣ₋₁ − φᵣ
φᵣ₋₁ = qᵣφᵣ

From the last equation, the preceding one, and step by step back to the first one, it is obvious that φᵣ is a
factor of φᵣ₋₁, of φᵣ₋₂, of φᵣ₋₃ and so on, and finally also of φ′ and of φ.
Because φ and φ′ have no common factor, φᵣ can only be a non-zero
constant. The sequence of these functions φ, φ′, φ₂, φ₃, ..., φᵣ is called
the Sturm chain. Substituting for x a particular value a in the polynomials
of the Sturm chain gives a sequence of real numbers φ(a), φ′(a), φ₂(a),
..., φᵣ(a). If two neighbouring numbers φᵢ(a) and φᵢ₊₁(a) in this sequence have different signs, one
speaks of a sign change. W(a) denotes the number of sign changes in the Sturm chain for the value
x = a. Clearly the number W(x) of sign changes can alter only if the argument x passes through
a zero of one of the polynomials of the chain φ, φ′, φ₂, ..., φᵣ₋₁. First let one of the polynomials
φ′, φ₂, ..., φᵣ₋₁ following φ be equal to zero for x = ξ. From the above scheme it can be read off
at once that its two neighbouring terms are not zero and have different signs. Consequently the
number W(x) of sign changes cannot alter in passing through this value; this can only happen if x
passes through a zero of φ itself. In this case, in fact, W(x) decreases by 1 exactly, because φ has
only simple zeros, so that φ changes sign in passing through the zero, whilst φ′ is different from
zero in some neighbourhood of this point and because of its continuity has a constant sign there.
This proves the following theorem.


Sturm's theorem. If φ(x) is a polynomial with only simple zeros, if a < b and φ(a) ≠ 0, φ(b) ≠ 0,
then W(a) − W(b) is equal to the number of zeros of the polynomial φ(x) in the closed interval
[a, b].


In order to determine with the help of this theorem the exact number of all the zeros of the
polynomial φ(x), one chooses for x₁ and x₂ > x₁ numbers −M and +M whose absolute values are
greater than the maximum of the absolute values of the zeros; thus, M > max (|α₁|, |α₂|, ..., |αₖ|),
where α₁, α₂, ..., αₖ are the zeros of φ(x). For this purpose one has to determine M without knowing
the zeros. This is possible, because the estimate


max (|α₁|, |α₂|, ..., |αₖ|) < 1 + |aₙ₋₁| + |aₙ₋₂| + ··· + |a₁| + |a₀|

holds for the absolute values of the zeros of the polynomial φ(x) = xⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₀ in
question. Each polynomial with aₙ ≠ 1 can be normalized on dividing by aₙ. The zeros remain
unchanged by this. One can therefore choose M = 1 + |aₙ₋₁| + ··· + |a₁| + |a₀| and is then sure
that all the zeros of φ(x) lie in the interval [−M, M]. A proof of this fact would lead too far here.
It will, however, be plausible if one thinks of the connection between the zeros x₁ and x₂ (assumed
to be real) of f(x) = x² + ax + b and the coefficients a and b. It is known that x₁ + x₂ = −a
and x₁x₂ = b. It is clear from this that |x₁| and |x₂| cannot be very large, and |a| and |b| very small
at the same time. In other words: the absolute values of the zeros cannot exceed certain bounds that
depend on the absolute values of the coefficients.


Example: In order to determine the number of real zeros of the polynomial
φ(x) = x⁵ − 2x⁴ − x + 2, the Sturm chain must be calculated.

To simplify the calculations, the chain polynomials in the given example are multiplied by
positive numbers; this clearly does not alter the number of sign changes. One recognizes that
φ(x) has only simple zeros (and that Sturm’s theorem is therefore applicable) from the fact that
the Sturm chain terminates with a polynomial of degree zero.

The following summary applies to the given polynomial:


Sturm chain                        signs at the interval limits

φ(x)  = x⁵ − 2x⁴ − x + 2           φ(−6)  = −10360      φ(+6)  = +5180
φ′(x) = 5x⁴ − 8x³ − 1              φ′(−6) = +8207       φ′(+6) = +4751
φ₂(x) = 16x³/5 + 4x − 48/5         φ₂(−6) = −724⁴/₅     φ₂(+6) = +705³/₅
φ₃(x) = 25x² − 100x + 100          φ₃(−6) = +1600       φ₃(+6) = +400
φ₄(x) = −53x + 76                  φ₄(−6) = +394        φ₄(+6) = −242
φ₅(x) = −16⁵²/₅₃                   φ₅(−6) = −16⁵²/₅₃    φ₅(+6) = −16⁵²/₅₃


For the polynomial φ(x) = x⁵ − 2x⁴ − x + 2, M can be taken to be 1 + |−2| + |−1| + |+2| = 6;
thus, all the zeros of the polynomial lie in the interval [−6, 6]. In the summary the values for
φ(−6), φ′(−6), φ₂(−6), φ₃(−6), φ₄(−6), φ₅(−6) and for φ(6), φ′(6), φ₂(6), φ₃(6), φ₄(6), φ₅(6)
are tabulated. By counting sign changes one obtains W(−6) = 4, W(6) = 1; hence the polynomial
has exactly 4 − 1 = 3 real zeros.
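The construction of a Sturm chain and the count W(a) − W(b) can be sketched as follows. This is an illustration under stated assumptions, not the calculation scheme of the text: the function names are invented for the example, exact fractions replace the positive scaling factors used above (the signs are unaffected), and the polynomial is assumed to be normalized and to have only simple zeros.

```python
from fractions import Fraction

def derivative(p):
    """Derivative of a polynomial given as [a_n, ..., a_1, a_0]."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def remainder(num, den):
    """Remainder of the polynomial division num : den in exact arithmetic."""
    num = [Fraction(c) for c in num]
    while len(num) >= len(den):
        factor = num[0] / den[0]
        for i in range(len(den)):
            num[i] -= factor * den[i]
        num.pop(0)                       # the leading coefficient is now zero
    while len(num) > 1 and num[0] == 0:
        num.pop(0)                       # drop further leading zeros
    return num or [Fraction(0)]

def sturm_chain(p):
    """Chain phi, phi', phi_2, ...; each new term is minus the remainder."""
    chain = [[Fraction(c) for c in p], [Fraction(c) for c in derivative(p)]]
    while len(chain[-1]) > 1:
        chain.append([-c for c in remainder(chain[-2], chain[-1])])
    return chain

def evaluate(p, x):
    result = Fraction(0)
    for c in p:
        result = result * x + c
    return result

def sign_changes(chain, x):
    """W(x): number of sign changes in the chain at x (zero values are skipped)."""
    signs = [v > 0 for v in (evaluate(p, x) for p in chain) if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

phi = [1, -2, 0, 0, -1, 2]                   # x^5 - 2x^4 - x + 2
M = 1 + sum(abs(c) for c in phi[1:])         # bound for a normalized polynomial
chain = sturm_chain(phi)
print(M, sign_changes(chain, -M), sign_changes(chain, M))   # 6 4 1
```

The difference W(−6) − W(6) = 3 is the number of real zeros, in agreement with the example.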




Separation of the zeros. By separation of the zeros one understands the determination of intervals 
within which exactly one zero lies. By means of the above example it will be shown how this is 
possible with the help of Sturm’s theorem. 

Substitution x = 0 in the Sturm chain belonging to φ(x) = x⁵ − 2x⁴ − x + 2 shows that
W(−6) − W(0) gives the number of zeros in the interval [−6, 0]. Since φ(0) = +2, φ′(0) = −1,
φ₂(0) = −9³/₅, φ₃(0) = +100, φ₄(0) = +76, φ₅(0) = −16⁵²/₅₃, it follows that W(0) = 3. But
W(−6) = 4, and so W(−6) − W(0) = 1; hence exactly one zero lies in the interval [−6, 0]. The
other two must therefore lie in the interval [0, 6]. To separate them one can again halve this interval
and examine the Sturm chain for x = 3. One finds that W(3) = 1. Because W(0) − W(3) = 2,
both zeros must lie in the interval [0, 3], and thus they are still not separated. Halving again yields
W(1.5) = 2. Now W(0) − W(1.5) = 1 and W(1.5) − W(3) = 1, so that exactly one zero lies in
each of the intervals [0, 1.5] and [1.5, 3]. The three zeros are now separated.
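The halving procedure can be sketched as a short recursive routine. The example below is illustrative: the names `CHAIN`, `W` and `separate` are assumptions, the chain is entered directly in its scaled form with the constant last term written as −900/53 (only its sign matters), and it is assumed that no interval end point is itself a zero of the polynomial.

```python
from fractions import Fraction

# Sturm chain of x^5 - 2x^4 - x + 2, scaled by positive factors
CHAIN = [
    [1, -2, 0, 0, -1, 2],
    [5, -8, 0, 0, -1],
    [Fraction(16, 5), 0, 4, Fraction(-48, 5)],
    [25, -100, 100],
    [-53, 76],
    [Fraction(-900, 53)],
]

def evaluate(p, x):
    result = Fraction(0)
    for c in p:
        result = result * x + c
    return result

def W(x):
    """Number of sign changes in the chain at x."""
    signs = [v > 0 for v in (evaluate(p, x) for p in CHAIN) if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def separate(a, b):
    """Halve [a, b] repeatedly until each interval contains exactly one zero."""
    count = W(a) - W(b)
    if count == 0:
        return []
    if count == 1:
        return [(a, b)]
    mid = (a + b) / 2
    return separate(a, mid) + separate(mid, b)

for left, right in separate(Fraction(-6), Fraction(6)):
    print(left, right)
```

The routine yields the intervals [−6, 0], [0, 3/2] and [3/2, 3], the same separation as in the text.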


Extension of Sturm’s theorem. The assumption that no multiple factors occur in the factorization 
of f(x) will now be considered. For a given polynomial it is not always possible to decide at once 
whether this condition is satisfied. In spite of this one can follow Sturm’s method. If the greatest 
common divisor of f(x) and f’(x) is determined with the help of the Euclidean algorithm, then there 
are two possibilities: 

a) if f(x) satisfies the given condition, then the gcd is a constant different from zero, and Sturm’s 
theorem is immediately applicable. 

b) if f(x) does not satisfy the given condition, then f(x) has factors of the form (x − αᵢ)^rᵢ or
(x² + aⱼx + bⱼ)^sⱼ with rᵢ > 1 or sⱼ > 1. Then (x − αᵢ)^(rᵢ−1) or (x² + aⱼx + bⱼ)^(sⱼ−1) is a factor of
f′(x), as can be seen from the product rule for differentiation. The product of these factors and
possibly a constant c (c ≠ 0, c ≠ 1) then appears as the gcd of f(x) and f′(x). When f(x) is divided
by this gcd, the quotient satisfies the assumptions of Sturm’s theorem, and the theorem can then
be applied to it. The gcd of f(x) and f′(x) need not be further investigated for zeros, because it can
only have zeros that are also zeros of the quotient polynomial.
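Removing multiple factors with the Euclidean algorithm can be sketched as follows. The code is an illustration under assumptions: the function names are invented for this example, exact fractions are used, and the polynomial (x − 1)²(x − 3)(x − 4) from the multiplicity example serves as input.

```python
from fractions import Fraction

def strip(p):
    """Remove leading zero coefficients; the zero polynomial becomes []."""
    i = 0
    while i < len(p) and p[i] == 0:
        i += 1
    return p[i:]

def divmod_poly(num, den):
    """Return (quotient, remainder) of num : den, coefficients highest degree first."""
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        for i in range(len(den)):
            num[i] -= factor * den[i]
        num.pop(0)
    return (quotient or [Fraction(0)]), num

def gcd_poly(f, g):
    """Euclidean algorithm; the result is normalized to leading coefficient 1."""
    f, g = strip(list(f)), strip(list(g))
    while g:
        _, r = divmod_poly(f, g)
        f, g = g, strip(r)
    return [Fraction(c) / f[0] for c in f]

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

f = [1, -9, 27, -31, 12]             # (x - 1)^2 (x - 3)(x - 4)
g = gcd_poly(f, derivative(f))       # common factor of f and f'
simple, rest = divmod_poly(f, g)     # quotient with simple zeros only
print(g, simple, rest)
```

Here the gcd is x − 1 and the quotient is x³ − 8x² + 19x − 12 = (x − 1)(x − 3)(x − 4), to which Sturm's theorem can be applied.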


The behaviour of polynomial functions at infinity 


Besides the zeros, other special properties of polynomial functions are often of interest: extrema, 
points of inflection, gradient at zeros and points of inflection, among others. The corresponding 
investigations are made with the help of the infinitesimal calculus and will therefore not be taken 
further at this stage. It is, however, common to all these investigations that only an interval of the 
domain of definition bounded on both sides is considered. The question arises, how does the graph 
of a polynomial function look outside such an interval, what values can it take if |x| is greater 
than the maximum of the absolute values of all the zeros, extrema, points of inflection, and so on, 
of the function. The answer to this question is usually called 
the behaviour at infinity. Taking the first term aₙxⁿ out of the
expression f(x) = aₙxⁿ + aₙ₋₁xⁿ⁻¹ + ··· + a₁x + a₀ leads to


f(x) = aₙxⁿ (1 + aₙ₋₁/(aₙx) + aₙ₋₂/(aₙx²) + ··· + a₀/(aₙxⁿ)).

From this representation it can be seen that for unbounded
increasing |x|, |f(x)| also increases beyond all limits, because
the expression in the brackets tends to 1 in this case, while
|aₙxⁿ| becomes arbitrarily large. This property is often
expressed symbolically in the form lim_{|x|→∞} |f(x)| = ∞. The sign
of the function f(x) for |x| → ∞ depends only on aₙxⁿ, because
the expression in the bracket is certainly positive from
a certain x₀ onwards, for all |x| > x₀. There are only the
possibilities collected together in the table.
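Since the sign of f(x) for large |x| depends only on aₙxⁿ, the possible cases follow from the sign of aₙ and the parity of n. The sketch below is illustrative; the function name `behaviour_at_infinity` and the encoding of the result as +1 or −1 are assumptions of this example.

```python
def behaviour_at_infinity(degree, leading):
    """Return the signs of f(x) for x -> -infinity and for x -> +infinity.

    degree is n >= 1 and leading is the coefficient a_n != 0;
    +1 stands for f(x) -> +infinity, -1 for f(x) -> -infinity.
    """
    if degree < 1 or leading == 0:
        raise ValueError("a non-constant polynomial with a_n != 0 is required")
    right = 1 if leading > 0 else -1                  # sign of a_n * x^n for x > 0
    left = right if degree % 2 == 0 else -right       # odd n reverses the sign for x < 0
    return left, right

print(behaviour_at_infinity(4, 1))       # (1, 1)
print(behaviour_at_infinity(5, 0.025))   # (-1, 1)
print(behaviour_at_infinity(2, -1))      # (-1, -1)
```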


Example 1: The function y = x⁴ − x³ − x² − x − 2 has the real zeros x = −1 and x = 2.
A minimum of the function occurs for x ≈ 1.3, and points of inflection for x ≈ 0.73 and x ≈ −0.23.
For the behaviour of the function at infinity, lim_{x→+∞} f(x) = +∞ and lim_{x→−∞} f(x) = +∞.
By determining some values of the function the following table of values is obtained:


The function can now be represented graphically (Fig.).


Example 2: The function y = 0.025x⁵ + 0.05x⁴
− 0.6x³ − 0.55x² + 2.575x − 1.5 has simple zeros
for x = −5, x = −3 and x = 4 and a double zero
for x = 1. Local minima occur for x = −1.53
and x = 3.17 and local maxima for x = −4.24 and
x = 1. Points of inflection occur for x = −3.22,
x = −0.29 and x = 2.32. (These numbers are
approximate values.) The behaviour of the function at
infinity is characterized by lim_{x→+∞} f(x) = +∞ and
lim_{x→−∞} f(x) = −∞. The following table of values
is used to construct the graph of the function (Fig.).


5.2-19 Graph of the function y = x⁴ − x³ − x² − x − 2

5.2-20 Graph of the function y = 0.025x⁵ + 0.05x⁴ − 0.6x³ − 0.55x² + 2.575x − 1.5


Power functions with negative exponents 


The simplest rational functions (other than polynomials) are those whose rule of correspondence
can be expressed in the form y = 1/xⁿ (n = 1, 2, 3, ...). These are called power functions with negative
exponents, because one can also write x⁻ⁿ for 1/xⁿ. They will be investigated first.


The function y = 1/x. This is clearly an odd function and its graph is therefore centrally sym- 
metric with respect to the origin (Fig.). 
Table of values for y = 1/x: 


5.2-21 Graph of the function y = 1/x

5.2-22 Graph of the function y = 1/x²


For |x| > 1, the larger |x| becomes, the nearer the ordinates of the curve approach the value zero, 
while in the region —1 < x < +1 the ordinates increase beyond all limits as |x| becomes smaller. 
The curve approaches both the positive and the negative x-axis, and also the positive and the negative 
y-axis, without reaching either of them. The x- and y-axes are asymptotes of the curve. For x = 0 
there is no function value: the function y = 1/x is not defined for x = 0. The curve consists of two 
branches; it is a rectangular hyperbola. 


The functions y = 1/x²ᵐ⁺¹. The shape of the curves is similar to that of the hyperbola y = 1/x.
The functions are likewise odd. The curves are not defined for x = 0, have two branches, one in
the first and one in the third quadrant, and all pass through the points P(1, 1) and R(−1, −1).
They decrease in the region −1 < x < 0 and in 0 < x < +1, the more steeply the greater m is,
and for |x| > 1 they approach the x-axis, the more quickly the greater m is. The x- and y-axes are
again asymptotes.


The function $y = 1/x^2$. It is an even function, whose graph is symmetrical about the y-axis (Fig.
5.2-22). For x = 0 it is not defined, and it therefore has two branches. The positive and negative 
x-axis and the positive y-axis are asymptotes. 


The functions $y = 1/x^{2m}$. These curves are similar to that of $y = 1/x^2$. As to their steepness, the remark made about the branches of the power functions with negative odd exponents applies equally to them. The points $P(1, 1)$ and $Q(-1, 1)$ are common to all of them.


Power functions and proportionality. Because it follows from $y = kx^n$ that for all corresponding values the ratio $y_1/x_1^n = y_2/x_2^n = \dots = y/x^n = k = \text{const}$, one says that the nth power of x is proportional to y.

In the correspondence y = k/x, the smaller y is, the larger x becomes, and vice versa. Such a 
relationship is known as inverse proportion and defined by the property that the product of correspond- 
ing values is constant, xy = k. In both cases k is called the constant of proportionality. 

In free fall the distance s fallen is proportional to the square of the time; the force of attraction F between two masses is inversely proportional to the square of their distance r apart. Hence the corresponding laws must have the form $s = kt^2$ and $F = m/r^2$, where the constant of proportionality can be calculated if one pair (s, t) or (r, F) is known.
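The remark that one known pair fixes the constant can be sketched in a few lines of Python. The function name and the sample pairs below are hypothetical values chosen only for illustration.

```python
def proportionality_constant(x, y, n):
    """Return k in y = k * x**n from one known pair (x, y)."""
    return y / x**n

# Hypothetical measurement: a body has fallen 19.62 m after 2 s.
k_fall = proportionality_constant(2.0, 19.62, 2)   # law s = k * t^2
print(k_fall)             # constant of the free-fall law
print(k_fall * 3.0**2)    # predicted distance after 3 s

# The inverse-square law F = m / r^2 corresponds to the exponent n = -2.
m_const = proportionality_constant(2.0, 5.0, -2)   # hypothetical pair (r, F)
print(m_const / 4.0**2)   # predicted force at distance r = 4
```

Direct and inverse proportion are handled by the same function; only the sign of the exponent changes.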


General form of rational functions 


As for the polynomial functions, so for rational functions there exists a representation that may 
be regarded as a normal form. 


The correspondence rule for each rational function f(x) can be expressed as the quotient of two polynomials p(x) and q(x) having no common factor, that is, $x \mapsto f(x) = p(x)/q(x)$.


If the polynomial q(x) in the denominator has degree 0, then it is a constant, and the special
case of a polynomial function arises. In the following it will be assumed that the degree of q(x) 
is at least one, so that the rational function in question is not a polynomial. 


Zeros and poles of rational functions 


Zeros. A rational function can take the value zero only for values of x for which the numerator p(x) of the normal form p(x)/q(x) is zero, and at the same time q(x) is not zero. Thus, a number $\alpha$ is a zero precisely when $p(\alpha) = 0$ and $q(\alpha) \ne 0$. For an arbitrarily given rational function $f(x) = g(x)/h(x)$ in which g(x) and h(x) are polynomials, it can also happen that for a number $\alpha$ both $g(\alpha) = 0$ and $h(\alpha) = 0$. Then $\alpha$ does not belong to the domain of definition, because $f(\alpha)$ does not exist. This situation arises from the fact that f(x) is not expressed in normal form: g(x) and h(x) in this case clearly have a common factor. A representation $g(x) = (x - \alpha)^k g_1(x)$ then exists for g(x), and similarly a representation $h(x) = (x - \alpha)^l h_1(x)$ for h(x), with integers $k, l \ge 1$. Thus, g(x) and h(x) have a factor $(x - \alpha)^m$ in common that can be cancelled in g(x)/h(x) for all $x \ne \alpha$; m is the smaller of the two values k and l. There are then three possibilities:

1. for $k > l$, $\lim_{x \to \alpha} f(x) = 0$, so that f(x) behaves in the neighbourhood of $\alpha$ as in the neighbourhood of a zero;
2. for $k = l$, $\lim_{x \to \alpha} f(x) = c \ne 0$;
3. for $k < l$, $\lim_{x \to \alpha} f(x) = \infty$, so that f(x) behaves in the neighbourhood of $\alpha$ as in the neighbourhood of a pole (see the following paragraph).

If $f(x) = p(x)/q(x)$ is already in normal form, then the question of the zeros of a rational function is reduced to the question of the zeros of the polynomial p(x), and this has already been answered.
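The three cases for a common zero of numerator and denominator can be decided by counting multiplicities. The sketch below does this with exact rational arithmetic; the function names and the sample polynomials are assumptions made for illustration, and polynomials are given as coefficient lists from the highest power down.

```python
from fractions import Fraction

def multiplicity(coeffs, alpha):
    """Count how often the factor (x - alpha) divides a non-zero polynomial."""
    coeffs = [Fraction(c) for c in coeffs]
    count = 0
    while len(coeffs) > 1:
        # Synthetic division by (x - alpha); the last entry is the remainder.
        row = [coeffs[0]]
        for c in coeffs[1:]:
            row.append(c + row[-1] * alpha)
        if row[-1] != 0:
            break
        coeffs = row[:-1]
        count += 1
    return count

def behaviour_at(g, h, alpha):
    """Classify f = g/h near a common zero alpha of g and h."""
    k, l = multiplicity(g, alpha), multiplicity(h, alpha)
    if k > l:
        return "limit 0 (like a zero)"
    if k == l:
        return "finite non-zero limit"
    return "unbounded (like a pole)"

# g(x) = (x - 1)^2 (x + 2) = x^3 - 3x + 2,  h(x) = (x - 1)(x + 3) = x^2 + 2x - 3
print(behaviour_at([1, 0, -3, 2], [1, 2, -3], 1))
```

Exchanging the roles of g and h in the last line turns the first case into the third.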


Poles. The function $f(x) = p(x)/q(x)$ is said to have a pole at the point $x = \alpha$ if $q(\alpha) = 0$ and $p(\alpha) \ne 0$. If the linear factor $(x - \alpha)$ occurs k times in the factorized form of q(x), $q(x) = (x - \alpha)^k q_1(x)$, then one speaks of a pole of order k. In the neighbourhood of this pole the function f(x) can be expressed in the form

$$f(x) = \frac{p(x)}{q(x)} = \frac{1}{(x - \alpha)^k} \cdot \frac{p(x)}{q_1(x)}.$$

If p(x) and q(x) have no common factor, then neither p(x) nor $q_1(x)$ has a zero in a neighbourhood of $x = \alpha$, hence they do not change their signs, and their quotient therefore has a bounded positive or negative (non-zero) value. But the function $1/(x - \alpha)^k$ increases without limit as $x \to \alpha$. If one approaches the pole in the sense of increasing values of x ($x < \alpha$), then $(x - \alpha)$ is negative and $1/(x - \alpha)^k$ tends to $-\infty$ for odd values of k and to $+\infty$ for even values of k. If one approaches the pole in the sense of decreasing values of x ($x > \alpha$), then $(x - \alpha)$ is positive and so $1/(x - \alpha)^k$ always tends to $+\infty$. This property of the function $1/(x - \alpha)^k$ is changed by the factor $p(x)/q_1(x)$ only to the extent that for negative values of the factor the sign of the function f(x) is reversed (Fig.). The line $x = \alpha$ is an asymptote of the function.
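The sign rule for poles of odd and even order can be checked numerically. The sketch below uses the function $3(x - 2)/((x - 1)^2 (x + 2))$, which has a pole of the first order at x = -2 and a pole of the second order at x = 1; the helper name `signs_near` and the step `eps` are assumptions made for illustration.

```python
def f(x):
    """f(x) = 3(x - 2) / ((x - 1)^2 (x + 2))."""
    return 3 * (x - 2) / ((x - 1) ** 2 * (x + 2))

def signs_near(func, alpha, eps=1e-6):
    """Signs (+1 or -1) of func just to the left and just to the right of alpha."""
    def sign(value):
        return 1 if value > 0 else -1
    return sign(func(alpha - eps)), sign(func(alpha + eps))

print(signs_near(f, -2))  # odd order: the sign changes across the pole
print(signs_near(f, 1))   # even order: the same sign on both sides
```

At x = 1 the factor $p(x)/q_1(x) = 3(x - 2)/(x + 2)$ is negative, so the function tends to $-\infty$ on both sides rather than to $+\infty$.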




5.2-23 Graph of a function with poles of odd order
5.2-24 Graph of a function with poles of even order


The behaviour of rational functions at infinity 
Beginning with the general form 


$$f(x) = \frac{p(x)}{q(x)} = \frac{a_m x^m + a_{m-1} x^{m-1} + \dots + a_1 x + a_0}{b_n x^n + b_{n-1} x^{n-1} + \dots + b_1 x + b_0},$$


there are three possibilities to consider, namely $m < n$, $m = n$, $m > n$. If the degree of the numerator polynomial p(x) is greater than or equal to the degree of the denominator polynomial q(x) ($m \ge n$), the function f(x) is called an improper rational function. Dividing the numerator by the denominator, a polynomial function g(x) of degree $(m - n)$ can always be split off:

$$f(x) = p(x)/q(x) = g(x) + r(x).$$

In the case $m = n$, g(x) is the constant $a_m/b_n$. The remainder r(x) is always a proper rational function, that is, the degree of its numerator is less than that of its denominator. But the behaviour of a polynomial function at infinity is known; it only remains to investigate that of the proper rational function. Dividing the numerator and denominator by $x^m$ ($m < n$), one obtains:

$$f(x) = \frac{a_m + a_{m-1}/x + \dots + a_1/x^{m-1} + a_0/x^m}{b_n x^{n-m} + \dots + b_1/x^{m-1} + b_0/x^m}.$$

As $|x| \to \infty$, the numerator tends to the value $a_m$, while at the same time the absolute value of the denominator can take arbitrarily large values; thus, $|f(x)| \to 0$ as $|x| \to \infty$. Consequently the x-axis is an asymptote of the function f(x). According to the signs of $a_m$ and $b_n$ and the degree $n - m$, the graph of the function approaches the x-axis asymptotically from above or below, through positive or negative values. For example, if $a_m > 0$, $b_n > 0$ and $n - m$ is odd, then f(x) is positive as $x \to +\infty$ and negative as $x \to -\infty$. The corresponding result holds for the remainder r(x) after the polynomial function g(x) is split off from the improper rational function f(x). The function f(x) approaches the function g(x) asymptotically as $|x| \to \infty$, from above if r(x) has small, but positive values, and from below if r(x) tends to zero through negative values. The graph of the function g(x) is called the limiting curve. In particular, if $m = n$ and $g(x) = a_m/b_n$, the parallel to the x-axis at a distance $a_m/b_n$ from it is an asymptote of the function f(x) for $|x| \to \infty$.
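Splitting off the polynomial part of an improper rational function is ordinary polynomial long division. A minimal sketch with exact fractions follows; the function name `divide` and the list representation (coefficients from the highest power down) are assumptions made for illustration.

```python
from fractions import Fraction

def divide(p, q):
    """Polynomial long division: return (g, r) with p = g*q + r and deg r < deg q.

    If deg p < deg q, the quotient g is returned as an empty list (zero polynomial).
    """
    p = [Fraction(c) for c in p]
    q = [Fraction(c) for c in q]
    g = []
    while len(p) >= len(q):
        factor = p[0] / q[0]
        g.append(factor)
        # Subtract factor * q, aligned with the leading term of p.
        for i, c in enumerate(q):
            p[i] -= factor * c
        p.pop(0)  # the leading coefficient is now zero
    return g, p

# (x^2 - x - 2) / (2x - 6): quotient x/2 + 1, remainder 4
print(divide([1, -1, -2], [2, -6]))
# (x^3 + 2) / (2x): quotient x^2/2, remainder 2
print(divide([1, 0, 0, 2], [2, 0]))
```

The quotient is the limiting curve g(x), and the remainder divided by q(x) is the proper rational function r(x) that tends to zero at infinity.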


Example 1: The function $y = \dfrac{3x - 6}{x^3 - 3x + 2}$ has a zero for $x = 2$, a pole of the first order for $x = -2$ and a pole of the second order for $x = 1$. Two local extrema, both maxima, occur for $x \approx -0.73$ and $x \approx 2.73$. The behaviour of the function at infinity is characterized by $\lim_{|x| \to \infty} y = 0$. Thus, the x-axis is an asymptote of the function curve. To consider the sign of the function values for the whole domain of definition one writes the equation of the function in the more useful form

$$y = \frac{3(x - 2)}{(x - 1)^2 (x + 2)}.$$

It is easily seen that y is positive for $-\infty < x < -2$, negative for $-2 < x < 1$ and $1 < x < 2$, and again positive for $x > 2$ (Fig.). For a more precise graphical representation a table of values is needed:


5.2-25 Graph of the function $y = \dfrac{3(x - 2)}{(x - 1)^2 (x + 2)}$
5.2-26 Graph of the function $y = \dfrac{x^2 - 1}{x^2 + 1}$


Example 2: The function $y = (x^2 - 1)/(x^2 + 1)$ has zeros at $x = -1$ and $x = 1$; it has no poles. An extremum (minimum) occurs for $x = 0$, and points of inflection for $x \approx -0.57$ and $x \approx 0.57$. By division one obtains $y = 1 - 2/(x^2 + 1)$. The line with the equation $y = 1$ is an asymptote of the function curve, because $\lim_{|x| \to \infty} y = 1$. It is easy to see, in addition, that the graph of the function lies entirely below the asymptote (Fig.).


Table of values: 


Example 3: The function $y = (x^2 - x - 2)/(2x - 6)$ has zeros $x = -1$ and $x = 2$, a pole at $x = 3$, and local extrema for $x = 1$ (maximum) and $x = 5$ (minimum). Separation of the polynomial part yields the representation $y = x/2 + 1 + 4/(2x - 6)$. Thus, $y = x/2 + 1$ is an asymptote of the function curve. The curve approaches the asymptote from below as $x \to -\infty$ and from above as $x \to +\infty$ (Fig.).


5.2-27 Graph of the function $y = \dfrac{x^2 - x - 2}{2x - 6}$


Example 4: For $y = f(x) = (x^3 + 2)/(2x)$ the curve $y = x^2/2$ is the limiting curve, because $(x^3 + 2)/(2x) = x^2/2 + 1/x$ and $\lim_{|x| \to \infty} [(x^3 + 2)/(2x) - x^2/2] = \lim_{|x| \to \infty} (1/x) = 0$.


Decomposition into partial fractions 


To integrate a rational function f(x) one has to express it as a sum of partial fractions. In its 
normal form f(x) = p(x)/q(x) the numerator p(x) and the denominator q(x) have no common factor. 
If the degree of the numerator p(x) is greater than or equal to the degree of the denominator, then a polynomial function g(x) can be split off by division, $f(x) = g(x) + p_1(x)/q(x)$. The denominator q(x), for its part, can be expressed as a product of linear factors, except for a constant factor that can be incorporated into the numerator:

$$q(x) = (x - \alpha_1)^{r_1} (x - \alpha_2)^{r_2} \cdots (x - \alpha_k)^{r_k} (x - \beta_1)^{s_1} (x - \bar\beta_1)^{s_1} \cdots (x - \beta_l)^{s_l} (x - \bar\beta_l)^{s_l},$$

in which the k real zeros $\alpha_i$ with multiplicities $r_i$, and the l pairs of conjugate complex zeros $\beta_i$ and $\bar\beta_i$ with multiplicities $s_i$ occur. The product of two conjugate complex linear factors is a real polynomial of the second degree: $(x - \beta)(x - \bar\beta) = x^2 - (\beta + \bar\beta)x + \beta\bar\beta = x^2 + ax + b$. The substitutions $a = -(\beta + \bar\beta)$ and $b = \beta\bar\beta$ have been made here. Then q(x) can be expressed as the following product of polynomials that are irreducible in the domain of the real numbers:

$$q(x) = (x - \alpha_1)^{r_1} (x - \alpha_2)^{r_2} \cdots (x - \alpha_k)^{r_k} (x^2 + a_1 x + b_1)^{s_1} \cdots (x^2 + a_l x + b_l)^{s_l}.$$


Proper fractions whose denominator is a power of a linear factor or of an irreducible polynomial of the second degree are called partial fractions; in the first case the numerators are constants A, and in the second case linear polynomials $B + Cx$. The proper rational function $p_1(x)/q(x)$ can be expressed in the following form:

$$\frac{p_1(x)}{q(x)} = \sum_{i=1}^{k} \sum_{j=1}^{r_i} \frac{A_{ij}}{(x - \alpha_i)^j} + \sum_{i=1}^{l} \sum_{j=1}^{s_i} \frac{B_{ij} + C_{ij} x}{(x^2 + a_i x + b_i)^j}.$$

Here $A_{ij}$, $B_{ij}$, $C_{ij}$ are real constants. That such a decomposition exists when the denominators are powers of linear factors can be seen as follows. Let $\alpha$ be a zero of order r of the denominator q(x), so that $q(x) = (x - \alpha)^r q_1(x)$, where $\alpha$ is not a zero of $q_1(x)$. If the partial fraction $A/(x - \alpha)^r$ is separated from the given proper rational function $p_1(x)/q(x)$, one obtains:

$$\frac{p_1(x)}{(x - \alpha)^r q_1(x)} - \frac{A}{(x - \alpha)^r} = \frac{p_1(x) - A q_1(x)}{(x - \alpha)^r q_1(x)} = \frac{\Phi(x)}{(x - \alpha)^r q_1(x)}.$$

Because neither $p_1(x)$ nor $q_1(x)$ is zero for $x = \alpha$, the undetermined constant A may be taken to be the number $p_1(\alpha)/q_1(\alpha)$. This ensures that the function $\Phi(x) = p_1(x) - A q_1(x)$ has a zero for $x = \alpha$, and hence $\Phi(x) = (x - \alpha)\,\varphi(x)$. Cancellation gives

$$\frac{p_1(x)}{(x - \alpha)^r q_1(x)} - \frac{A}{(x - \alpha)^r} = \frac{\varphi(x)}{(x - \alpha)^{r-1} q_1(x)}.$$

This function is again a proper rational function, because the degree of $\varphi(x)$ is 1 smaller than that of $\Phi(x)$, whose degree is at most equal to that of $p_1(x)$ or of $q_1(x)$, and in any case less than that of $q(x) = (x - \alpha)^r q_1(x)$. By the same procedure a partial fraction $A_1/(x - \alpha)^{r-1}$ can again be split off from the function $\dfrac{\varphi(x)}{(x - \alpha)^{r-1} q_1(x)}$. A similar result holds for the other real zeros of the denominator q(x).

If one admits complex numbers temporarily, then the same considerations hold also for zeros $\beta$ and $\bar\beta$ of the denominator q(x). One must remember that substitution of $\bar\beta$, the conjugate complex of $\beta$, in either of the functions $p_1(x)$ and q(x) with real coefficients, gives rise to the conjugate complex values: thus, $A_1 = p_1(\bar\beta)/q_1(\bar\beta) = \overline{p_1(\beta)}/\overline{q_1(\beta)} = \bar A$. Hence for each partial fraction $A/(x - \beta)^r$ the partial fraction $\bar A/(x - \bar\beta)^r$ also occurs; their sum

$$\frac{A (x - \bar\beta)^r + \bar A (x - \beta)^r}{(x^2 - [\beta + \bar\beta] x + \beta\bar\beta)^r}$$

is unchanged if each complex number is replaced by its conjugate, which means that it must be real and must have the form $h(x)/(x^2 + ax + b)^r$, where h(x) is of degree at most r. If its degree is greater than 1, then h(x) can be divided by $(x^2 + ax + b)$, giving

$$h(x) = h_1(x)(x^2 + ax + b) + (Bx + C) \quad\text{or}\quad \frac{h(x)}{(x^2 + ax + b)^r} = \frac{Bx + C}{(x^2 + ax + b)^r} + \frac{h_1(x)}{(x^2 + ax + b)^{r-1}}.$$

If the degree of $h_1(x)$ is again greater than 1, then it can again be divided by $(x^2 + ax + b)$. This decomposition is unique, as can be seen by multiplying by $(x - \alpha_i)^{r_i}$ and then equating coefficients.

Decomposition into partial fractions in practice. There are various possibilities for carrying out 
in practice the decomposition into partial fractions of a rational function. One can, for example, 
proceed step by step as in the proof given above. However, another method is usually more convenient, 
which will be explained by the following examples. It is the method of undetermined coefficients. 


Example 1: The function $y = \dfrac{2x - 1}{(x + 2)^2 (x - 1)}$ is to be expressed as a sum of partial fractions. From the general theorem one knows that the decomposition must have the form:

$$\frac{2x - 1}{(x + 2)^2 (x - 1)} = \frac{A_1}{(x + 2)^2} + \frac{A_2}{x + 2} + \frac{A_3}{x - 1}.$$

If one multiplies this equation by the denominator $(x + 2)^2 (x - 1)$, it follows that $2x - 1 = A_1(x - 1) + A_2(x + 2)(x - 1) + A_3(x + 2)^2$. Removing the brackets in this identity and collecting together like powers of x leads to $2x - 1 = (A_2 + A_3)x^2 + (A_1 + A_2 + 4A_3)x + (-A_1 - 2A_2 + 4A_3)$. Equating coefficients one obtains the following system of equations for the determination of $A_1$, $A_2$ and $A_3$: I. $A_2 + A_3 = 0$; II. $A_1 + A_2 + 4A_3 = 2$; III. $-A_1 - 2A_2 + 4A_3 = -1$. It has the unique solution $A_1 = 5/3$, $A_2 = -1/9$, $A_3 = 1/9$. Hence the required decomposition into partial fractions is:

$$\frac{2x - 1}{(x + 2)^2 (x - 1)} = \frac{5}{3(x + 2)^2} - \frac{1}{9(x + 2)} + \frac{1}{9(x - 1)}.$$
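The system of equations I to III of Example 1 can be solved exactly by elimination. The following sketch uses Gauss-Jordan elimination with rational arithmetic; the function name `solve` is an assumption made for illustration, and the routine presupposes a system with a unique solution.

```python
from fractions import Fraction

def solve(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(v) for v in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        # Scale the pivot row so that the pivot becomes 1.
        rows[col] = [v / rows[col][col] for v in rows[col]]
        # Eliminate this column from all other rows.
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# Unknowns in the order A1, A2, A3; equations I, II, III of Example 1.
A1, A2, A3 = solve([[0, 1, 1], [1, 1, 4], [-1, -2, 4]], [0, 2, -1])
print(A1, A2, A3)
```

Working with `Fraction` keeps the coefficients as exact fractions instead of rounded decimals.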


Example 2: The decomposition into partial fractions of $y = p_1(x)/q(x)$ with $p_1(x) = x^2 + 5x$ and $q(x) = x^4 - 2x^3 + 2x^2 - 2x + 1 = (x - 1)^2 (x^2 + 1)$ starts out from

$$\frac{x^2 + 5x}{(x - 1)^2 (x^2 + 1)} = \frac{A_1}{(x - 1)^2} + \frac{A_2}{x - 1} + \frac{B + Cx}{x^2 + 1}.$$

Multiplying by the common denominator one obtains $x^2 + 5x = A_1(x^2 + 1) + A_2(x^2 + 1)(x - 1) + (B + Cx)(x - 1)^2$ and further $x^2 + 5x = (A_2 + C)x^3 + (A_1 - A_2 + B - 2C)x^2 + (A_2 - 2B + C)x + (A_1 - A_2 + B)$. The coefficients $A_1$, $A_2$, B and C must satisfy the following system of equations: I. $A_2 + C = 0$; II. $A_1 - A_2 + B - 2C = 1$; III. $A_2 - 2B + C = 5$; IV. $A_1 - A_2 + B = 0$. One finds that $A_1 = 3$, $A_2 = 1/2$, $B = -5/2$ and $C = -1/2$. Hence the required decomposition is:

$$\frac{x^2 + 5x}{(x - 1)^2 (x^2 + 1)} = \frac{3}{(x - 1)^2} + \frac{1}{2(x - 1)} - \frac{5 + x}{2(x^2 + 1)}.$$
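The step-by-step method of the existence proof gives the same constants for the double zero x = 1 of Example 2. The sketch below takes $A = p_1(\alpha)/q_1(\alpha)$, forms $\Phi(x) = p_1(x) - A q_1(x)$, divides out $(x - \alpha)$ and repeats once; the helper names are assumptions made for illustration, and polynomials are coefficient lists from the highest power down.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate a polynomial exactly by Horner's scheme."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

def divide_by_linear(coeffs, alpha):
    """Divide by (x - alpha); assumes alpha is a zero, so the remainder is 0."""
    row = [Fraction(coeffs[0])]
    for c in coeffs[1:-1]:
        row.append(c + row[-1] * alpha)
    return row

alpha = 1
p1 = [1, 5, 0]   # x^2 + 5x
q1 = [1, 0, 1]   # x^2 + 1, what remains of q(x) after removing (x - 1)^2

A1 = evaluate(p1, alpha) / evaluate(q1, alpha)    # coefficient of 1/(x - 1)^2
Phi = [a - A1 * b for a, b in zip(p1, q1)]        # Phi(x) = p1(x) - A1*q1(x)
phi = divide_by_linear(Phi, alpha)                # Phi(x) = (x - 1) * phi(x)
A2 = evaluate(phi, alpha) / evaluate(q1, alpha)   # coefficient of 1/(x - 1)
print(A1, A2)
```

The subtraction in `Phi` relies on `p1` and `q1` having the same length here; for polynomials of different degree the shorter list would have to be padded with leading zeros.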


5.3. Non-rational functions 


The definition of non-rational functions — also known as irrational functions — is given already 
by their name: they are functions that are not rational. 


Root functions 


The function $y = \sqrt{x}$. Corresponding to the definition of the square root, the function $y = \sqrt{x}$ is defined only for non-negative values of x; if the maximal domain $0 \le x < +\infty$ is chosen as domain of definition, then the range is $0 \le y < +\infty$. It follows further from the definition that $y = \sqrt{x}$ is the inverse function of $y = x^2$ in the interval $0 \le x < +\infty$. The graph of the function $y = \sqrt{x}$ can be constructed with the help of a table of square roots or by taking the mirror image in the line determined by $y = x$ of the "half" parabola given by $y = x^2$ in the interval $0 \le x < +\infty$ (Fig.). The inverse function of $y = x^2$ in the interval $-\infty < x \le 0$ is $y = -\sqrt{x}$, with $0 \le x < +\infty$.


The function $y = \sqrt[3]{x}$. Strictly speaking, in the first instance this function is also defined only for non-negative values of the argument. In the interval of definition $0 \le x < +\infty$ it is the inverse of the function $y = x^3$ with $y \ge 0$ and hence $x \ge 0$. Its graph can again be obtained by taking the mirror image in the line $y = x$ of the "half" cubical parabola $y = x^3$ with $0 \le x < +\infty$ (Fig.).


5.3-1 Graphs of the functions $y = \sqrt{x}$ and $y = x^2$ as mirror images
5.3-2 Graph of the inverse function of $y = x^3$


On the other hand, the inverse function of $y = x^3$ with $y \le 0$, and hence $x \le 0$, is described by the equation $y = -\sqrt[3]{-x}$ in the interval of definition $-\infty < x \le 0$ (see the dotted curve in the figure). Consequently two equations are needed to describe explicitly the complete inverse function of $y = x^3$, which must exist because $y = x^3$ is one-to-one. But the two parts are usually subsumed under the single expression $y = \sqrt[3]{x}$ for all real values of x (see Chapter 3.).


The function $y = \sqrt[n]{x}$. According to the general definition of a root, n is assumed to be an integer greater than 1. The cases $n = 2$ and $n = 3$ have already been considered; the investigation of greater values of n yields nothing essentially new. For the domain of definition $0 \le x < +\infty$ and range $0 \le y < +\infty$, all the functions increase monotonically, but differ from one another in that for $n < m$, $\sqrt[n]{x} < \sqrt[m]{x}$ for $0 < x < 1$, and $\sqrt[n]{x} > \sqrt[m]{x}$ for $x > 1$. The functions $y = \sqrt[n]{x}$ are inverses of the power functions. If n is even, then $y = x^n$ is monotonic increasing in the interval $0 \le x < +\infty$, and consequently is invertible there with the inverse function $y_1 = \sqrt[n]{x}$; in the interval $(-\infty, 0]$, the functions $y = x^n$ and $y = -\sqrt[n]{x}$ are inverses to one another (Fig.). For odd values of n, $y = x^n$ is monotonic increasing in the whole domain of definition $(-\infty, +\infty)$ and has the inverse function $x = y^n$, whose explicit form is $y_1 = \sqrt[n]{x}$ for $0 \le x < +\infty$ and $y_2 = -\sqrt[n]{-x}$ for $-\infty < x < 0$ (Fig.) or $y = \sqrt[n]{x}$ for all real values of x.
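The two-branch description of the root for odd n carries over directly to a program, because a fractional power of a negative number is not a real operation in most languages. The following sketch, with the hypothetical name `real_root`, follows the case distinction of the text.

```python
def real_root(x, n):
    """Real n-th root: for x >= 0 if n is even, for all real x if n is odd."""
    if n < 2:
        raise ValueError("n must be an integer greater than 1")
    if x >= 0:
        return x ** (1.0 / n)
    if n % 2 == 0:
        raise ValueError("even roots of negative numbers are not real")
    # Odd n and negative x: use the branch y = -root(-x).
    return -((-x) ** (1.0 / n))

print(real_root(-8, 3))                       # approximately -2
print(real_root(0.25, 2), real_root(0.25, 4)) # smaller index gives smaller root on (0, 1)
print(real_root(16, 2), real_root(16, 4))     # and the larger root for x > 1
```

The last two lines illustrate the ordering of the roots for n < m on either side of x = 1.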

5.3-3 Graphs of the functions $y = +\sqrt[n]{x}$ and $y = -\sqrt[n]{x}$ for two even values of n
5.3-4 Graphs of the functions $y = \sqrt[n]{x}$ and $y = -\sqrt[n]{-x}$ for two odd values of n


Exponential functions 


The e-function $y = e^x$. The correspondence rule can be given by means of an explicit expression containing infinitely many rational operations, from which the function value for every real (or complex) value of x can be calculated with arbitrary precision by substitution in the series $y = e^x = 1 + x/1! + x^2/2! + x^3/3! + \cdots$. For the particular value $x = 1$ the value of the transcendental number $e = 2.718\,281\,828\,459\ldots$ is obtained. Some tables of logarithms contain rounded values of this function.
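How quickly the series yields the function value can be tried out with a short partial-sum routine. The name `exp_series` and the default number of terms are assumptions made for illustration.

```python
import math

def exp_series(x, terms=20):
    """Partial sum 1 + x/1! + x^2/2! + ... with the given number of terms."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)   # next term x^(n+1) / (n+1)!
    return total

print(exp_series(1.0))   # approximation of e
print(math.e)            # library value for comparison
```

For moderate |x| a few dozen terms are enough for full double precision; for large |x| more terms are needed.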


Table of values of the function $y = e^x$


By the rules of indices, $e^0 = 1$ and $e^{-x} = 1/e^x$. Because the function $y = e^x$ assumes only positive values for positive values of x and increases monotonically without limit as $x \to \infty$, it follows that the function $y = e^{-x}$ also assumes only positive values, and that the values of y decrease monotonically with increasing argument x. As $x \to +\infty$, the curve approaches the x-axis asymptotically (Fig.).

The exponential function is often called the natural growth function, because many natural processes lead to this function. If the rate of increase or decrease $\pm dN/dt$ at time t of a number N of objects under consideration is proportional to the number N itself, so that $\pm dN/dt = Nk$, where k is the constant of proportionality, then $dN/N = \pm k\,dt$, or $N = N_0 e^{\pm kt}$, where $N_0$ is the number at time $t = 0$. For example, the growth of a forest, the growth of the population of the earth, and radioactive decay are based on this function.
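The growth law can be sketched as follows. The initial number `n0` (the value of N at t = 0, a constant of integration) and all numerical values are assumptions made for illustration.

```python
import math

def population(n0, k, t, growth=True):
    """N(t) = n0 * e^(+kt) for growth, n0 * e^(-kt) for decay (k > 0)."""
    sign = 1.0 if growth else -1.0
    return n0 * math.exp(sign * k * t)

def half_life(k):
    """Time after which a decaying quantity has fallen to one half: ln 2 / k."""
    return math.log(2.0) / k

k = 0.1                                   # hypothetical decay constant per unit time
t_half = half_life(k)
print(t_half)
print(population(1000.0, k, t_half, growth=False))   # about 500
```

The half-life depends only on k, not on the initial number, which is characteristic of exponential decay.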


5.3-5 Graphs of the functions $y = e^x$ and $y = e^{-x}$
5.3-6 Graphs of the logarithmic and exponential functions


The function $y = a^x$. By the rules of indices, $a = e^{\ln a}$, because the number $\ln a$ is the power to which e must be raised to give a. One obtains $y = a^x = e^{x \ln a}$, that is, the general exponential function is an e-function $y = e^{kx}$, whose interval of definition has been stretched or contracted uniformly by the constant factor $k = \ln a$. The interval from x to $x + 1$ then no longer is of length 1, but $1 \cdot k = \ln a = \lg a/\lg e = 2.30259\ldots \cdot \lg a$. This value is, for example, less than 1 ($k = 0.693$) for $a = 2$ and greater than 1 ($k = 2.30$) for $a = 10$. If the argument x of $e^x$ increases by 1, the argument kx of the function $y = 2^x = e^{x \ln 2}$ increases by only 0.693, but that of the function $y = 10^x = e^{x \ln 10}$ increases by 2.30. The function $y = 2^x$ increases more slowly and the function $y = 10^x$ more quickly than the e-function $e^x$ (Fig.).

In general, $y = a^x$ increases more slowly than $y = e^x$ for $1 < a < e$, and more quickly for $a > e$.


Table of values for the functions $y_1 = 2^x$ and $y_2 = 10^x$
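Values of $2^x$ and $10^x$ can be computed from the relation $a^x = e^{x \ln a}$. The sketch below does so for a few arguments; the function name `power` and the selection of x values are assumptions made for illustration and do not reproduce the printed table.

```python
import math

def power(a, x):
    """General exponential a^x computed as e^(x ln a), for a > 0."""
    return math.exp(x * math.log(a))

# Sample arguments chosen for illustration.
for x in (-2, -1, 0, 1 / 3, 1 / 2, 1, 2):
    print(f"{x:6.3f}  {power(2, x):8.4f}  {power(10, x):10.4f}")

print(math.log(2), math.log(10))   # the stretching factors k = ln a
```

The two stretching factors printed at the end are the values 0.693 and 2.30 quoted for a = 2 and a = 10.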


In the domain of the real numbers the logarithm is defined neither for $a = 0$ nor for negative values of a; the general exponential function $y = a^x = e^{x \ln a}$ therefore exists only for positive values of the basis a. The nearer the basis is to the value 1, the flatter is the curve of the exponential function; for $a = 1$ the function value $y = 1$ belongs to every arbitrary value of x. The function curve is then a line parallel to the x-axis. For $0 < a < 1$, $y = a^x = (1/a)^{-x}$, so that the graph of $y = a^x$ is obtained by taking the mirror image in the y-axis of the function curve of $y = b^x$ with $b = 1/a > 1$.




The functions $y = k \cdot a^x$. On account of the constant positive factor k, the ordinate values y of the function are stretched ($k > 1$) or contracted ($k < 1$) in the ratio $1 : k$. It can be shown that in this way the curve goes into itself up to a translation parallel to the x-axis. Because $k = e^{\ln k}$, it follows that $y = k \cdot a^x = e^{\ln k} \cdot e^{x \ln a} = e^{x \ln a + \ln k}$, so that there is a translation parallel to the x-axis by an amount $c = -\ln k/\ln a$; for the e-function ($a = e$) this is $c = -\ln k$.
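That multiplying by k only shifts the curve can be verified numerically. In the sketch below the shift is expressed in units of x as $c = -\ln k/\ln a$, which reduces to $-\ln k$ for the basis e; the function name is an assumption made for illustration.

```python
import math

def shift_for_factor(k, a):
    """Shift c with k * a**x == a**(x - c), that is c = -ln k / ln a."""
    return -math.log(k) / math.log(a)

a, k = 2.0, 8.0
c = shift_for_factor(k, a)
print(c)                                   # -3: the curve moves 3 units to the left
for x in (-1.0, 0.0, 2.5):
    print(k * a**x, a ** (x - c))          # both columns agree
```

For a = 2 and k = 8 the relation is simply $8 \cdot 2^x = 2^{x+3}$.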


Logarithmic functions 


The function $y = \log_a x$. This function is the inverse of the exponential function $y = a^x$, which is, of course, monotonic in its whole domain of definition. Because the range of the exponential function is $0 < y < +\infty$, the logarithmic function can be defined only for positive values of the argument, and thus has the domain of definition $0 < x < +\infty$. Special inverse functions are $y = \ln x$ for $y = e^x$, and $y = \lg x$ for $y = 10^x$. Consequently their graphs are obtained by taking the mirror images of the graphs of $y = e^x$ and $y = 10^x$ in the line $y = x$ (see Fig. 5.3-6). Information about $y = \lg x$ and about the slide rule is contained in Chapter 2.


The function $y = \log_a x^k$. Clearly $y = k \log_a x$, and the function values are therefore given by multiplying by the constant k. Its value can also be negative, because for $k = -\kappa$ with $\kappa > 0$ one obtains $y = \log_a x^{-\kappa} = \log_a (1/x^\kappa) = -\log_a x^\kappa = -\kappa \log_a x$. For $k = -1$ in particular, the graph of the function is the result of taking the mirror image in the x-axis of the graph of the function $y = \log_a x$. The function $y = -\log_a x = \log_a x^{-1} = \log_a (1/x)$ is the inverse of the function $y = a^{-x}$.


The function $y = \log_a (kx)$. For positive values of the constant k, because $y = \log_a (kx) = \log_a k + \log_a x$, the graph of this function is obtained from that of $y_1 = \log_a x$ by translation parallel to the y-axis through a distance $d = \log_a k$. For negative values $\kappa = -k$ ($k > 0$) the function is defined only for negative values of x, and then because $\kappa x = |kx|$ the function values are the same as those of $y = \log_a (kx)$ with $0 < x < +\infty$.


Trigonometric and circular functions 


The trigonometric or angular functions and the circular or arc functions are a very frequently oc- 
curring type of irrational function. They are investigated more thoroughly in Chapter 10. 


Connections between trigonometric and circular functions. Because the circular functions are 
defined as inverse functions of the trigonometric functions, the following connections are immediate: 
sin (arcsin x) = x; cos (arccos x) = x and so on. For positive x, if the principal value is taken on 
both sides, it is also true that Arccot x = Arctan (1/x), because cot x = 1/tan x. For this reason 
the function y = arccot x can be dispensed with for most investigations. The following further 
relations are of interest: 


It is enough here to give the justification for the first relation, because the others can be obtained in exactly the same way. From $\sin^2 y + \cos^2 y = 1$ it follows that $\sin y = \pm\sqrt{1 - \cos^2 y}$, or $\sin(\operatorname{Arccos} x) = \pm\sqrt{1 - x^2}$, for the principal value of arccos x, $0 \le \operatorname{Arccos} x \le \pi$. In this domain of definition the sine function has no negative values, $\sin(\operatorname{Arccos} x) \ge 0$; the square root can therefore have only a positive sign, and $\sin(\operatorname{Arccos} x) = +\sqrt{1 - x^2}$, as given in the table. Similarly for the principal value $-\pi/2 \le \operatorname{Arcsin} x \le +\pi/2$ it follows that $\cos(\operatorname{Arcsin} x) = +\sqrt{1 - x^2}$, because the cosine function takes no negative values in this interval. Using a connection between the trigonometric functions and the exponential function, first found by EULER, a relation between the inverse trigonometric functions and the logarithmic function can be obtained. These relations are valid in the domain of complex numbers.

From the relations derived above, for $\sin\varphi = x$ and the principal value $\operatorname{Arcsin} x = \varphi$ the first Euler formula becomes $e^{i\varphi} = ix + \sqrt{1 - x^2}$. By taking the logarithm one obtains $i\varphi = i \operatorname{Arcsin} x = \ln\left(ix + \sqrt{1 - x^2}\right)$. By similar calculations for the other inverse functions the following relations are obtained:
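The relation between Arcsin and the complex logarithm derived above, and the identity $\sin(\operatorname{Arccos} x) = \sqrt{1 - x^2}$, can be checked numerically with Python's complex-number module. The function name `arcsin_via_log` is an assumption made for illustration, and the check is restricted to real x with $-1 \le x \le 1$.

```python
import cmath
import math

def arcsin_via_log(x):
    """Arcsin x from i*Arcsin x = ln(ix + sqrt(1 - x^2)), for -1 <= x <= 1."""
    value = cmath.log(1j * x + math.sqrt(1.0 - x * x)) / 1j
    return value.real

for x in (-1.0, -0.5, 0.0, 0.3, 1.0):
    print(x, arcsin_via_log(x), math.asin(x))

# sin(Arccos x) equals the positive square root of 1 - x^2.
print(math.sin(math.acos(0.6)), math.sqrt(1.0 - 0.6**2))
```

`cmath.log` returns the principal value of the complex logarithm, which corresponds to the principal value of the arcsine on this interval.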




Hyperbolic functions 

The correspondences defined by the following function equations are called hyperbolic functions: 

1. Hyperbolic sine: $y = \sinh x = (e^x - e^{-x})/2$;
2. Hyperbolic cosine: $y = \cosh x = (e^x + e^{-x})/2$;
3. Hyperbolic tangent: $y = \tanh x = \dfrac{e^x - e^{-x}}{e^x + e^{-x}}$;
4. Hyperbolic cotangent: $y = \coth x = \dfrac{e^x + e^{-x}}{e^x - e^{-x}}$.

The function $y = \sinh x$. From the definition it follows that $y = (e^x - e^{-x})/2$ is defined for all values of x. The function has one zero at $x = 0$. If $x \to +\infty$, $e^{-x}$ becomes arbitrarily small. Because at the same time $e^x$ increases without limit, the values of the function become arbitrarily large. As $x \to -\infty$, $e^{-x}$ becomes arbitrarily large and $e^x$ approaches the value zero, so that the value of the function tends to $-\infty$. From the defining equation it follows further that $\sinh x = -\sinh(-x)$. Thus, the function is odd and its graph is centrally symmetric about the origin of coordinates; the range is $-\infty < y < +\infty$ (Fig.).


The function $y = \cosh x$. This function is also defined for all values of x and its range is given by $1 \le y < +\infty$, as can be seen from the equation of the function $y = (e^x + e^{-x})/2$. The function is even and its graph is symmetrical about the y-axis (Fig.). The shape of a heavy chain or rope hanging under its own weight is given by the graph of the function $y = a \cosh(x/a)$, where a is a suitable positive constant depending on the material (see Fig. 19.5-11).


5.3-7 Graphs of $y = \sinh x$ and $y = \cosh x$
5.3-8 Graphs of the functions $y = \tanh x$ and $y = \coth x$


The functions $y = \tanh x$ and $y = \coth x$. The first of these two functions is defined for all values of x, but for the second the value $x = 0$ must be excluded. The range of $y = \tanh x$ is bounded: $-1 < y < +1$. By contrast, the range of $y = \coth x$ is given by $-\infty < y < -1$ and $+1 < y < +\infty$. Both functions are odd (Fig.).
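The defining equations translate directly into code. The following sketch builds the four hyperbolic functions from the exponential function and compares them with the library versions; the function names mirror the mathematical ones and the test arguments are arbitrary.

```python
import math

def sinh(x):
    return (math.exp(x) - math.exp(-x)) / 2

def cosh(x):
    return (math.exp(x) + math.exp(-x)) / 2

def tanh(x):
    return sinh(x) / cosh(x)

def coth(x):
    # Not defined for x = 0, where sinh x vanishes.
    return cosh(x) / sinh(x)

x = 1.5
print(sinh(x), math.sinh(x))
print(cosh(x), math.cosh(x))
print(tanh(x), math.tanh(x))
print(cosh(x) ** 2 - sinh(x) ** 2)   # equals 1 up to rounding
```

The last line checks the relation $\cosh^2 x - \sinh^2 x = 1$, from which the name hyperbolic functions derives.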


Relations between the hyperbolic functions. From the equations of the functions the following 
identities are immediate consequences: 


The close similarity of these relations to those between the trigonometric functions justifies the 
use of the terms hyperbolic sine, hyperbolic cosine and so on. The reason one speaks of hyperbolic 
functions is clear when one considers the third relation. Putting $\cosh x = X$ and $\sinh x = Y$, this relation becomes $X^2 - Y^2 = 1$. This is the equation of a hyperbola in the X, Y-plane, but of course only the right-hand branch is represented because $\cosh x \ge 1$.


The inverse functions of the hyperbolic functions 


The hyperbolic functions are invertible. For y = sinh x and y = tanh x this can be seen from 
their graphs; for y = coth x this follows because the function is one-to-one. For y = cosh x an 




inverse function can be defined in each of the two intervals $-\infty < x \le 0$ and $0 \le x < +\infty$ in which the function is monotonic.


Inverse hyperbolic sine, $y = \sinh^{-1} x$. This is the inverse function of $y = \sinh x$. Solving the equation $y = (e^x - e^{-x})/2$ for x gives first $2y = e^x - e^{-x}$, or after multiplying by $e^x$, $2ye^x = e^{2x} - 1$, or $e^{2x} - 2ye^x - 1 = 0$. This is a quadratic equation for $e^x$. Only the solution $e^x = y + \sqrt{y^2 + 1}$ is relevant, because $y - \sqrt{y^2 + 1}$ is always negative, whereas $e^x$ can take only positive values. Taking logarithms finally yields $x = \ln\left[y + \sqrt{y^2 + 1}\right]$. From this one obtains at the same time $y = \ln\left[x + \sqrt{x^2 + 1}\right]$ as the explicit form of the equation of the inverse function; it represents $y = \sinh^{-1} x$. The graph of the function $y = \sinh^{-1} x$ is the mirror image in the line $y = x$ of the curve with the equation $y = \sinh x$.


Inverse hyperbolic cosine, $y = \cosh^{-1} x$. To invert $y = \cosh x$ one proceeds as for $y = \sinh x$ by similar steps to the equation $e^{2x} - 2ye^x + 1 = 0$, leading to $e^x = y \pm \sqrt{y^2 - 1}$. Finally one obtains the inverse functions $y = \ln\left[x - \sqrt{x^2 - 1}\right]$ for the interval $-\infty < x \le 0$, and $y = \ln\left[x + \sqrt{x^2 - 1}\right]$ for the interval $0 \le x < +\infty$. For both the domain of definition is $1 \le x < +\infty$, and the range is $-\infty < y \le 0$ in the first case and $0 \le y < +\infty$ in the second. The inverse function with the range $0 \le y < +\infty$ is called the principal value of the inverse hyperbolic cosine, which therefore has the explicit form $y = \cosh^{-1} x = \ln\left[x + \sqrt{x^2 - 1}\right]$.


Inverse hyperbolic tangent, $y = \tanh^{-1} x$. From the equation of the function $y = \dfrac{e^x - e^{-x}}{e^x + e^{-x}}$, first multiplying numerator and denominator by $e^x$, one obtains $y = (e^{2x} - 1)/(e^{2x} + 1)$, so that $ye^{2x} + y = e^{2x} - 1$, or $ye^{2x} - e^{2x} = -y - 1$. Multiplying by $(-1)$ and collecting together the terms in $e^{2x}$ leads to $e^{2x}(1 - y) = 1 + y$, or $e^{2x} = (1 + y)/(1 - y)$, $e^x = \sqrt{(1 + y)/(1 - y)}$. Taking logarithms and finally interchanging the variables gives the explicit equation $y = \ln\sqrt{(1 + x)/(1 - x)}$ or $y = (1/2) \ln\left[(1 + x)/(1 - x)\right]$ for the function $y = \tanh^{-1} x$. The domain of definition is limited to $-1 < x < +1$, but the range consists of all real numbers.

For completeness the equation of the inverse hyperbolic cotangent, $y = \coth^{-1} x$, is given here: $y = (1/2) \ln\left[(x + 1)/(x - 1)\right]$, with domain of definition $-\infty < x < -1$ and $1 < x < +\infty$.
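The logarithmic expressions for the inverse hyperbolic functions can be compared with the library implementations. In the sketch below the names `arsinh`, `arcosh`, `artanh` and `arcoth` are assumptions made for illustration; `arcosh` is the principal value with non-negative range.

```python
import math

def arsinh(x):
    return math.log(x + math.sqrt(x * x + 1))

def arcosh(x):
    # Principal value, defined for x >= 1.
    return math.log(x + math.sqrt(x * x - 1))

def artanh(x):
    # Defined for -1 < x < 1.
    return 0.5 * math.log((1 + x) / (1 - x))

def arcoth(x):
    # Defined for |x| > 1.
    return 0.5 * math.log((x + 1) / (x - 1))

print(arsinh(0.7), math.asinh(0.7))
print(arcosh(2.5), math.acosh(2.5))
print(artanh(0.3), math.atanh(0.3))
print(arcoth(2.0), math.atanh(0.5))   # coth^-1 x equals tanh^-1 (1/x)
```

For large negative arguments the formula for `arsinh` loses accuracy through cancellation, which is why library routines use a different evaluation there.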


Geometrical significance of the inverse functions. The inverse functions of the trigonometric
functions can be regarded geometrically as the arc lengths for which the functions sine, cosine,
tangent or cotangent have the given value x. If this arc length is taken as parameter t, then x = cos t
and y = sin t represent a point on the unit circle in the x, y-plane, because of the relation
$\cos^2 t + \sin^2 t = x^2 + y^2 = 1$. Similarly a parametric representation x = cosh t and
y = sinh t can be introduced for the hyperbolic functions. Because $\cosh^2 t - \sinh^2 t = 1$, the
point P(x, y) lies on the rectangular hyperbola $x^2 - y^2 = 1$ (Fig.). It can be shown that in
this case the parameter t represents twice the area between the segment |OV| of the x-axis, the
arc VP of the hyperbola up to the point $P(x_0, y_0)$ and the line joining P to the origin O. With
the help of integral calculus this area is obtained as follows:

Area VAP $= \int_1^{x_0} \sqrt{x^2 - 1}\, dx = \tfrac{1}{2} x_0 \sqrt{x_0^2 - 1} - \tfrac{1}{2} \ln |x_0 + \sqrt{x_0^2 - 1}|$,

Area OVPB $= \int_0^{y_0} \sqrt{y^2 + 1}\, dy = \tfrac{1}{2} y_0 \sqrt{y_0^2 + 1} + \tfrac{1}{2} \ln |y_0 + \sqrt{y_0^2 + 1}|$.

Fig. 5.3-9 The geometrical significance of the inverse hyperbolic functions

From this the area $t_0/2$ = OVP can be calculated in two ways.

1. OVP = OAP - VAP $= \tfrac{1}{2} x_0 y_0$ - VAP
   $= \tfrac{1}{2} x_0 \sqrt{x_0^2 - 1} - \tfrac{1}{2} x_0 \sqrt{x_0^2 - 1} + \tfrac{1}{2} \ln |x_0 + \sqrt{x_0^2 - 1}|$
   $= \tfrac{1}{2} \ln |x_0 + \sqrt{x_0^2 - 1}|$.
2. OVP = OVPB - OPB
   $= \tfrac{1}{2} y_0 \sqrt{y_0^2 + 1} + \tfrac{1}{2} \ln |y_0 + \sqrt{y_0^2 + 1}| - \tfrac{1}{2} y_0 \sqrt{y_0^2 + 1}$
   $= \tfrac{1}{2} \ln |y_0 + \sqrt{y_0^2 + 1}|$.

As was found in considering the inverse hyperbolic cosine function,
$\tfrac{1}{2} \ln |x_0 + \sqrt{x_0^2 - 1}| = \tfrac{1}{2} \cosh^{-1} x_0$, so that $t_0 = \cosh^{-1} x_0$. On the other hand,
in considering $\sinh^{-1} x$ it was shown that $\tfrac{1}{2} \ln |y_0 + \sqrt{y_0^2 + 1}| = \tfrac{1}{2} \sinh^{-1} y_0$, so that
$t_0 = \sinh^{-1} y_0$.
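
The area interpretation can be checked numerically. The following Python sketch is an illustration only and not part of the original text: the function name sector_area, the midpoint rule and the sample value t0 = 1.3 are assumptions chosen for the check. It computes twice the area OVP for the point P = (cosh t0, sinh t0) and compares it with the two logarithmic expressions.

```python
import math

def sector_area(x0, steps=100_000):
    """Area OVP bounded by OV, the arc VP of x^2 - y^2 = 1 and the line OP (x0 > 1)."""
    y0 = math.sqrt(x0 * x0 - 1.0)
    # Area VAP under the hyperbola between x = 1 and x = x0 (midpoint rule).
    h = (x0 - 1.0) / steps
    vap = sum(math.sqrt((1.0 + (i + 0.5) * h) ** 2 - 1.0) for i in range(steps)) * h
    # Triangle OAP minus the area VAP gives the sector OVP.
    return 0.5 * x0 * y0 - vap

t0 = 1.3                                  # assumed sample parameter
x0, y0 = math.cosh(t0), math.sinh(t0)
twice_area = 2.0 * sector_area(x0)
log_cosh = math.log(x0 + math.sqrt(x0 * x0 - 1.0))   # ln[x0 + sqrt(x0^2 - 1)]
log_sinh = math.log(y0 + math.sqrt(y0 * y0 + 1.0))   # ln[y0 + sqrt(y0^2 + 1)]
print(twice_area, log_cosh, log_sinh)
```

All three printed numbers should agree with t0 up to the accuracy of the numerical integration.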




5.4. Functions with more than one independent variable 


General definition 


If $M_1, M_2, \ldots, M_n$ are n non-empty sets, not necessarily distinct from one another, then one
can select one element from each set, say $x_1$ from $M_1$, $x_2$ from $M_2$, ..., $x_n$ from $M_n$, having regard
to the order of the sequence. The ordered collection $(x_1, x_2, \ldots, x_n)$ of all these elements is called
an n-tuple. If exactly one element of a set N is assigned to each n-tuple ordered according to the given
sequence, then one speaks of a function of n independent variables; in general this is written
$y = f(x_1, x_2, \ldots, x_n)$.


Real functions with two independent variables 


In the following functions the domain of definition consists of ordered pairs of real numbers, 
while the range is contained in the set of real numbers. In general form this is usually written as 
z = f(x, y), where z is used for the dependent variable, and x and y for the independent variables. 


Representation of the domain of definition in the plane. The domain of definition of a real function 
of two independent variables may have geometrical significance. Because each ordered pair of real 
numbers corresponds to a unique point of a plane provided with a coordinate system, and conversely, 
plane regions of the most diverse shape can serve as domains of definition; for example, the domain
may be connected, or at the other extreme, it may consist of isolated points only.

Example 1: The domain of definition given by $-\infty < x < +\infty$ and $0 \le y < +\infty$ corresponds to the
upper half-plane of the x, y-plane, including the x-axis (Fig.).

Example 2: With the condition $x^2 + y^2 < 1$, the domain of definition is the interior of the unit circle
(Fig.).

Example 3: The conditions $-\infty < x \le -1$ or $1 \le x < +\infty$ and $-\infty < y < +\infty$, together with
$-1 < x < +1$ and $+1 \le y < +\infty$ or $-\infty < y \le -1$, determine as domain of definition the whole plane
apart from the interior of a square (Fig.).

Fig. 5.4-1 Geometrical representations of the regions given in the examples
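
The three domains of definition can also be written as membership tests on pairs (x, y). The following Python sketch is an illustration only; the function names are assumptions introduced here and do not occur in the text.

```python
def in_upper_half_plane(x, y):
    # Example 1: every x is admitted, y >= 0 (the x-axis is included).
    return y >= 0

def in_open_unit_disc(x, y):
    # Example 2: x^2 + y^2 < 1 (the boundary circle is excluded).
    return x * x + y * y < 1

def outside_open_square(x, y):
    # Example 3: the whole plane apart from the interior of the square
    # with vertices (+-1, +-1); the sides of the square are included.
    return abs(x) >= 1 or abs(y) >= 1

print(in_upper_half_plane(-3.0, 0.0), in_open_unit_disc(0.6, 0.8), outside_open_square(0.5, 1.0))
```

The point (0.6, 0.8) lies on the unit circle itself and is therefore not in the domain of Example 2, while (0.5, 1) lies on a side of the square and is in the domain of Example 3.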


Representation of functions in space. Because altogether three variables occur, one uses for the 
geometrical representation a space coordinate system with three axes, usually a rectangular system. 

To each ordered triple of real numbers there corresponds exactly one point in the space coordinate 
system, and vice versa. On the basis of this unique correspondence all real functions of two
independent variables can be represented geometrically. If the number $z_0$ is associated with the pair
$(x_0, y_0)$ by means of the function z = f(x, y), then this corresponds geometrically to the point $P_0$
with the coordinates $(x_0, y_0, z_0)$. The function is assumed to be such that its graph represents a
surface.

The question now arises, how can one obtain in an individual case an idea of the nature of the 
particular surface. It would, of course, be possible in principle to draw up a table of values and 
hence produce a drawing. However, to arrive in this way at an intuitive picture, the table of values 
would have to be very extensive. In practice, therefore, one usually makes use of other methods; 
for example, to determine possible extrema, saddle points, and so on, one draws on the results 
of the differential calculus. This will not be gone into more deeply here. One certainly obtains con- 
siderable insight by keeping one of the three variables of the function constant. One selects, for 
example, all those pairs (x, y) from the domain of definition, whose x-value is equal to a fixed 
number c. From z = f(x, y) one obtains the function equation z = f(c, y), which now contains only 
one independent variable. Its graph is a curve, the curve of intersection of the surface determined by 
z = f(x, y) with the fixed plane given by x = c. If one constructs these curves for different fixed 
values of x, one obtains a family of curves that can give an approximate picture of the surface. 
Naturally the same method can also be applied to the variable y. The matter is slightly different 
if the dependent variable z is kept constant. Each special value of z then leads to an equation with 
two variables: z = f(x, y) leads to f(x, y) = c. The set of solution pairs (x, y) that satisfy this
equation is interpreted geometrically as a point set in the plane given by z = c. Moreover, if one 
postulates that c belongs to the range of the function, this point set is not empty. In general, it forms 
certain curves. Because one usually thinks of the z-axis as being vertical, these curves are called 
contour lines or level curves. In principle they are the same as the contour lines in a geographical 
map. In both cases the points lying on them are those that are at the same height above or below 




some standard level; in geographical maps this is usually sea level and in the case in question it is 
the x, y-plane. 


Example 1: The function z = x + y is defined in the whole x, y-plane. Its range is clearly
$-\infty < z < +\infty$. If x is kept constant, then to each special value of x there corresponds an
equation of the form z = y + c. Geometrically this represents a family of straight lines. If one
investigates the contour lines given by x + y = c, one obtains a family of parallel straight lines.
The graph of the function z = x + y is a plane (Fig.).

Fig. 5.4-2 Geometrical representation of the function z = x + y
Fig. 5.4-3 Geometrical representation of the function $z = \sqrt{4 - x^2 - y^2}$


Example 2: The function $z = \sqrt{4 - x^2 - y^2}$ is defined only in the region $x^2 + y^2 \le 4$; thus,
the domain of definition is a circular disc with centre at the origin and radius 2. Its range is bounded:
$0 \le z \le 2$. Keeping x constant leads to an equation of the form $z = \sqrt{(4 - c^2) - y^2}$. This
represents geometrically a semicircle. The result is the same if y is kept constant. The contour
lines are given by $c = \sqrt{4 - x^2 - y^2}$, which can be rearranged in the form $x^2 + y^2 = 4 - c^2$.
This represents a circle of radius $\sqrt{4 - c^2}$. Consequently the geometrical representation of
$z = \sqrt{4 - x^2 - y^2}$ is a hemisphere (Fig.).

Example 3: The function z = xy is defined in
the whole x, y-plane. Its range is $-\infty < z < +\infty$.
Keeping x constant, one obtains the functions
z = cy whose graphs are straight lines. Of course
these are not parallel, as they were in Example 1.
The same result is obtained when y is kept
constant. In this case the contour lines are
hyperbolas with the equation xy = c ($c \ne 0$).
For c = 0 the x-axis and the y-axis are obtained
as contour lines (Fig.).

If the surface is cut by a plane perpendicular 
to the x, y-plane, then the curves of intersection 
are parabolas, provided that the intersecting 
plane is not parallel to the x, z-plane or to the 
y, z-plane; in these excluded cases the intersec- 
tions are the lines already found above. 

This can be seen as follows: The intersecting
planes all have equations of the form Ax + By
+ C = 0, which can be rearranged in the form
y = ax + b. Substituting the expression ax + b
for y in z = xy, one obtains $z = ax^2 + bx$.
But these are equations of parabolas. Their
vertices lie at x = -b/(2a), y = b/2, that is, in the plane y = -ax.

The function z = xy represents geometrically a surface known as a hyperbolic paraboloid.
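
The substitution used above can be checked numerically. The following Python sketch is an illustration only; the function names and the sample plane y = 2x + 3 are assumptions chosen for the check. It confirms that the heights of the surface z = xy above the line y = ax + b agree with the parabola z = ax^2 + bx, and that points of the form (x, c/x) lie on the contour line xy = c.

```python
def height_above_line(a, b, x):
    """Height of z = x*y above the point (x, a*x + b) of the line y = a*x + b."""
    y = a * x + b
    return x * y

def parabola(a, b, x):
    """The curve of intersection as stated in the text: z = a*x^2 + b*x."""
    return a * x * x + b * x

a, b = 2.0, 3.0                      # assumed sample plane y = 2x + 3
xs = [-2.0, -0.75, 0.0, 1.5, 4.0]
section_is_parabola = all(
    abs(height_above_line(a, b, x) - parabola(a, b, x)) < 1e-12 for x in xs
)

c = 2.5                              # assumed contour level, c != 0
on_contour = all(abs(x * (c / x) - c) < 1e-12 for x in [-3.0, -0.5, 0.25, 2.0])
print(section_is_parabola, on_contour)
```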


5.4-4 Contour lines of the function z = xy 


Occurrences in other fields. Real functions of two independent variables are used to express not
only mathematical, but also physical and technological relationships, among others. Examples
are: area formulae such as A = ab and A = gh/2; volume formulae such as $V = \pi r^2 h$ and $V = a^2 h/3$;
solution formulae for equations such as $x = -p/2 + \sqrt{p^2/4 - q}$; formulae for Ohm's law I = E/R
or for the connection s = vt between distance, speed and time; formulae for the cutting speed of
lathes $v = \pi d n/1000$, and so on. The function formulae can also occur in implicit form, in which
it is not specified in advance which variable is to be regarded as dependent. An example of this is
given by the equation of state for ideal gases $pV_m = RT$, giving the mutual dependence of pressure p,
volume $V_m$ (volume of a mole of gas) and absolute temperature T; R is the absolute gas
constant. Each of these three variables can be regarded as dependent. One must just take into
account that from physical considerations only positive values of the variables are applicable here.
Curves resulting from keeping T constant are called isotherms and are in principle just like the
contour lines of the function z = xy for positive values of x and y.

A somewhat more complicated example is van der Waals' equation of state for real gases,
$(p + a/V^2)(V - b) = RT$, in which a and b are constants depending on the particular gas.


Real functions with n independent variables


In the following some general properties and also some special cases of this kind of function 
will be considered, without giving an exhaustive survey in a systematic way. 


Domain of definition and representation of the functions. If the domain of definition consists of 
ordered triples of real numbers, then it can still be represented geometrically. It can then, in general, 
be regarded as a region in an x, y, z-coordinate system in space. The function as a whole is then 
expressed geometrically as a single-valued correspondence that assigns to each point in space in the 
domain of definition a particular number. Such functions occur, for example, in physics in the descrip- 
tion of electric, magnetic, or of gravitational fields. For the numbers corresponding to the points 
in space, the concept of potential is used. Points with the same potential form the so-called equi- 
potential surfaces, which represent essentially the same thing as the contour lines for functions 
of two independent variables. 

If one is concerned with functions having more than three independent variables, then a geometrical 
representation in the sense so far discussed — that is, a visual realization of the function — is no longer 
possible. 


Symmetric functions. A real function of n independent variables is said to be symmetric if one
can interchange the independent variables arbitrarily among themselves without thereby altering
the function. The most important are polynomial and rational symmetric functions. A polynomial
function $y = f(x_1, x_2, \ldots, x_n)$ is said to be symmetric, if for every permutation
$\begin{pmatrix} x_1 & x_2 & \ldots & x_n \\ x_{i_1} & x_{i_2} & \ldots & x_{i_n} \end{pmatrix}$
of the variables $x_1, x_2, \ldots, x_n$, one has $f(x_1, x_2, \ldots, x_n) = f(x_{i_1}, x_{i_2}, \ldots, x_{i_n})$.
Examples of polynomial symmetric functions:
1. $y = x_1 + x_2 + \cdots + x_n$, hence, in particular, the case of the function z = x + y already
examined.
2. $y = x_1 x_2 \cdots x_n$; for example, the function z = xy already considered.
3. $y = x_1 x_2 + x_1 x_3 + x_2 x_3$.
4. $y = x_1^2 + x_1 x_2 + x_2^2$.
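A special role is played by the elementary symmetric functions. They are given here in full for the
case n = 4:
$\sigma_1 = x_1 + x_2 + x_3 + x_4$,
$\sigma_2 = x_1 x_2 + x_1 x_3 + x_1 x_4 + x_2 x_3 + x_2 x_4 + x_3 x_4$,
$\sigma_3 = x_1 x_2 x_3 + x_1 x_2 x_4 + x_1 x_3 x_4 + x_2 x_3 x_4$,
$\sigma_4 = x_1 x_2 x_3 x_4$.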


By Vieta's root theorem, these elementary symmetric functions give the coefficients, up to their
signs, of the polynomial that has $x_1, x_2, \ldots, x_n$ as its roots. For example, for
$x^4 + ax^3 + bx^2 + cx + d = 0$ with roots $x_1$, $x_2$, $x_3$ and $x_4$:
$a = -\sigma_1(x_1, x_2, x_3, x_4)$; $b = +\sigma_2(x_1, x_2, x_3, x_4)$;
$c = -\sigma_3(x_1, x_2, x_3, x_4)$; $d = +\sigma_4(x_1, x_2, x_3, x_4)$.
The following theorem holds for the elementary symmetric functions:
Every polynomial symmetric function with n independent variables can be expressed as a polynomial
in the elementary symmetric functions $\sigma_1, \sigma_2, \ldots, \sigma_n$.


Instead of a proof, which would require fairly complicated explanations, here is a simple
example, which will make the situation clear. The symmetric function $f(x_1, x_2) = x_1^2 - x_1 x_2 + x_2^2$
can be expressed in the form $g(\sigma_1, \sigma_2) = \sigma_1^2 - 3\sigma_2$, as can be verified by substitution:
$\sigma_1^2 = x_1^2 + 2x_1 x_2 + x_2^2$ and $3\sigma_2 = 3x_1 x_2$ in fact give $\sigma_1^2 - 3\sigma_2 = x_1^2 - x_1 x_2 + x_2^2$.
A corresponding theorem holds for rational functions.


Every rational symmetric function can be expressed as the quotient of two polynomial symmetric 
functions. 
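
Vieta's root theorem and the small example above can be checked on concrete numbers. The following Python sketch is an illustration only; the function names and the sample roots 1, 2, 3, 4 are assumptions chosen here. It computes the elementary symmetric functions directly and compares them with the coefficients of the polynomial having these roots.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(xs, k):
    """Sum of all products of k distinct entries of xs."""
    return sum(prod(c) for c in combinations(xs, k))

def monic_coefficients(roots):
    """Coefficients [1, a, b, ...] of the product of (x - root), highest power first."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        coeffs = [c - r * p for c, p in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

roots = [1, 2, 3, 4]                       # assumed sample roots
sigma = [elementary_symmetric(roots, k) for k in range(1, 5)]
_, a, b, c, d = monic_coefficients(roots)
print(sigma, (a, b, c, d))

# The example f(x1, x2) = x1^2 - x1*x2 + x2^2 = sigma1^2 - 3*sigma2.
x1, x2 = 3, 5
f_direct = x1 * x1 - x1 * x2 + x2 * x2
f_sigma = (x1 + x2) ** 2 - 3 * (x1 * x2)
```

The signs alternate as stated: a and c are the negatives of the first and third elementary symmetric functions, while b and d equal the second and fourth.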




Finally, it should be noted that in cases in which symmetric functions can be represented geometric- 
ally, they also exhibit properties of geometrical symmetry. For example, the representations of the 
functions z = x + y and z = xy already considered both have the plane x = y as a plane of sym- 
metry. 


Homogeneous functions. A function with n independent variables is called homogeneous of degree m
if the function value is multiplied by $t^m$ when each individual independent variable is multiplied
by t; this means that $f(tx_1, tx_2, \ldots, tx_n) = t^m f(x_1, x_2, \ldots, x_n)$. Homogeneous polynomial functions
are of special interest. A homogeneous polynomial function of degree m is often called a form of
degree m. For m = 2, one speaks of a quadratic form, for m = 3, a cubic form. In the case m = 1
such a function is called a linear form.


2 fee) = st inti ers « 
(xi, X30 X01 y= x, + x2 — x3 — a 
The following theorem holds for homogeneous polynomials: 


The product of non-zero homogeneous polynomials is again a homogeneous polynomial. Its degree 
is equal to the sum of the degrees of the individual polynomials. 


The proof of this theorem follows at once from the multiplication rule for polynomials. 

Homogeneous functions, in particular homogeneous polynomials, play a part in various branches 
of mathematics. For example, a determinant with n rows and n columns is a homogeneous function 
of degree n, with n? independent variables. Quadratic forms such as F(x, y) = Ax? + Bxy + Cy?, 
that is, homogeneous functions of the second degree having two independent variables, occur in 
the theory of quadratic number fields. From another point of view, certain quadratic forms will
be investigated in analytic geometry. 
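
The defining property of a homogeneous function can be tested numerically. The following Python sketch is an illustration only; the coefficients of the quadratic form, the 3 x 3 matrix and the factor t = 1.5 are assumptions chosen for the check. A quadratic form should be multiplied by t^2, and a determinant with 3 rows and 3 columns by t^3, when every variable is multiplied by t.

```python
def quadratic_form(x, y, A=2.0, B=-3.0, C=5.0):
    # F(x, y) = A x^2 + B x y + C y^2 with assumed coefficients.
    return A * x * x + B * x * y + C * y * y

def det3(m):
    """Determinant of a 3 x 3 matrix given as a list of rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def scaled(m, t):
    """Multiply every entry (every independent variable) by t."""
    return [[t * entry for entry in row] for row in m]

t = 1.5
m = [[2.0, -1.0, 0.5], [4.0, 3.0, 1.0], [-2.0, 0.0, 7.0]]
ratio_form = quadratic_form(t * 0.7, t * -1.2) / quadratic_form(0.7, -1.2)
ratio_det = det3(scaled(m, t)) / det3(m)
print(ratio_form, t ** 2, ratio_det, t ** 3)
```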


6. Percentages, interest and annuities 


6.1. Percentages
6.2. Simple interest
6.3. Compound interest
6.4. Annuities


6.1. Percentages 


Percentage and actual value. In very many areas of everyday life one comes across the concept
of percentage. It is stated, for example, by how many per cent in a certain time interval production
has increased or net costs have been increased, or what percentage of the population is male or
female. In all these statements a comparison is made. The reference values, for example, the
production or net costs at a particular point in time or the total population, are taken to be equal to 100,
and the numbers to be compared, called actual values, for example, the production or net costs at
another point in time or the female population, are referred to 100. The numerators of
these fractions with denominator 100 are called percentages and denoted by %. If p is the
percentage, a the reference value and b the actual value, then b/a = p/100 or p = 100 · b/a.


If two vessels have capacities of 5 gal. and 10 gal., respectively, but contain 3 gal. and
4 gal. of liquid, then the 10 gal. vessel contains more liquid than the 5 gal. one, but it is
less well utilized in proportion to its capacity, namely in the ratio 4 : 10, compared with 3 : 5
for the 5 gal. vessel. If one calculates relative to the reference value 100, then $4/10 = x_1/100$,
or $x_1 = 4 \cdot 100/10 = 40$, while $3/5 = x_2/100$, or $x_2 = 3 \cdot 100/5 = 60$. Thus, 40% of the capacity
of the 10 gal. vessel is utilized, and 60% of the 5 gal. vessel (Fig.).

Fig. 6.1-1 Utilization of capacity

Example 1: Among 1500 employees in a factory there are 300 women. Thus, among 100 employees
there are on average 300/15 = 20 women. Although it is obvious from this that 20% of the
workers in this factory are women, the result can be obtained formally by substitution in the
formula derived. From the reference value a = 1500 and the actual value b = 300 the percentage p
is given by p = 300 · 100/1500 = 20; that is, 20% of the employees are women (Fig.).

Fig. 6.1-2 Composition of employees illustrated by areas

Example 2: How many pounds of titanium are contained in 275 lb of a steel alloy if it contains
4% of titanium? Here the actual value b is required and the reference value a = 275 and the
percentage p = 4 are given. Thus, the above formula is to be solved for the actual value. This gives
b = p · a/100 and for the numbers of the example b = 4 · 275/100 = 11. The steel alloy contains
11 lb of titanium.

Example 3: The average annual milk yield of 2800 kg per cow is raised by 8%. From
b = p · a/100 one obtains b = 8 · 2800/100 = 224. The milk yield is increased by
224 kg to 3024 kg per cow (Fig.).

Fig. Raising of the milk yield

Example 4: By improved planning the cost of transport of bricks in a quarter can be reduced
by £ 4800 or 12%. Here the reference value a is not known, but can be calculated from a = b · 100/p.
One finds that a = 4800 · 100/12 = 40000. The cost of transport before was £ 40000; now it is
£ 35200.

Example 5: In one year 3600 articles were manufactured. Production has increased to 120%
compared with the previous year. How many articles were produced the previous year? From the
actual value b = 3600 and the percentage p = 120 the reference value a can be calculated:
a = 3600 · 100/120 = 3000. In the previous year 3000 items were produced.
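
The three forms of the percentage formula used in Examples 1 to 5 can be collected in a few lines of Python. This sketch is an illustration only; the function names are assumptions introduced here.

```python
def percentage(actual, reference):
    """p = 100 * b / a"""
    return 100 * actual / reference

def actual_value(p, reference):
    """b = p * a / 100"""
    return p * reference / 100

def reference_value(actual, p):
    """a = 100 * b / p"""
    return 100 * actual / p

print(percentage(300, 1500))        # Example 1: women among the employees
print(actual_value(4, 275))         # Example 2: pounds of titanium
print(actual_value(8, 2800))        # Example 3: increase of the milk yield in kg
print(reference_value(4800, 12))    # Example 4: earlier cost of transport
print(reference_value(3600, 120))   # Example 5: production of the previous year
```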


6.2. Simple interest 


In financial transactions it is customary to pay compensation for the loan of a sum of money;
that depends on the amount of money borrowed and on the time and is called interest. For example,
in 1970 the National Savings Bank paid interest to the public at the rate of 3.5%
of the amount of money deposited per annum (Fig.). These deposits, however,
do not lie idle; on the contrary the banks use them to provide short-term and
long-term credits, for example, for large purchases. For these loan facilities the
banks in their turn require interest.


Rate of interest. The percentage rate of interest r states that for every £ 100 one
obtains interest of £ r in one year. A sum of money, or principal, £ P contains
P/100 times £ 100, and thus earns in 1 year interest of £ (P/100) · r and in n years £ I,
where I = P · r · n/100.


Example 1: How much interest is payable on a mortgage of £ 2000 for 5 years at 7%? The
principal P = 2000, the rate of interest r = 7% and the number of years n = 5 are given. The
interest formula gives I = (2000 · 7 · 5)/100 = 700. The interest payable on the mortgage is £ 700
in 5 years.

Example 2: What is the rate of interest if a principal of £ 1200 earns interest of £ 576 in 6 years?
In this case the principal P = 1200, the interest I = 576 and the number of years n = 6 are given.
The interest formula gives for the rate of interest r = (100 · I)/(P · n), or with the values of this
example, r = (100 · 576)/(1200 · 6) = 8. Thus, the rate of interest is 8%.

Example 3: What is the interest on a principal of £ 400 in 5 months at 9% and at 11%?
At 9% interest, I = (400 · 9 · 5)/(100 · 12) = 15.0. At 11%, I = (400 · 11 · 5)/(100 · 12) = 18.33.
Hence in 5 months £ 400 earns interest of £ 15.0 at 9% and £ 18.33 at 11%.
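
The interest formula and its rearrangement for r can be written as two short Python functions. This sketch is an illustration only; the function names are assumptions, and the 5 months of Example 3 are entered as 5/12 of a year.

```python
def simple_interest(principal, rate_percent, years):
    """I = P * r * n / 100"""
    return principal * rate_percent * years / 100

def rate_from_interest(interest, principal, years):
    """r = 100 * I / (P * n)"""
    return 100 * interest / (principal * years)

print(simple_interest(2000, 7, 5))                  # Example 1
print(rate_from_interest(576, 1200, 6))             # Example 2
print(round(simple_interest(400, 9, 5 / 12), 2))    # Example 3 at 9%
print(round(simple_interest(400, 11, 5 / 12), 2))   # Example 3 at 11%
```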


6.3. Compound interest 


Savings banks add the interest accruing at the end of a year to the principal; hence in the following 
year interest is also calculated on this interest. This method of calculation is called compound interest. 
Annuities and repayments of loans are also calculated with compound interest. Annuities are sums 
of money paid at fixed intervals of time, for example, yearly. In insurance the payment depends 
either on reaching a fixed point in time, for example, for an old-age pension, or on the occurrence 
of particular circumstances, for example, for sickness and accident benefit. A principal $P_0$ yields
interest of $P_0 \cdot r/100$ in one year at r%.

From the beginning of the second year the principal $P_1 = P_0 + P_0 \cdot (r/100) = P_0(1 + r/100)$
$= P_0 R$ with R = (1 + r/100) will earn interest.

In one year this increases by $P_1 \cdot r/100$ to $P_2 = P_1 + P_1(r/100) = P_1 R = P_0 R^2$. The same
argument holds for the 3rd, ..., nth year. At the end of the nth year the principal $P_0$ has grown by
compound interest to the sum, called the amount,
$P_n = P_0 R^n$, where R = (1 + r/100) is the growth factor.


Example: At 6% a principal of £ 1500 grows to the amount £ 2007.33 in 5 years, and interest
for the individual years is shown in the following table.

Year    Principal in £ at      Interest in £ at    Amount in £ at
        beginning of year      end of year         end of year
1       1500.00                 90.00              1590.00
2       1590.00                 95.40              1685.40
3       1685.40                101.12              1786.52
4       1786.52                107.19              1893.71
5       1893.71                113.62              2007.33

If the interest had been simple instead of compound, the principal would have grown to the amount 
£1950 after 5 years. 
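
The year-by-year growth can be reproduced with a short Python sketch. It is an illustration only: the function name is an assumption, and so is the rule of rounding interest and amount to two decimal places at the end of every year, which is what leads to the amount 2007.33.

```python
def compound_table(principal, rate_percent, years):
    """Rows (year, principal at start, interest, amount at end), rounded every year."""
    rows = []
    for year in range(1, years + 1):
        interest = round(principal * rate_percent / 100, 2)
        amount = round(principal + interest, 2)
        rows.append((year, principal, interest, amount))
        principal = amount          # the amount becomes next year's principal
    return rows

rows = compound_table(1500.00, 6, 5)
for row in rows:
    print("%d  %8.2f  %7.2f  %8.2f" % row)

simple_amount = 1500 + 1500 * 6 * 5 / 100   # simple interest, for comparison
print(simple_amount)
```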


Figure 6.3-1 shows the growth of the principal $P_0 = 1$ at simple interest and at compound
interest if r = 6%. From the compound interest formula both the number of years n and the
percentage rate of interest r can be calculated. Taking logarithms of both sides of the formula one
finds $\lg P_n = \lg P_0 + n \lg R$ or $n = (\lg P_n - \lg P_0)/\lg R$; thus, one can calculate the number of
years n. Taking the nth root of both sides one finds $R = \sqrt[n]{P_n/P_0} = 1 + r/100$, and from this
the percentage rate of interest is given by
$r = 100 [\sqrt[n]{P_n/P_0} - 1]$.


Example: In how many years does a principal of £ 500
double itself at 10% per annum compound interest?

Besides the principal $P_0 = 500$ and the amount $P_n = 1000$,
the growth factor R = 1.10 is given. The number of years
n is therefore given by n = (lg 1000 - lg 500)/lg 1.10 = 7.27 ...
After about 7 years the principal has doubled.
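
Both rearrangements of the compound interest formula are easily evaluated by machine. The following Python sketch is an illustration only; the function names are assumptions introduced here.

```python
import math

def years_needed(p0, pn, rate_percent):
    """n = (lg Pn - lg P0) / lg R with R = 1 + r/100"""
    growth = 1 + rate_percent / 100
    return (math.log10(pn) - math.log10(p0)) / math.log10(growth)

def rate_needed(p0, pn, years):
    """r = 100 * ((Pn / P0) ** (1/n) - 1)"""
    return 100 * ((pn / p0) ** (1 / years) - 1)

n = years_needed(500, 1000, 10)     # the doubling example
print(round(n, 2))
```

Since n lies between 7 and 8, the principal has not quite doubled after 7 full years and has more than doubled after 8.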


Discount. For a known amount $P_n$ and growth factor R the
initial principal $P_0$ can be calculated from the compound interest
formula: $P_0 = P_n/R^n = P_n \cdot V^n$, where V = 1/R. This process is
known as discounting, and V as the discount factor. One says
that the amount $P_n$ after n years is discounted to the present
value.


Example: A married couple intends to make a purchase to
the value of £ 750 in 5 years' time. A sum of money is to be
paid into a savings bank today (at 9% interest) in order to
ensure that the full amount will be available after 5 years.

For the calculation of the required sum, the amount after
5 years $P_5 = 750$, the rate of interest r = 9% and the number
of years n = 5 are known. From the table of discount factors
one reads off for n = 5 and r = 9 the value $V^5 = 0.6499$ and
calculates the present value $P_0 = 750 \cdot 0.6499 = 487.43$. At the present time £ 487.43 must be
paid into the savings bank.

Fig. 6.3-1 Growth of the principal $P_0 = 1$ at simple interest and at compound interest of 6% in 50 years


Interest factors

            Growth factor $R^n$          Discount factor $V^n$
Years       Rate of interest           Rate of interest           Years
n           7%      9%      11%        7%      9%      11%        n
1 1 
Z 2 
3 3 
4 4 
5 5 
6 6 
7 7 
8 8 
9 9 
10 10 
11 11 
12 12 
13 13 
14 14 
15 15 


Compound interest tables. For calculating the amount $P_n$ or the present value $P_0$ banks use
so-called compound interest tables, in which the powers $R^n$ of the growth factor R and the powers $V^n$
of the discount factor V are given for different rates of interest r and different numbers of years n.
The growth and discount of a sum $P_0$ with growth factor R and discount factor V can be represented
graphically with the help of a time line (Fig.).
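
The entries of such a table are simply the powers of R = 1 + r/100 and V = 1/R, so they can be recomputed when needed. The following Python sketch is an illustration only; the function name and the layout of the printout are assumptions, while the rates 7%, 9%, 11% and the range of 15 years follow the table of interest factors.

```python
def interest_factors(rates=(7, 9, 11), max_years=15):
    """Map each rate r to a list of (n, R**n, V**n) with R = 1 + r/100 and V = 1/R."""
    table = {}
    for r in rates:
        growth = 1 + r / 100
        table[r] = [(n, growth ** n, growth ** -n) for n in range(1, max_years + 1)]
    return table

table = interest_factors()
for n in range(1, 16):
    cells = ["%7.4f" % table[r][n - 1][1] for r in (7, 9, 11)]    # growth factors
    cells += ["%7.4f" % table[r][n - 1][2] for r in (7, 9, 11)]   # discount factors
    print("%2d " % n + " ".join(cells))

v5 = round(table[9][4][2], 4)      # discount factor for n = 5 and r = 9
present_value = 750 * v5           # the purchase example, about 487.43
print(v5, present_value)
```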




Example: What is the rate of interest r if a sum of £ 400 grows
to £ 787 in 10 years?
The principal $P_0 = 400$, the amount $P_n = 787$, and the number
of years n = 10 are known. The rate of interest can be calculated
either with the help of compound interest tables, by calculating
$R^{10} = 787/400 = 1.9675$ and reading off the rate of
interest r = 7 in the table, or from the formula

$r = 100 \cdot [\sqrt[10]{787/400} - 1] = 7$.

Fig. 6.3-2 Representation of the growth and discount of a sum $P_0$ with the help of a time line

6.4. Annuities 


The most important form of income from investment is the annuity, that is, a sequence of payments 
at previously determined points of time over a certain number of years. The individual payments 
of agreed amounts, called instalments, are usually paid at the end of each of the time intervals 
considered (in arrears) and seldom at the beginning (in advance). The amount of an annuity is the 
sum to which the instalments would accumulate at the end of the period if they were invested at 
r% compound interest at once on being received. The present value is the sum of money that has
to be paid at the beginning of the period if the amount of the annuity is to be secured by a single 
payment. 


Annuity payable in arrears. The instalments b payable at the end of each year accumulate at
compound interest. After n years the first instalment has the value $bR^{n-1}$, the 2nd the value $bR^{n-2}$,
and so on, and the last (b) has just been paid. Consequently the total amount $s_n$ is the sum of a
geometric series:

$s_n = b + bR + \cdots + bR^{n-1} = b \cdot (R^n - 1)/(R - 1)$.

An annuity of £ 1500 payable in arrears has the amount after 5 years at 11% of

$s_5$ = £ 1500 · $(1.11^4 + 1.11^3 + 1.11^2 + 1.11 + 1)$ = £ 9341.7.
To calculate the present value a, the amount $s_5$ has to be discounted to the time zero; with the
discount factor V = 1/1.11 = 0.9009, one obtains $a = s_5 V^5 = s_5 \cdot 0.5935 = 5544$. In general:
$a = s_n V^n = [b(R^n - 1)]/[R^n(R - 1)]$.


Example 1: An annuity of £ 2000 is to be paid in arrears for 11 years. What is the amount
of the annuity at 11% compound interest? For the growth factor R = 1.11 one reads off
$R^{11} = 3.1518$ from the table of interest factors. Thus, $s_{11}$ = 2000 · (3.1518 - 1)/(1.11 - 1)
= 39123.6. The amount at the end of the 11th year is £ 39123.6.

Example 2: What sum of money will secure the annuity of Example 1? With the help of
the table of interest factors one obtains $a = s_{11} \cdot V^{11}$ = 39123.6 · 0.3173 = 12414. The present
value of the annuity is £ 12414.

Example 3: After how many years does an annuity of £ 1000 at 11% compound interest,
payable in arrears, have the amount £ 18000? From the formula $s_n = b \cdot (R^n - 1)/(R - 1)$ one
obtains $s_n(R - 1)/b = R^n - 1$ and $R^n = s_n(R - 1)/b + 1$, so that $n = \lg [s_n(R - 1)/b + 1]/\lg R$.

With the numerical values $s_n$ = 18000, b = 1000 and R = 1.11 this gives n = lg 2.98/lg 1.11
≈ 10.463. The amount is reached after about 11 years.
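
The formulas for an annuity payable in arrears can be evaluated directly. The following Python sketch is an illustration only; the function names are assumptions. Because it works with unrounded powers of R rather than four-figure table values, its results for Examples 1 and 2 differ from the printed ones in the last places.

```python
import math

def amount_in_arrears(b, rate_percent, n):
    """s_n = b * (R**n - 1) / (R - 1)"""
    growth = 1 + rate_percent / 100
    return b * (growth ** n - 1) / (growth - 1)

def present_value_in_arrears(b, rate_percent, n):
    """a = s_n * V**n with V = 1/R"""
    growth = 1 + rate_percent / 100
    return amount_in_arrears(b, rate_percent, n) / growth ** n

def years_to_amount(s_n, b, rate_percent):
    """n = lg(s_n * (R - 1) / b + 1) / lg R"""
    growth = 1 + rate_percent / 100
    return math.log10(s_n * (growth - 1) / b + 1) / math.log10(growth)

print(round(amount_in_arrears(1500, 11, 5), 1))        # the £ 1500 annuity over 5 years
print(round(amount_in_arrears(2000, 11, 11), 1))       # Example 1
print(round(present_value_in_arrears(2000, 11, 11)))   # Example 2
print(round(years_to_amount(18000, 1000, 11), 3))      # Example 3
```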

Annuity payable in advance. For an annuity payable in advance each instalment b will be subject
to interest for one year longer. From the compound interest formula the amount $\bar{s}_n$ of the annuity
is therefore R times as much as the annuity payable in arrears:
$\bar{s}_n = Rb \cdot (R^n - 1)/(R - 1)$.

Example 1: For the data of Example 1 in the previous section, but for an annuity payable
in advance, the amount is given by $\bar{s}_{11}$ = 2000 · 1.11 · (3.1518 - 1)/(1.11 - 1) = 43427.2. After
11 years the amount is £ 43427.2.

Example 2: The present value of the annuity of Example 1, payable in advance, is obtained by
discounting this amount over 11 years, $\bar{a} = \bar{s}_{11} V^{11} = b V^{10} (R^{11} - 1)/(R - 1)$; with $V^{10} = 0.3522$ it is
given by $\bar{a}$ = 2000 · 0.3522 · (3.1518 - 1)/(1.11 - 1) = 13779. Thus, the sum of £ 13779 will purchase
the annuity.

Example 3: For the data of Example 3 above, but for an annuity payable in advance,

n = lg [(18000 · 0.11)/(1000 · 1.11) + 1]/lg 1.11 = 9.81.

The amount is reached after about 10 years.
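
The same calculations for an annuity payable in advance differ only by the factor R. The following Python sketch is an illustration only; the function names are assumptions, and the unrounded powers of R again give results that differ slightly from those obtained with four-figure table values.

```python
import math

def amount_in_advance(b, rate_percent, n):
    """R times the amount in arrears: R * b * (R**n - 1) / (R - 1)"""
    growth = 1 + rate_percent / 100
    return growth * b * (growth ** n - 1) / (growth - 1)

def present_value_in_advance(b, rate_percent, n):
    """The amount in advance discounted over n years."""
    growth = 1 + rate_percent / 100
    return amount_in_advance(b, rate_percent, n) / growth ** n

def years_to_amount_in_advance(s_n, b, rate_percent):
    """n = lg(s_n * (R - 1) / (b * R) + 1) / lg R"""
    growth = 1 + rate_percent / 100
    return math.log10(s_n * (growth - 1) / (b * growth) + 1) / math.log10(growth)

print(round(amount_in_advance(2000, 11, 11), 1))              # Example 1
print(round(present_value_in_advance(2000, 11, 11)))          # Example 2
print(round(years_to_amount_in_advance(18000, 1000, 11), 2))  # Example 3
```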


Repayment of a loan. The repayment of credits and mortgages, for example, for investments or
new buildings, is usually arranged in such a way that a yearly fixed sum, the annual instalment,
is paid, which does not vary for the duration of the loan. This is composed of two parts, the interest
and the repayment of principal. The course of the repayments for a credit of £ 10000 at 7% compound
interest can be seen from the following repayment scheme, in which the annual instalment is fixed
at £ 1000.


Year    Debt at        Interest     Repayment       Annual        Debt at
        beginning      payment      of principal    repayment     end of year
        of year
1       10000.00       700.00       300.00          1000.00       9700.00
2        9700.00       679.00       321.00          1000.00       9379.00
3        9379.00       656.53       343.47          1000.00       9035.53
4        9035.53       632.49       367.51          1000.00       8668.02
5        8668.02       606.76       393.24          1000.00       8274.78

It is seen that as the period of the loan progresses, the interest payment decreases and the repayment 
of principal increases. 

This repayment scheme is clearly based on the following rules. At the end of the first year interest
of S · (r/100) has to be paid on a loan S. Thus, out of the annual instalment A there remains a sum
$T_1 = A - S(r/100)$ for repayment of principal. At the end of the second year interest of $S_1 \cdot (r/100)$
is due for the smaller loan $S_1 = S - T_1$; out of the annual instalment A there remains a sum
$T_2 = A - S_1 \cdot (r/100) = A - S \cdot (r/100) + T_1 \cdot (r/100) = T_1 + T_1 \cdot (r/100) = T_1 R$ for repayment
of principal. A sum $S_2 = S_1 - T_2$ is still to be repaid. At the end of the third year the interest is
$S_2 \cdot (r/100)$ and the repayment of principal $T_3 = A - S_2 \cdot (r/100) = A - S_1 \cdot (r/100) + T_2 \cdot (r/100)$
$= T_2 R$. At the end of the nth year the repayment of principal is $T_n = T_{n-1} R = T_{n-2} R^2 = \cdots = T_1 R^{n-1}$.

In the example quoted the repayment of principal in the eleventh year is $T_{11}$ = £ 300 · $1.07^{10}$
= £ 590.16.

The sum $s_n$ of the repayments of principal of the first n years corresponds to the amount after
n years of an annuity payable in arrears, $s_n = T_1 \cdot (R^n - 1)/(R - 1)$. In the example, after 11 years
$s_{11}$ = £ 300 · $(1.07^{11} - 1)/(1.07 - 1)$ = £ 4734.90. The loan will have been paid off when the
total repayment of principal is equal to the loan: $s_n = S$ or $T_1 \cdot (R^n - 1)/(R - 1) = S$.

From this equation one can calculate the number of years n after which the loan will have been
discharged: $n = \lg [(S/T_1)(R - 1) + 1]/\lg R$; in the example, n = 17.79, or 18 in round numbers.

The debt is discharged after 18 years.
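
The repayment scheme and the formula for the number of years can be reproduced as follows. The Python sketch is an illustration only: the function names are assumptions, and so is the rounding of every entry to two decimal places.

```python
import math

def repayment_schedule(loan, rate_percent, instalment, years):
    """Rows (year, debt at start, interest, repayment of principal, debt at end)."""
    rows, debt = [], loan
    for year in range(1, years + 1):
        interest = round(debt * rate_percent / 100, 2)
        principal_part = round(instalment - interest, 2)
        end_debt = round(debt - principal_part, 2)
        rows.append((year, debt, interest, principal_part, end_debt))
        debt = end_debt
    return rows

def years_to_discharge(loan, rate_percent, instalment):
    """n = lg((S / T1) * (R - 1) + 1) / lg R with T1 = A - S * r / 100"""
    growth = 1 + rate_percent / 100
    first_principal = instalment - loan * rate_percent / 100
    return math.log10(loan / first_principal * (growth - 1) + 1) / math.log10(growth)

rows = repayment_schedule(10000.00, 7, 1000.00, 5)
for row in rows:
    print("%d  %9.2f  %7.2f  %7.2f  %9.2f" % row)

n = years_to_discharge(10000.00, 7, 1000.00)
print(round(n, 2), math.ceil(n))
```

The repayments of principal in successive rows grow by the factor R = 1.07, as the rule $T_n = T_1 R^{n-1}$ requires.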




Life insurance. A further field of application of compound interest and annuity calculations 
consists of the different forms of life insurance. One distinguishes, among others, between death 
and accident insurance, sickness and old-age insurance and contributory pensions. In all these in- 
surances the insurance companies enter into a contract, called the insurance contract, with the insured 
person. Between the payments of the person taking out the insurance and the commitments of the 
insurance company, which depend, among other things, on the type of insurance, an equivalence 
must exist, so that the insurance contract does not lead to a loss for any of those concerned. Naturally 
the equivalence does not hold for the individual insured, in which case the insurance would be 
superfluous, but only for the totality of all insured persons. Consequently, in the mathematical
formulation of the equivalence principle, besides compound interest and annuity calculations, 
demographic assumptions play a role. 

Mortality tables. The most important demographic aid consists of the mortality tables. These
tables, which are drawn up partly on the basis of population census and partly by experience over
many years of the insurance company, start from a definite (but arbitrarily chosen) number $l_n$ of
persons of the same age, namely $n$-year olds, and state how many of these reach their $x$th year.
This number is denoted by $l_x$ and is called the number of survivors to the age $x$. In addition the
following relations are contained in the table: $l_x - l_{x+1} = d_x$ is the number of those dying at the
age $x$, $p_x = l_{x+1}/l_x$ is the probability of surviving of the $x$-year old, $q_x = d_x/l_x$ is the probability
of dying of the $x$-year old and $e_x = (1/l_x) \sum_{k \ge 0} l_{x+k} - 1/2$ is the mean life expectancy.


Figure 6.4-1 is a section of a typical general mortality table. One reads in this that of 100000 
males of the same age born say 50 years ago, 89104 are still alive, and that for each of these the 
mean life expectancy is 24.01 years. 


6.4-1 General mortality table

Columns, given once for each of two groups of 100000 live-born at the same time: $l_x$, number still
living after $x$ completed years of age; $d_x$, number dying at the age $x$; $q_x$, probability of dying;
life expectancy in years of all those still living (total) and of each one still living ($e_x$).

 x |  l_x   | d_x |      q_x      |   total   |  e_x  |  l_x   | d_x |      q_x      |   total   |  e_x
46 | 90 859 | 332 | 0.003 653 149 | 2 499 889 | 27.51 | 93 612 | 241 | 0.002 579 387 | 2 926 801 | 31.27
47 | 90 527 | 345 | 0.003 812 489 | 2 409 196 | 26.61 | 93 370 | 248 | 0.002 660 978 | 2 833 310 | 30.35
48 | 90 182 | 447 | 0.004 954 396 | 2 318 841 | 25.71 | 93 122 | 342 | 0.003 672 863 | 2 740 064 | 29.42
49 | 89 736 | 632 | 0.007 041 835 | 2 228 882 | 24.84 | 92 780 | 468 | 0.005 044 604 | 2 647 114 | 28.53
50 | 89 104 | 801 | 0.008 988 733 | 2 139 462 | 24.01 | 92 312 | 543 | 0.005 883 410 | 2 554 568 | 27.67
51 | 88 303 | 849 | 0.009 609 744 | 2 050 759 | 23.22 | 91 768 | 547 | 0.005 959 474 | 2 462 528 | 26.83
52 | 87 454 | 807 | 0.009 226 817 | 1 962 881 | 22.45 | 91 222 | 525 | 0.005 753 766 | 2 371 033 | 25.99
53 | 86 647 | 795 | 0.009 180 458 | 1 875 830 | 21.65 | 90 697 | 522 | 0.005 758 713 | 2 280 074 | 25.14
54 | 85 852 | 870 | 0.010 138 656 | 1 789 581 | 20.85 | 90 174 | 546 | 0.006 059 439 | 2 189 638 | 24.28
55 | 84 981 | 990 | 0.011 647 731 | 1 704 164 | 20.05 | 89 628 | 586 | 0.006 536 162 | 2 099 737 | 23.43
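
The relations between the columns of a mortality table can be illustrated with a few lines of Python. This is a minimal sketch with hypothetical function names; the survivor numbers are those of the first group for the ages 46 to 51, and the three-row table `toy` is invented purely to show the life expectancy formula on a table that runs to an age at which nobody survives.

```python
def dying(l, x):
    """d_x = l_x - l_(x+1): number of those dying at the age x."""
    return l[x] - l[x + 1]

def survival_probability(l, x):
    """p_x = l_(x+1) / l_x."""
    return l[x + 1] / l[x]

def death_probability(l, x):
    """q_x = d_x / l_x."""
    return dying(l, x) / l[x]

def life_expectancy(l, x):
    """e_x = (1/l_x) * (sum of l_(x+k) for k >= 0) - 1/2.

    Only meaningful if the table is complete, that is, if it runs to an
    age at which no one is left alive.
    """
    return sum(n for age, n in l.items() if age >= x) / l[x] - 0.5

# Survivors l_x of the first group, ages 46 to 51 (taken from the table).
survivors = {46: 90859, 47: 90527, 48: 90182, 49: 89736, 50: 89104, 51: 88303}

# Invented complete table: of 100 persons, 50 die in each of two years.
toy = {0: 100, 1: 50, 2: 0}

print(dying(survivors, 46), death_probability(survivors, 50))
print(life_expectancy(toy, 0))
```

Small differences between computed and tabulated probabilities are to be expected, since the printed survivor numbers are rounded.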




7. Plane geometry 


7.1. Points, lines, rays and segments
     Points and lines
     Rays and segments
     Parallel and orthogonal lines
7.2. Angles
     Classification of angles
     Units of measurement of angles
     Angles between intersecting lines
     Angles between two parallels and a line
     Constructions of angles
7.3. Symmetry
     Axial symmetry
     Central symmetry
     Basic constructions
7.4. Triangles
     The parts of a triangle, classification of triangles
     Basic facts about triangles
     Congruence of triangles
     Transversals and distinguished points of a triangle
7.5. Quadrilaterals
     Parallelograms
7.6. Polygons
     General polygons
     Regular convex n-gons
7.7. Figures bounded by straight lines
     Measurement of area
     Mensuration of simple figures
     Area theorems for a right-angled triangle
     Transformation of areas
7.8. Similarity
     The concept of similarity
     The intercept theorems
     Theorems on similarity
     Division of a segment
7.9. Circles
     Notation
     Theorems on angles in circles
     Theorems on tangents to a circle
     Computations for a circle
     Theorems on chords, secants and tangents
     Quadrilaterals of chords and tangents
7.10. Geometric loci
7.11. Planimetric treatment of conic sections
     The ellipse
     The hyperbola
     The parabola


Plane geometry is that part of geometry (Greek, measuring the earth) which deals with two- 
dimensional figures. Although we live in a three-dimensional world, the study of plane geometry 
can deepen our insight into some of the properties of our surroundings. 

Just as the notion of number was abstracted from the visible world, so also the concepts at the 
basis of geometry were gained by a process of abstraction extending over many centuries. By ignoring 
inessential differences, for example, of mass, colour, form or surface texture, and disregarding 
further irregularities of real objects one arrived at spatial forms in three dimensions: length, breadth 
and height. Accordingly we say that a solid body has three dimensions, but a surface only two, a 
line, for example, an edge of intersection of two surfaces, one dimension, and finally a point, regarded 
as the intersection of two lines, has the dimension zero. 

In plane geometry a plane is always taken as given. Geometric investigations are, in general, 
carried out within this plane, but in individual cases it is advantageous to consider also Euclidean 
space (EUCLID, Greek mathematician, about 300 B. C.) as a basic geometric object containing the 
given plane. 


7.1. Points, lines, rays and segments 


Points and lines 


Points and lines (more accurately, straight lines) are the basic concepts of elementary plane 
geometry. Intuitively, a line is often defined as the path of a point that moves in a plane in such a 
way that it always takes the shortest route between any two of its positions and does not change 
direction; even in a more rigorous approach no definition of lines and points is given. But in modern 
mathematics the relationships between these two kinds of geometric objects are fixed by axioms 
(see Chapter 40.). 


Number of intersections of several lines. Two lines in a plane have at most one point in common
unless they coincide (that is, have all points in common). Two lines in a plane that have no points
in common are called parallel.




If three lines in a plane do not all contain a common point, and if no two of them are parallel
or coincide, then there are exactly three points of intersection between pairs of these lines.

Four lines of which no two coincide or are parallel and of which no three have a point in common 
have exactly six points of intersection (see Complete quadrilateral). 

If $n$ lines in a plane are given such that no two coincide or are parallel and no three have a common
point, then each line has $n - 1$ intersections with the others; since each intersection is counted
twice, the total number of points of intersection is $n(n - 1)/2$.


Number of lines through several points. There is exactly one line through any two distinct points. 
If three points are not collinear (that is, do not all lie on the same straight line), then there are three 
lines, each containing two of them. These three points, or any two of the three lines, or one of the 
lines and the point not on it completely determine the plane in which they lie. 

If $n$ distinct points in a plane are given such that no three are collinear, then every point lies on
a single line through each of the other $n - 1$. As each line is counted twice, there are $n(n - 1)/2$
possible lines; thus, if $n = 4$, there are six lines (see Complete quadrangle).
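
Both counting formulas can be confirmed by brute force. The Python sketch below is an illustration only; the particular families used (the lines y = mx + m² for integer slopes m, and the points (i, i²) on a parabola) are assumptions made here because they are easily seen to be in general position, and all names are hypothetical.

```python
from itertools import combinations
from math import gcd

def intersection_points(n):
    """Distinct intersection points of the n lines y = m*x + m*m, m = 0, ..., n-1.

    Lines with slopes m1 != m2 meet at (-(m1 + m2), -m1*m2); different pairs
    give different points, so no three of these lines are concurrent.
    """
    return {(-(m1 + m2), -m1 * m2) for m1, m2 in combinations(range(n), 2)}

def line_through(p, q):
    """Normalised integer coefficients (a, b, c) of a*x + b*y + c = 0 through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    c = -(a * x1 + b * y1)
    g = gcd(gcd(abs(a), abs(b)), abs(c))
    a, b, c = a // g, b // g, c // g
    if a < 0 or (a == 0 and b < 0):  # fix the sign so that equal lines compare equal
        a, b, c = -a, -b, -c
    return (a, b, c)

def lines_through_points(n):
    """Distinct lines joining pairs of the points (i, i*i); no three are collinear."""
    points = [(i, i * i) for i in range(n)]
    return {line_through(p, q) for p, q in combinations(points, 2)}

for n in range(2, 7):
    print(n, len(intersection_points(n)), len(lines_through_points(n)), n * (n - 1) // 2)
```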


Pencils of lines. There are infinitely many lines through any point in the plane. The set of all 
lines in the plane going through a single given point is called a pencil of lines. The point that is 
common to all the lines is called the carrier of the pencil. An upper case letter in brackets such as 
(P) is used to denote the pencil with the carrier P. By analogy, the set of all lines in the plane parallel 
to a given line is called a pencil of parallels. 

If the lines of a pencil (P) or of a pencil of parallels are intersected by two straight lines $l_1$ and $l_2$
that do not belong to the pencil, then the lines of the pencil induce a perspective map of all the points
of $l_1$ onto those of $l_2$.


Rays and segments 


Rays. A ray (or half-line) contains precisely those points of a line that lie on one side of a point O 
of that line; the point O is included in the ray. In other words, the ray contains those points of the 
line that can be reached by travelling along the line, starting at O, in a particular direction (without 
reversing!). The concept of a ray was obtained, like all mathematical concepts, by abstraction. It 
is intuitively easy to grasp if one thinks of a ray of light emitted by the sun, or a line of sight, which 
is straight and is bounded by the eye of the observer (one idealizes the sun or the eye to a single 
point). 


Segments. A segment (or interval) AB contains precisely those points of the line through A and B
that lie between A and B; A and B themselves are included in the segment. The segment is the
shortest path in the plane connecting its two end-points. If it is important to emphasize direction,
the symbol $\overrightarrow{AB}$ is used to signify that the segment is directed from A to B. To avoid the somewhat
cumbersome use of arrows it is agreed that the segment is directed from the first named point to
the second, that is, the first named point is the initial point, and the second the terminal point, of
a directed segment. A further convention is that a minus sign indicates the reversal of the direction
of the segment. Thus, one can replace $\overrightarrow{AB} = \overleftarrow{BA}$ by $AB = -BA$. The length |AB| of the segment
AB is the distance between A and B. It is measured by comparing it with another segment, the
unit segment. The length of the unit segment serves as the unit of measure for lengths.

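Units of length. The basic unit of length is the metre. From 1960 to 1983 it was defined as 1650763.73
wavelengths of the orange line in the spectrum of the krypton isotope 86, measured in vacuo; since
1983 it has been defined as the distance travelled by light in a vacuum in 1/299 792 458 of a second.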

Two of the admissible multiples of the metre, the decametre (1 dam = 10 metres) and the
hectometre (1 hm = 100 metres) have not found general use.




Units in U.K. and U.S.A.
inch   1 in = 1″ = 0.0254 m
foot   1 ft = 1′ = 12″ = 0.3048 m
yard   1 yd = 3′ = 0.9144 m
mile   1 statute mile = 1 mi = 5280′ = 1609.344 m
       1 imperial nautical mile = 6080′ = 1853.18 m ≈ 1 meridian minute
       1 US nautical mile = 1852 m
1 m  ≈ $6.214 \cdot 10^{-4}$ statute miles = 1.094 yd = 3.281′ = 39.37″
1 km ≈ 0.6214 statute miles
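
All the entries above follow from the single relation 1 in = 0.0254 m. The following Python sketch, with hypothetical constant and function names, recomputes them as a check.

```python
# Lengths of the units in metres, all derived from 1 in = 0.0254 m.
INCH = 0.0254
FOOT = 12 * INCH
YARD = 3 * FOOT
STATUTE_MILE = 5280 * FOOT
IMPERIAL_NAUTICAL_MILE = 6080 * FOOT

def from_metres(length_in_metres, unit_in_metres):
    """Express a length given in metres as a multiple of another unit."""
    return length_in_metres / unit_in_metres

print("1 ft =", round(FOOT, 4), "m")
print("1 mi =", round(STATUTE_MILE, 3), "m")
print("1 m  =", round(from_metres(1, YARD), 3), "yd")
print("1 km =", round(from_metres(1000, STATUTE_MILE), 4), "statute miles")
```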




Parallel and orthogonal lines 


Parallel lines. Two parallels have no point in common. If a straight line $l$ is parallel to a line $l'$
(notation: $l \parallel l'$), then $l$ and $l'$ have the same direction. One can obtain $l'$ from $l$ by moving all the
points of $l$ through a segment equal to $\overrightarrow{PP'}$ (such a transformation is called a translation). This is
the basis of the construction of parallels by ruler and set square (Fig.). If one connects a point of a
line $l$ to all the points of a parallel line $l'$ by straight line segments, then among those segments
there is a shortest one. The length of this segment defines the distance $d$ between the two parallel
lines. It is the same whichever point of $l$ is chosen; in other words, parallels do not come closer or
move further apart.


7.1-1 Construction of a parallel $l'$ to $l$ at a distance $a$ by shifting (translation)


7.1-2 The distance between parallel surfaces of 
a work-piece (measurement of thickness) 


The normal distance between the parallel rails of a railway line (the gauge) is 1.435 m (56½ in.);
in the Soviet Union it is 1.524 m, and in Spain and Portugal it is 1.670 m. The parallel jaws of a 
pair of calipers can be moved to measure the breadth of objects with parallel sides (Fig.). 

Every line $l$ has exactly one parallel $l'$ on each side at any given distance $d$. The construction of
$l'$ with ruler and set-square is shown in the figure (see Fig. 7.1-1). If $l$ is a line and P′ a point not
on $l$, then there is exactly one line $l'$ through P′ parallel to $l$; it lies in the plane through P′ and $l$.
This statement, the existence and uniqueness of a parallel through P′, is the parallel axiom (or postulate)
of Euclidean geometry.


Orthogonal lines. The distance between parallel lines $l$ and $l'$ is measured along a segment AA′
that meets $l$ in A and $l'$ in A′ in pairs of equal angles. These angles are called right angles. Orthogonal
lines are lines that intersect at right angles; they are also called perpendicular to one another. Ortho- 
gonality or parallelism are properties of two lines relative to one another; the lines need not be 
vertical or horizontal. 


7.2. Angles 


Two rays $a$ and $b$ with the same initial point S can be made to coincide by a rotation about S,
which determines the angle between $a$ and $b$ [notation $\angle(a, b)$ or $(a, b)$].
An orientation of the plane is given by fixing the direction of the rotation. In mathematics the
positive orientation is chosen as the one corresponding to an anti-clockwise rotation; in surveying
the clockwise rotation is taken to be positive. If the direction of rotation is important, one must
distinguish between the angles $\angle(a, b)$ and $\angle(b, a)$: $\angle(a, b) = -\angle(b, a)$. If A is a point on $a$ and
B is a point on $b$, the angle $\angle(a, b)$ is usually written as $\angle ASB$. S is called the vertex
of the angle, the rays are called its arms. Each arm defines a
particular direction and the angle is a measure of the difference of these directions in the oriented
plane (Fig.).

7.2-1 Definition of an angle


Classification of angles 


Angles are classified by the amount of the difference of the directions of their arms. If the angle 
corresponds to a rotation through a quarter of the full circle, it is called a right angle; a straight 
angle corresponds to a rotation through half of a full circle. If the inclination of the arms is less 
than a right angle, the angle is called acute; if the inclination is greater than a right angle but less 
than a straight angle, the angle is called obtuse; if the inclination is greater than a straight angle, 
the angle is called reflex; a reflex angle whose arms coincide is a full angle. 




7.2-2 Angles: acute angle, right angle, obtuse angle, straight angle, reflex angle, full angle


Units of measurement of angles 


All methods of measuring angles are based on division of the circle (Fig.). There are two types 
of unit, based on measurement by degrees and by arc length, respectively. 


Degrees. If a circle is divided into 360 equal parts by radii, the angle between two neighbouring 
radii is called a degree (notation: 1°). Thus, a degree is one 360th of a full angle or one 90th of a 
right angle. The degree is divided into sixty minutes (notation: 1′) and the minute into sixty seconds
(notation: 1″). Although the subdivisions of a degree have
the same names as those of the hour, it is important that 
the notation should distinguish between units of angle and 
units of time. 


In geodesy there is also a system of grades, where the 
circle is divided into four hundred parts and the right 
angle has 100 parts. Each of these is one gon or one grade 
and is divided decimally (notation: 1 gon). 


7.2-3 Protractor with scale; $\alpha = 57^\circ$, $a = 43.7$ mm
For a right angle the following equations hold: 


1 right angle = 90° = 5400’ = 324000” = 100.0000 gon. 


Example 1: To transform 62° 48′ 15″ into gons.
48′ = 48°/60 = 0.8°; 15″ = 15°/3600 = 0.004167°,
62° 48′ 15″ = 62.804167° = 62.804167 · 100 gon/90 = 69.7824 gon.


Example 2: To transform 135.4682 gon into degrees, minutes and seconds.
135.4682 gon = 100 gon + 35.4682 gon = 90° + 35.4682 · 90°/100 = 121.92138°,
0.92138° = 0.92138 · 60′ = 55.2828′;
0.2828′ = 0.2828 · 60″ = 16.968″,
135.4682 gon ≈ 121° 55′ 17″.
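
The two examples can be reproduced with a short Python sketch. The function names are hypothetical; rounding to whole seconds is a choice made here to match the presentation of Example 2.

```python
def dms_to_degrees(degrees, minutes=0, seconds=0):
    """Decimal degrees from degrees, minutes and seconds."""
    return degrees + minutes / 60 + seconds / 3600

def degrees_to_gon(degrees):
    """A right angle is 90 degrees = 100 gon."""
    return degrees * 100 / 90

def gon_to_degrees(gon):
    return gon * 90 / 100

def degrees_to_dms(degrees):
    """Degrees, minutes and seconds, rounded to the nearest whole second."""
    total_seconds = round(degrees * 3600)
    d, rest = divmod(total_seconds, 3600)
    m, s = divmod(rest, 60)
    return d, m, s

# Example 1: 62 degrees 48 minutes 15 seconds in gons.
print(round(degrees_to_gon(dms_to_degrees(62, 48, 15)), 4))
# Example 2: 135.4682 gon in degrees, minutes and seconds.
print(degrees_to_dms(gon_to_degrees(135.4682)))
```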


Arc units. In a circle the length of an arc $a$ between two radii is proportional to the angle $\alpha$ between
them and to their length $r$. The following ratio holds if the angle is measured in degrees:

$a : 2\pi r = \alpha : 360^\circ$.

From this it follows that the ratio between the lengths of the arc and the radius depends only on the
angle at the centre subtended by the arc:

$a/r = (\alpha/360^\circ) \cdot 2\pi = 2\pi\alpha/360^\circ = (\pi/180^\circ) \cdot \alpha$.
Thus, if the radius of a circle is known, the length of an arc on the circumference can be used to
measure the corresponding angle at the centre. The arc of an angle is therefore defined as the quotient

$a/r = (\pi/180^\circ)\,\alpha = \widehat{\alpha} = \operatorname{arc} \alpha$; $a = r \cdot \widehat{\alpha} = (\pi/180^\circ)\,\alpha \cdot r$.

The unit of this measurement is called a radian (notation: 1 rad). 1 rad is the angle at the centre
subtended by an arc of length equal to the radius of the circle:
$1\ \text{rad} = 57.29578^\circ = 57^\circ 17' 44.8'' \approx 57^\circ 17' 45''$ (Fig.).

$a/r = \widehat{\alpha} = \operatorname{arc} \alpha$; $\alpha = \widehat{\alpha} \cdot 180^\circ/\pi$




The unit circle is the circle with radius equal to one unit of length. Thus,
for the unit circle the measure of an angle in radians is the length of the
corresponding arc on the circumference.

7.2-4 1 rad is the unit of radian measure
Survey of angles and measures

        | acute angles               | right angle | obtuse angles | straight angle | reflex angles | full angle
degrees | 30°, 45°, 57° 17′ 45″      | 90°         |               | 180°           | 270°          | 360°
gons    | 33⅓ gon, 50 gon, 63.6620 gon | 100 gon   |               | 200 gon        | 300 gon       | 400 gon
radians | π/6, π/4, 1                | π/2         |               | π              | 3π/2          | 2π
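
The relations between degrees, gons and radians used in this survey can be verified numerically. The following Python sketch uses hypothetical function names and only the standard `math` module.

```python
import math

def degrees_to_radians(alpha):
    """Arc measure of an angle given in degrees: (pi/180) * alpha."""
    return math.pi / 180 * alpha

def radians_to_degrees(arc):
    """Angle in degrees belonging to a given arc measure: arc * 180/pi."""
    return arc * 180 / math.pi

def radians_to_gon(arc):
    """A full angle is 2*pi rad = 400 gon."""
    return arc * 200 / math.pi

def arc_length(r, alpha):
    """Length of the arc subtending the angle alpha (in degrees) in a circle of radius r."""
    return r * degrees_to_radians(alpha)

print(round(radians_to_degrees(1), 5))   # one radian in degrees
print(round(radians_to_gon(1), 4))       # one radian in gons
print(arc_length(1, 360) / math.pi)      # circumference of the unit circle over pi
```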


Angles between intersecting lines 
There are four angles between two intersecting lines of a plane. One distinguishes between adjacent 
angles and opposite angles. 


Adjacent angles. Angles between two intersecting lines having their vertex S and one arm in
common are called adjacent. The other arms are the two
opposite rays from S along one of the lines, so that the angles
combine to a straight angle; for example, in the figure
$\alpha + \beta = \beta + \gamma = \gamma + \delta = \delta + \alpha = 180^\circ$. Two adjacent angles
(for example $\alpha$ and $\beta$) need not be equal. If they are, then each
must be half of 180°, that is, a right angle. This fact was
exploited in the definition of orthogonal lines and is the basis of
the following definition.


A right angle is any one of the four angles between two lines that intersect so that adjacent
angles between them are equal.

7.2-5 Adjacent angles, for example, $\alpha$ and $\beta$, and vertically opposite angles, for example, $\alpha$ and $\gamma$


Any two angles whose sum is 180°, whether they are adjacent or not, are called supplementary. 
Angles whose sum is 90° are called complementary. Adjacent angles are supplementary, but sup- 
plementary angles are adjacent only if they have an arm in common. 


Vertically opposite angles. Two angles between intersecting lines with a common vertex but no
common arm are called vertically opposite, or simply opposite if no confusion is to be feared. Since
two opposite angles have no arm in common, the directions of the arms of one must be opposite
to those of the other. In the figure $\alpha$ and $\gamma$, and $\beta$ and $\delta$ are opposite angles.


Opposite angles are equal, because the sum of each with a common adjacent angle is 180°. 


Angles between two parallels and a line 
If a pair of parallels is intersected by a third line (Fig.), they form eight angles, which fall into two
sets of four equal angles each, for example, $\alpha = \gamma = \alpha' = \gamma'$ and $\beta = \delta = \beta' = \delta'$.


7.2-6 Angles between parallels and a straight line $l$ intersecting them

7.2-7 Examples of angles between parallels and a line intersecting them: corresponding angles,
alternate angles, pair of exterior and interior angles




Pairs of angles with a common vertex are opposite or adjacent. They are opposite if their arms are
opposed, for example $\alpha = \gamma$, or $\beta' = \delta'$. They are adjacent if they have one arm in common and their
other arms are opposed, for example, $\alpha + \delta = 180^\circ$, or $\gamma + \delta = 180^\circ$.

Pairs of angles with distinct vertices are classified as follows (Fig.): 


1. Corresponding angles: the arms of one are directed in the same sense as those of the other,
for example, $\alpha = \alpha'$, or $\gamma = \gamma'$.

2. Alternate angles: the arms of one are directed in the opposite sense to those of the other; for
example, $\alpha = \gamma'$, or $\gamma = \alpha'$.

3. Interior and exterior angles: the arms on the parallels are directed in the same sense and those
on the intersecting line are opposite. If these arms contain the segment between the parallels, the
angles are interior, otherwise exterior; for example, $\alpha + \delta' = 180^\circ$, exterior angles; and $\gamma + \beta' = 180^\circ$,
interior angles.


If two parallel lines are intersected by a third line, then corresponding angles are equal, alternate 
angles are equal, and pairs of interior or exterior angles are supplementary. 


These theorems have converses, which state that if two lines are intersected by a third and if two 
corresponding or alternate angles are equal, or two interior or exterior angles are supplementary, 
then the two lines are parallel. 

The relative position of the three classes of angles can be described even without assuming that 
the two lines are parallel. Alternate angles are then defined as lying on opposite sides of the inter- 
secting line, and either both between the two lines or both outside them. The other types lie on the 
same side of the intersecting line. Exterior angles are both outside the two lines, interior angles both 
between them, and of two corresponding angles one is inside and one is outside. 


Constructions of angles 


Set squares. The usual models of set squares have a right angle and angles of 45° or 60° and 30°. 
Other angles can be constructed by adding or subtracting these angles (Fig.). Bisecting with a ruler 
(or set square) and compass gives further angles, for example 22.5°; 15°; 7.5°; and the angles ob- 
tained by adding or subtracting these new angles. In adding or subtracting angles it is sometimes 
necessary to shift the position of a previously constructed angle. A method of accomplishing this is 
described in the following paragraph. 


7.2-8 Construction of special angles with set-squares, $\varphi_1 = 75^\circ$, $\varphi_2 = 15^\circ$


Application of angles. It is always possible to draw an angle equal to a given angle, but in a
different position, by using ruler and compass only. Suppose, for example, that it is required to apply
an angle $\alpha$ to a directed line $g$ at a vertex P (Fig.). For the construction one draws two circles of the
same radius about P (on $g$) and the vertex S of $\alpha$. The circles intersect the arms of $\alpha$ at A and B
and the line $g$ at A′. The arc around A′ with radius |AB| intersects the circle at B′. The ray from P
to B′ is the free arm of the angle $\alpha$ that has been applied to $g$ at P. If $\alpha$ is not given by a drawing
but only by a measurement, say $\alpha = 52^\circ$, one uses a protractor.


7.2-9 Application of an angle 


Construction of angles by ruler and compass. Only exceptional angles can be constructed by ruler
and compass, among them 120°, 90° and 72°, which result from the ruler and compass construction
of the equilateral triangle, square and regular pentagon (see Regular convex n-gons). Continued
bisections of these lead to 60°, 30°, 15°, 45°, 36°, 18° and 9° (to mention only angles of a whole 
number of degrees). By adding 15° and 9° one obtains 24°, and hence 12°, 6° and 3°. Thus, all 
multiples of 3° can be constructed by ruler and compass. In fact, they are the only constructible 
angles of a whole number of degrees. 




7.3. Symmetry 


Axial symmetry 

Every plane E is divided into two half-planes by any straight line s in E (Fig.). A rotation in space 
of 180° about s maps each of these half-planes onto the other. Every point S of the axis s is its own 
image, S = S’, the axis is a fixed line. Furthermore, any line AS forms with s the same angle as its 
image A’S, and AS and A’S are equal in length. The line AA’ connecting a point to its image is 
perpendicular to s and is bisected by s. Every figure F is congruent to its image F’. 

Reflecting the half-plane in a mirror perpendicular to E and intersecting it in s has the same effect 
as the operation just discussed. F and its image F’ are therefore called mirror images or reflections 
of each other, and the map can be called a reflection in a line. The technical term is, however, axial 
symmetry. It is a geometric map, or transformation, and is uniquely determined by any point P and 
its image P’ (if distinct from P) or by the axis of symmetry s. 


7.3-1 Axial symmetry

7.3-2 Two reflections can be replaced by a shift or a rotation


If P and P′ are given, one can find the points $S_i$ on the symmetry axis using the fact that $|PS_i| = |P'S_i|$,
by intersecting circles of equal radius about P and P′. On the other hand, if $s$ is given, then the image
P′ of an arbitrary point P has the same distance from $s$ as P, and PP′ is perpendicular to $s$.

In the figure, though the mirror image F′ of F with respect to the axis $s_1$ is congruent to F, if E
is oriented, then it has the opposite orientation to F. The mirror images F″ and F‴ of F′ with respect
to further axes $s_2$ and $s_3$ have the same orientation as F. While F and F′ retain their opposite
orientation under any transformation of the plane, F and F″ and F and F‴ are congruent and equally
oriented. In particular, if $s_1 \parallel s_3$, then F‴ can be obtained from F by a translation or shift; if the axes
$s_1$ and $s_2$ intersect, then F″ can be obtained from F by a rotation of the plane.
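
The behaviour of orientation under one and two reflections can be observed on a concrete triangle. The Python sketch below is an illustration under assumptions: the triangle F and the axes s1 (the line x = 0), s2 (the line y = x) and s3 (the line x = 2) are chosen here for convenience and are not those of the figure; the function names are hypothetical.

```python
def reflect(p, a, b):
    """Mirror image of the point p in the line through the points a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    fx, fy = ax + t * dx, ay + t * dy      # foot of the perpendicular from p
    return (2 * fx - px, 2 * fy - py)

def mirror(figure, a, b):
    """Mirror image of a figure given as a list of points."""
    return [reflect(p, a, b) for p in figure]

def signed_area(triangle):
    """Positive for an anti-clockwise triangle, negative for a clockwise one."""
    (x1, y1), (x2, y2), (x3, y3) = triangle
    return ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)) / 2

F = [(1.0, 0.0), (3.0, 0.0), (1.0, 1.0)]
s1 = ((0.0, 0.0), (0.0, 1.0))   # axis x = 0
s2 = ((0.0, 0.0), (1.0, 1.0))   # axis y = x, intersects s1 in the origin
s3 = ((2.0, 0.0), (2.0, 1.0))   # axis x = 2, parallel to s1

F1 = mirror(F, *s1)             # one reflection: orientation reversed
F2 = mirror(F1, *s2)            # s1 then s2: a rotation about the origin
F3 = mirror(F1, *s3)            # s1 then s3: a translation

print(signed_area(F), signed_area(F1), signed_area(F2), signed_area(F3))
print(F3)
```

In this configuration the translation shifts every point by twice the distance between the parallel axes, and the rotation turns the figure through twice the angle between the intersecting axes.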


Axially symmetric figures. If certain segments or points of a figure lie on the line s that is chosen 
as the axis of symmetry, then the figure and its mirror image together create an axially symmetric 
figure, that is, a figure consisting of two parts that are symmetric to each other (Fig.). 


7.3-3 Figures with an axis of symmetry


Central symmetry 

Apart from rotating a plane by 180° through space about a line, one can also consider a rotation 
of 180° in the plane about a point. Figures that can be made to coincide in this way are called cen- 
trally symmetric to each other, and the point is called the centre of symmetry (Fig.). 

Under a central symmetry every point B of the plane is mapped to a point B’ such that the segment 
between them has the centre of symmetry S as its mid-point. This transformation is also called 
reflection in a point. Like every other rotation, it is a congruence transformation, that is, the size 
and shape of figures are not changed by the transformation. In contrast to axial symmetry, a point
symmetry preserves the orientation of figures, so that a figure and its image under a point symmetry
are congruent and equally oriented. The images under two consecutive reflections in the same point
are coincident with the original figure. The only fixed point under a central symmetry is the centre of
symmetry.

7.3-4 Central symmetry

7.3-5 Radial symmetry of regular polygons

Figures that can be made to coincide with themselves by a rotation through an angle $\varphi$ about
a point P are called radially symmetric. Central symmetry is the special case of radial symmetry
corresponding to $\varphi = 180^\circ$. All regular polygons are radially symmetric (Fig.).
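
Central and radial symmetry can be tested numerically on regular polygons. This Python sketch is only an illustration with hypothetical function names; points are compared up to a small tolerance because rotations are computed in floating point.

```python
import math

def reflect_in_point(b, s):
    """Image of b under central symmetry with centre s: s is the mid-point of b and its image."""
    return (2 * s[0] - b[0], 2 * s[1] - b[1])

def rotate(p, centre, angle_in_degrees):
    """Rotation of p about centre through the given angle (anti-clockwise)."""
    t = math.radians(angle_in_degrees)
    x, y = p[0] - centre[0], p[1] - centre[1]
    return (centre[0] + x * math.cos(t) - y * math.sin(t),
            centre[1] + x * math.sin(t) + y * math.cos(t))

def regular_polygon(n, radius=1.0):
    """Vertices of a regular n-gon centred at the origin."""
    return [(radius * math.cos(2 * math.pi * k / n),
             radius * math.sin(2 * math.pi * k / n)) for k in range(n)]

def same_point_set(ps, qs, tol=1e-9):
    """True if every point of ps has a partner in qs within the tolerance."""
    return len(ps) == len(qs) and all(any(math.dist(p, q) < tol for q in qs) for p in ps)

origin = (0.0, 0.0)
pentagon, hexagon = regular_polygon(5), regular_polygon(6)

# Rotation through 360/5 degrees maps the pentagon onto itself (radial symmetry).
print(same_point_set([rotate(p, origin, 72) for p in pentagon], pentagon))
# The hexagon is centrally symmetric, the pentagon is not.
print(same_point_set([reflect_in_point(p, origin) for p in hexagon], hexagon))
print(same_point_set([reflect_in_point(p, origin) for p in pentagon], pentagon))
```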


Basic constructions 


Construction of the mid-point of a segment. This is effected by constructing the axis of the
symmetry that carries the end point A of AB into B.

About each of the points A and B as centres draw circles of equal radius greater than half the
length of AB. These intersect in points $S_1$ and $S_2$ on the symmetry axis. Therefore the line $s$ through
$S_1$ and $S_2$ intersects AB in its mid-point and is perpendicular to AB; it is called the perpendicular
bisector of AB (Fig.).
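
The construction can be imitated in coordinates. The following Python sketch, with hypothetical names and an arbitrarily chosen segment and radius, computes the two intersection points of the circles and confirms that the line through them passes through the mid-point of AB and is perpendicular to AB.

```python
import math

def circle_intersections(a, b, radius):
    """Intersection points S1, S2 of the circles of equal radius about a and b."""
    d = math.dist(a, b)
    if radius <= d / 2:
        raise ValueError("the radius must be greater than half the length of AB")
    mx, my = (a[0] + b[0]) / 2, (a[1] + b[1]) / 2        # mid-point of AB
    h = math.sqrt(radius ** 2 - (d / 2) ** 2)            # distance of S1, S2 from AB
    nx, ny = -(b[1] - a[1]) / d, (b[0] - a[0]) / d       # unit vector perpendicular to AB
    return (mx + h * nx, my + h * ny), (mx - h * nx, my - h * ny)

A, B = (0.0, 0.0), (4.0, 0.0)
S1, S2 = circle_intersections(A, B, 2.5)

midpoint_of_S = ((S1[0] + S2[0]) / 2, (S1[1] + S2[1]) / 2)
dot = (S2[0] - S1[0]) * (B[0] - A[0]) + (S2[1] - S1[1]) * (B[1] - A[1])

print(S1, S2)            # the two points on the symmetry axis
print(midpoint_of_S)     # lies on AB, at its mid-point
print(dot)               # zero: S1S2 is perpendicular to AB
```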


7.3-7 Bisecting an angle 


Construction of the bisector of an angle. Angles are also
bisected by using the properties of axially symmetric points.
An arc of arbitrary radius is drawn about the vertex S of
the angle $\angle(a, b)$. It intersects the arms of the angle in A and
B. The perpendicular bisector of AB is the symmetry axis of
the figure and bisects the angle. Since S is on this axis, it is
only necessary to construct one further point on it. Such a
point, for example $S_2$, is found by intersecting two arcs of
equal radius about A and B (Fig.).

Unlike bisection, trisection of an angle with ruler and 
compass alone is possible only in exceptional cases. 


7.3-6 Bisecting a segment 


Construction of perpendiculars. To erect the perpendicular
on a line $l$ in a point P on that line, draw a circle with P
as centre of arbitrary radius $r$ (Fig.). It intersects $l$ in A and
B. One draws circles about A and B with equal radii greater
than $r$. One of their points of intersection, say $S_1$, is
connected to P. The line $S_1P$ is the required perpendicular.

7.3-8 Erecting a perpendicular to a line in a point on it




If P is the mid-point of a segment AB, the perpendicular in P to AB can be constructed by the 
method given above for bisecting a segment. 


To drop a perpendicular. To drop the perpendicular to a line $l$ from a point P not on $l$ one draws
a circle with P as centre of sufficiently large radius. This intersects $l$ in A and B. The perpendicular
bisector of AB is the required line (Fig.).


7.3-9 Dropping a perpendicular to a line through a point not on it

7.3-10 Drawing a parallel to a line


Construction of parallels. To draw a parallel to a line $l$ with ruler and compass, one first erects
the perpendiculars to $l$ in two points A and B on $l$. On these one marks two points A′ and B′
equidistant from $l$ and on the same side of $l$. The line through A′ and B′ is parallel to $l$ (Fig.).


Ruler and compass constructions in general. Euclid founded his plane geometry on a system of 
axioms which ensure that it is always possible (1) to draw a line through any two given points, 
and (2) to draw a circle whose radius is the distance between two given points and whose centre is 
a given point. 

Thus, his theorems were always proved by a technique that reduces their statements to axioms 
or basic theorems on the intersections of lines with lines or circles, or of circles with other circles. 
Consequently, the only means admitted for construction in Euclidean geometry are those for drawing 
straight lines (ruler), and circles (compass). The requirement for a construction to be made with 
ruler and compass alone is therefore connected with the choice of axioms for plane geometry, and 
not with the accuracy of the result. Indeed, the result is frequently more accurate if the construction 
is made by other means. 

Furthermore, since the ancient Greeks had only rudimentary computational and algebraic 
techniques at their disposal, they tried to solve all major mathematical problems by constructions 
with ruler and compass. For instance, they found square roots by constructing the geometric mean 
of two segments. 

Three famous problems, in particular, proved intractable by this method:

the trisection of an arbitrary angle,
the squaring of the circle: the construction of a square whose area is equal to that of a given
circle, and
the doubling of the cube (the Delian problem): the construction of a cube with double the volume
of a given cube.


Modern methods have proved that none of the three problems can be solved by ruler and compass
alone (see Chapter 16.2. Galois theory).


7.4. Triangles 


The parts of a triangle, classification of triangles 


If three points in a plane, not on a single straight line, are
given, then there are exactly three lines joining them. The 
closed figure formed by these lines (or rather the segments 
between the points) is called a triangle. The points are the 
three vertices of the triangle, the segments between them are 
its sides. A triangle is convex: this means that with any 
two points it also contains the segment between them. The 
vertices are usually labelled A, B and C, the sides with the 
lower case letters corresponding to the opposite vertex, 
that is, |AB| by c, |BC| by a, and |CA| by b. Any two sides
of a triangle form the arms of an interior angle of the triangle. The interior angles are labelled by a
sequence of vertices, or by the lower case Greek letter corresponding to their vertex (Fig.). Thus,


∢CAB = α, ∢ABC = β, ∢BCA = γ.


7.4-1 The triangle ABC


If a side is extended, the angle its extension makes with the following side is called an exterior
angle of the triangle. Thus, α’ is the angle between CA produced beyond A and AB, β’ is the angle
between AB produced beyond B and BC, and γ’ is the angle between BC produced beyond C and
CA. The whole triangle is written △ABC. The triangle is called positively oriented if the direction
of rotation is AB → BC → CA.

A triangle is called isosceles if two sides (a) are equal; the third side is called the base (c), and the
opposite vertex is called the apex. In an equilateral triangle all three sides are equal.

A triangle is called acute if all interior angles are acute, right-angled if one interior angle is a right 
angle, and obtuse if one is obtuse. In a right-angled triangle the side opposite the right angle is called 
the hypotenuse (Fig.). 


7.4-2 Types of triangles: acute, right-angled, obtuse, isosceles, equilateral


Basic facts about triangles 


Relations between the sides. From any vertex it is possible to reach any other vertex by traversing 
the sides in two different ways: either by going along the connecting side or along the other two, 
via the third vertex (for example, from A to B along c or along b and a via C). Since the straight 
line is the shortest distance between two points this implies that c < a + b, b < a + c, and a < b + c,
from which six further inequalities can be deduced by subtraction: c − a < b or a − c < b,
c − b < a or b − c < a, and b − a < c or a − b < c. But only three of these make sense geometrically,
since the difference between two segments must have non-negative length.


In a triangle the sum of any two sides is greater than the third side. In a triangle any side is greater 
than the difference between the other two sides. 


For instance, it is possible to construct a triangle with sides of lengths 3, 4 and 5 units u; but there 
is no triangle with sides measuring 3u, 4u and 8u, because 3u + 4u < 8u, and thus the sum of two 
sides would be less than the third. 
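
The test described above can be sketched in a few lines of Python. The function name is_triangle is only an illustrative choice, and the lengths are assumed to be given in a common unit.

```python
def is_triangle(a: float, b: float, c: float) -> bool:
    """Return True if three positive lengths satisfy all three strict triangle inequalities."""
    return (a > 0 and b > 0 and c > 0
            and a < b + c and b < a + c and c < a + b)


print(is_triangle(3, 4, 5))  # the sides 3u, 4u, 5u from the text
print(is_triangle(3, 4, 8))  # the sides 3u, 4u, 8u from the text
```

The second call fails because 3 + 4 is less than 8, exactly as in the example above.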


Relations between the angles. If a parallel g to a side is drawn through the opposite vertex (for
example, a parallel to c = |AB| through C), then this gives a straight angle at C which is subdivided
by two sides of the triangle. The two parallels g and c are intersected
by a and b. Thus, the alternate angles are equal: δ = α and
ε = β. Now δ + γ + ε = 180° and so α + β + γ = 180° (Fig.).

The sum of the interior angles of a triangle is 180°.

Since every exterior angle is adjacent to an interior angle the
following equations hold (Fig.):
α + α’ = 180°, β + β’ = 180°, γ + γ’ = 180°.
Adding both sides of the equations one obtains
(α + β + γ) + (α’ + β’ + γ’) = 540°.
Since α + β + γ = 180°, it follows that α’ + β’ + γ’ = 360°.

The sum of the exterior angles of a triangle is 360°.


7.4-3 α + β + γ = 180°


Now every exterior angle is supplementary to the adjacent interior angle. But it has been proved 
that the sum of the interior angles is also 180°. Hence: 

Every exterior angle is equal to the sum of the two opposite interior angles: α’ = β + γ, β’ = γ + α,
γ’ = α + β.
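
The three angle theorems can be checked numerically for any concrete triangle. The following sketch assumes the vertices are given as Cartesian coordinates (a representation not used in the text) and computes the angles from scalar products; the helper name interior_angles is hypothetical.

```python
import math


def interior_angles(A, B, C):
    """Interior angles (alpha, beta, gamma) in degrees at the vertices A, B, C."""
    def angle_at(P, Q, R):
        # angle at P between the rays PQ and PR, from the scalar product
        v = (Q[0] - P[0], Q[1] - P[1])
        w = (R[0] - P[0], R[1] - P[1])
        cos_t = (v[0] * w[0] + v[1] * w[1]) / (math.hypot(*v) * math.hypot(*w))
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    return angle_at(A, B, C), angle_at(B, C, A), angle_at(C, A, B)


alpha, beta, gamma = interior_angles((0, 0), (5, 0), (1, 3))
exterior = [180 - t for t in (alpha, beta, gamma)]  # each is supplementary to its interior angle

print(alpha + beta + gamma)                    # sum of the interior angles
print(sum(exterior))                           # sum of the exterior angles
print(math.isclose(exterior[0], beta + gamma)) # alpha' = beta + gamma
```

Up to rounding, the first two printed values are 180 and 360.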




7.4-4 Angles with pairwise orthogonal arms


From these theorems it is easy to deduce the following statement, which is particularly useful in 
applications to physics (Fig.). 
Angles with pairwise orthogonal arms are equal if the vertex of one does not lie inside the arms 
of the other, nor on one of them. If the vertex lies inside them or on one of them, then the angles 
are supplementary. 


Relations between angles and sides. Let ABC be a triangle in which a > b. Let the bisector w_γ
of the angle γ intersect |AB| = c in the point D. If the small triangle ADC is reflected in w_γ, then
the image of b is part of a, so that the image A’ of A lies between B and C (Fig.). A’ is the vertex of
the angle ∢CA’D = α’, which is equal to α = ∢CAD (by reflecting in w_γ).

Now α’ is an exterior angle of the triangle DBA’ and so is equal to the sum of β and ∢BDA’, and, in
particular, α’ is greater than β. Since α = α’, and hence α > β, this proves that a > b
implies that α > β. The converse can also be proved: if α > β, then a > b.

7.4-5 In the triangle ABC a > b implies that α > β


In a triangle the angle opposite the longer of two sides is greater than the angle opposite the other. 
The side opposite the greater of two angles is longer than the side opposite the other. Angles opposite 
equal sides are equal and vice versa. Every isosceles triangle is symmetrical. The perpendicular from 
the apex to the base bisects the base and the angle at the apex. The angles at the base are equal. 

In a right-angled triangle the acute angles are complementary. In an isosceles right-angled triangle, 
the base angles are 45°. 

In an equilateral triangle the interior angles are all equal; each is 60°.

Equilateral triangles have three axes of symmetry. 

If one of the angles of a right-angled triangle is 30°, then the opposite side is half the hypotenuse. 


The last theorem is a consequence of the symmetries of the equilateral triangle and is frequently
used; for instance, set-squares are usually right-angled triangles of this type or isosceles right-angled 
triangles. 


Congruence of triangles 


General remarks. Plane figures are called congruent if they are of the same shape and the same 
size. Congruent figures can be carried into each other by a transformation that moves points, but 
does not change incidence relations (between points and lines), angles between lines, and lengths 
of segments. Such a transformation also preserves areas and leaves parallel lines parallel. If congruent 
figures have the same orientation (with respect to some fixed orientation of the plane), they can be 
transformed into each other by a sequence of translations and rotations of the plane. Such figures 
are called directly congruent. If they do not have the same orientation, then a sequence can be found
taking one into the other which, apart from successive translations and rotations, contains a single
reflection in a straight line. Such figures are called inversely congruent. Translations, rotations, and
reflections are called congruence transformations and can be used as criteria of congruence in the
investigation of plane figures; but this by no means exhausts their usefulness as a tool for discovering
new geometrical facts.


Four theorems on congruence of triangles. In the definition of congruence it is required that the 
figures agree in all aspects, in particular, that the lengths of corresponding sides and the angles 
between them are equal. The theorems of this section state that for triangles in certain cases it is 
sufficient to test three parts as a check for congruence — if they are equal for two triangles, then the 
triangles are congruent. Here are the theorems. 




1. Two triangles are congruent if the length of a side of one is equal to the length of the corresponding
side of the other and two angles of one are equal to the corresponding angles of the other (α, s, α).

2. Two triangles are congruent if the lengths of two sides of one are equal to the lengths of the
corresponding sides of the other and the angles between these sides are equal (s, α, s).

3. Two triangles are congruent if the lengths of two sides of one are equal to the lengths of the
corresponding sides of the other and the angles opposite the larger sides are equal (s, s, α).

4. Two triangles are congruent if the lengths of the three sides of one are equal to the lengths of 
the corresponding sides of the other (s, s, s). 


If one tries to construct triangles with three given sides and angles, one sees that the triangle is
uniquely determined if, and only if, the data correspond to one of the theorems. On the other hand,
if two sides and the angle opposite the smaller side are given, say a = 3 units, c = 5 units, and
α = 20°, then the figure shows that there are two possible triangles with these measurements. For if
a line is drawn at 20° to the segment |AB| = c and an arc is drawn about B with radius a, then this
arc intersects the line in two points C’ and C’’. Both the triangles ABC’ and ABC’’ satisfy the
requirements of the construction. Again, if α = 80°, then the arc of radius a does not intersect the
free arm of α at all, and there is no triangle that fits the requirements (Fig.).

7.4-6 These triangles agree in three data, but are not congruent
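
The number of triangles that fit such data can also be found by calculation. The sketch below is an illustration under one assumption not made in this section: it uses the law of cosines to express the condition |BC| = a for a point C on the free arm of α, which gives a quadratic equation for b = |AC|. The function name ssa_solutions is hypothetical.

```python
import math


def ssa_solutions(a: float, c: float, alpha_deg: float) -> list:
    """Possible lengths b = |AC| when the sides a, c and the angle alpha at A are given."""
    alpha = math.radians(alpha_deg)
    # C lies on the free arm of alpha at distance b from A and at distance a from B:
    # b^2 - 2*b*c*cos(alpha) + c^2 - a^2 = 0   (law of cosines, an added assumption)
    p = c * math.cos(alpha)
    disc = p * p - (c * c - a * a)
    if disc < 0:
        return []  # the arc about B does not reach the free arm
    root = math.sqrt(disc)
    return sorted({b for b in (p - root, p + root) if b > 0})


print(len(ssa_solutions(3, 5, 20)))  # angle opposite the smaller side
print(len(ssa_solutions(3, 5, 80)))  # arc too short
print(len(ssa_solutions(5, 3, 20)))  # angle opposite the larger side (theorem 3)
```

The three calls correspond to the two-triangle case, the impossible case, and the uniquely determined case of the third congruence theorem.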

In constructions using the first congruence theorem it is convenient first to obtain the two angles
adjoining the given side. If one of them is not given in the data, it can be found by the theorem on
the sum of the interior angles of a triangle. For instance, if c, α and γ are given, β = 180° − α − γ
can be computed. Then the triangle can be constructed simply by drawing lines at the appropriate
angles through the end-points of the given segment. Alternatively, the computation can be avoided
by drawing a line at angle γ to the free arm of α through an arbitrary point C’ on it. The parallel to
this line through B is then the third side of the triangle ABC.


Transversals and distinguished points of a triangle 


7.4-7 No triangle with these data can be constructed 


A transversal is any line that intersects the triangle. 


Perpendicular bisectors. The perpendicular bisector of a side is the line that intersects the side in 
its mid-point and makes a right angle with the side. 


The perpendicular bisectors of the sides of a triangle intersect in a single point M, the circum- 
centre of the triangle. 


The points on the perpendicular bisector of a segment are the points that are equidistant from its
end-points. Hence the intersection of two perpendicular bisectors, say m_a and m_b, is equidistant
from the end-points of the two sides BC and CA, that is, from the three vertices of the triangle. But
then it must also lie on the third perpendicular bisector m_c. In fact, M is the centre of the circle
through the vertices of the triangle. This circle is called the circumscribed circle or circumcircle of
the triangle and M is called the circumcentre (Fig.). The radius of the circumcircle is the distance
from M to any of the vertices of the triangle. In acute triangles the circumcentre lies inside the
triangle, the circumcentre of an obtuse triangle lies outside the triangle, and the circumcentre of a
right-angled triangle lies on the hypotenuse, at its mid-point. Thus, the hypotenuse is the diameter
of the circumcircle, and the vertices C of all right-angled triangles with hypotenuse AB must lie on
the circle with AB as diameter. The discovery of this fact is attributed to THALES of Miletus
(about 624-547 B.C.), and the circle is sometimes called Thales’ circle in his honour (Fig.).

7.4-8 Perpendicular bisectors and circumcircle


Theorem of Thales. The locus of the vertices C of all right angles whose arms go through two fixed
points A and B is the circle with AB as diameter.
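
The circumcentre can be computed as the point equidistant from the three vertices. The following sketch assumes Cartesian coordinates for the vertices and uses the standard closed formula obtained by intersecting two perpendicular bisectors; neither the coordinates nor the function name circumcentre come from the text.

```python
import math


def circumcentre(A, B, C):
    """Point equidistant from A, B and C (intersection of the perpendicular bisectors)."""
    ax, ay = A
    bx, by = B
    cx, cy = C
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:
        raise ValueError("the three points lie on a single straight line")
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return ux, uy


# Right angle at C: by the theorem of Thales M is the mid-point of the hypotenuse AB.
A, B, C = (0, 0), (4, 0), (2, 2)
M = circumcentre(A, B, C)
print(M, math.dist(M, A), math.dist(M, B), math.dist(M, C))
```

For this right-angled triangle the three distances agree and M coincides with the mid-point (2, 0) of AB.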


The circumcentre of an isosceles triangle lies on the axis of symmetry, which is the perpendicular 
bisector of the base. 

The mid-points A’, B’ and C’ of the sides of a triangle form the vertices of a smaller triangle lying 
inside the first (Fig.). The sides of the smaller triangle are parallel to the sides of the original. Thus, 
the perpendicular bisectors of the sides of ABC are also perpendicular to the sides of A’B’C’, and 
they go through the vertices of A’B’C’. They are the altitudes of A’B’C’. 


7.4-10 Perpendicular bisectors and altitudes

7.4-11 The altitudes of a triangle


Altitudes. The lines through the vertices of a triangle perpendicular to the opposite sides (or
their extensions) are called the altitudes of the triangle. In the figures their lengths are marked by
h_a, h_b and h_c.


The three altitudes of a triangle intersect in a single point, the orthocentre of the triangle. 


The orthocentre lies inside an acute triangle, outside an obtuse triangle, and in a right-angled 
triangle it is the vertex of the right angle. In an isosceles triangle the altitude through the apex is 
also the perpendicular bisector of the base; the two transversals both coincide with the axis of 
symmetry. 


The orthocentre can be used to solve the following problem: using only a restricted area of the
plane, say a sheet of paper, find the line through a given point H and the intersection C of two
non-parallel lines l_1 and l_2 that do not intersect in the permitted area.
The method is to drop perpendiculars from H to l_1 and l_2; each of
these intersects the other line, say in the points A and B. The perpendiculars
are altitudes of the triangle ABC. Therefore, the third
altitude must go through C and H, the orthocentre of ABC. This
third altitude is the required line. The construction is completed by
drawing the perpendicular c through H to AB (Fig.).
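
The construction can be followed step by step in coordinates. In this sketch each line is assumed to be given by a point and a direction vector, the two lines are assumed not to be perpendicular to each other, and all function names are hypothetical. The point C is never computed by the construction itself.

```python
def intersect(P, d, Q, e):
    """Intersection of the lines P + s*d and Q + t*e (assumed non-parallel)."""
    det = d[0] * e[1] - d[1] * e[0]
    s = ((Q[0] - P[0]) * e[1] - (Q[1] - P[1]) * e[0]) / det
    return (P[0] + s * d[0], P[1] + s * d[1])


def perp(v):
    """Vector perpendicular to v."""
    return (-v[1], v[0])


def direction_to_hidden_point(H, P1, d1, P2, d2):
    """Direction of the line through H and the intersection C of l_1 and l_2."""
    A = intersect(H, perp(d1), P2, d2)  # perpendicular from H to l_1, cut with l_2
    B = intersect(H, perp(d2), P1, d1)  # perpendicular from H to l_2, cut with l_1
    AB = (B[0] - A[0], B[1] - A[1])
    return perp(AB)                     # third altitude: through H, perpendicular to AB


# l_1 and l_2 meet in C = (10, 5), which is used only to check the result.
H = (2.0, 3.0)
direction = direction_to_hidden_point(H, (0.0, 0.0), (2.0, 1.0), (0.0, 8.0), (10.0, -3.0))
HC = (10.0 - H[0], 5.0 - H[1])
cross = direction[0] * HC[1] - direction[1] * HC[0]
print(direction, cross)
```

A vanishing cross product confirms that the constructed direction points from H towards C.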




Medians. The medians of a triangle connect the mid-points of
the sides to the opposite vertices. In the figures their lengths are
marked by s_a, s_b and s_c.


The three medians of a triangle intersect in a single point, the centre of gravity of the triangle.
Every median is divided in the ratio 2:1 by the centre of gravity, the longer part adjoining the
vertex.

7.4-12 Construction of a line through a point H and the inaccessible intersection of the
non-parallel lines l_1 and l_2

To prove this the two medians AD and BE have been drawn in the figure; their intersection is S.
The lines ED and AB intersect CA and CB and AD and EB. Since |CB| : |CD| = |CA| : |CE| = 2:1,
it can be proved (using the intercept theorems, or rather their valid converses) that AB and ED are
parallel and that the ratio of their lengths is 2:1. Therefore |SA| : |SD| = |SB| : |SE| = |AB| : |ED| = 2:1.
The same observations can be made with a different pair of medians, say CF and BE, and their point
of intersection must be S again, since there is only one point that divides |BE| in the ratio 2:1.
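
The 2:1 division can be verified numerically. The sketch below assumes Cartesian coordinates and uses the standard fact, not stated in this section, that the centre of gravity is the arithmetic mean of the three vertices; the function names are illustrative.

```python
import math


def midpoint(P, Q):
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)


def centre_of_gravity(A, B, C):
    """Arithmetic mean of the vertices (assumed formula for the intersection S of the medians)."""
    return ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)


A, B, C = (0, 0), (6, 0), (3, 9)
S = centre_of_gravity(A, B, C)
D = midpoint(B, C)  # foot of the median from A
E = midpoint(C, A)  # foot of the median from B

print(math.dist(A, S) / math.dist(S, D))  # ratio |SA| : |SD|
print(math.dist(B, S) / math.dist(S, E))  # ratio |SB| : |SE|
```

Both ratios come out as 2 up to rounding, with the longer part adjoining the vertex.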


7.4-13. The three medians of a triangle 
intersect in a common point S 


Angle bisectors. The bisectors of the angles of a triangle are marked w_α, w_β and w_γ in
the figures.


The three angle bisectors of a triangle intersect in a single point M, the incentre of the triangle.


The intersection M of w_α and w_β is equidistant from b and c (since w_α bisects α) and
from c and a (since w_β bisects β). So it is equidistant from a and b and hence must lie
on w_γ. This distance from the sides is the radius of the inscribed circle or incircle of
the triangle, which touches each side of the triangle in a single point. M is the incentre
of the triangle. The points D, E and F in which the incircle touches the sides are the
intersections of the perpendiculars from M to those sides (Fig.). The bisectors of the external
angles (or their opposite angles) intersect in the three centres M_a, M_b and M_c of the three
ecircles (or escribed circles) of the triangle. An ecircle touches one side and the extensions of
the other two sides in a single point each. In an isosceles triangle the bisector of the angle at
the apex coincides with the median, the perpendicular bisector of the base, and its altitude.
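
That the incentre is equidistant from the three sides can be checked in coordinates. The sketch below assumes the standard weighted-mean formula M = (aA + bB + cC)/(a + b + c), which is not derived in this section, and the helper names are hypothetical.

```python
import math


def distance_to_line(P, Q, R):
    """Perpendicular distance from the point P to the line through Q and R."""
    cross = (R[0] - Q[0]) * (P[1] - Q[1]) - (R[1] - Q[1]) * (P[0] - Q[0])
    return abs(cross) / math.dist(Q, R)


def incentre(A, B, C):
    """Incentre M and radius of the incircle of the triangle ABC."""
    a = math.dist(B, C)  # side opposite A
    b = math.dist(C, A)  # side opposite B
    c = math.dist(A, B)  # side opposite C
    p = a + b + c
    M = ((a * A[0] + b * B[0] + c * C[0]) / p,
         (a * A[1] + b * B[1] + c * C[1]) / p)
    return M, distance_to_line(M, A, B)


A, B, C = (0, 0), (4, 0), (0, 3)
M, rho = incentre(A, B, C)
print(M, rho)
print(distance_to_line(M, B, C), distance_to_line(M, C, A))
```

For the triangle with sides 3, 4 and 5 all three distances agree, so M is the centre of a circle touching every side.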


7.5. Quadrilaterals 
Generalities 


Sum of the interior angles of a quadrilateral. In projective geometry four points in a plane no 
three of which are collinear define a complete quadrangle, which has 6 sides. But in ordinary plane 
geometry a quadrangle has only 4 sides which depend on the sequence in which the points are given. 


7.4-14 The three escribed circles of a triangle 


7.5-1 A convex quadrilateral

7.5-2 Concave quadrilateral (right) and reflex quadrilateral (left)




If the points are A, B, C and D, then the segments |AB| = a, |BC| = b, |CD| = c, and |DA| = d
are the sides of the quadrangle or quadrilateral; the other two connecting segments, |AC| = e
and |BD| = f, the diagonals, are not counted among the sides. Usually the sequence chosen is that
which gives a convex quadrilateral, but it can be concave or even, as an extreme case, reflex.

If a quadrilateral is not reflex (Fig.), it has a diagonal that divides it into two triangles in such a
way that the sum of the interior angles of the two triangles is the sum of the interior angles of the
quadrilateral. To count the angles φ and ε as interior angles of a reflex quadrilateral can be justified
in the following manner. If in a non-reflex quadrilateral a direction vector pointing along one side
is moved to the end-point of the side and then rotated in a positive direction until it points along the
next side and if this is repeated until the vector returns to its starting point, it has rotated through
360°; in detail:

(180° − α) + (180° − β) + (180° − γ) + (180° − δ) = 360°,
or α + β + γ + δ = 360°. But with a reflex quadrilateral (Fig.) the corresponding equation is
(180° − α) + (180° − β) + (180° + γ) + (180° + δ) = 720°. This follows from the figure, and
this equation alone leads to the relation α + β + γ + δ + ε + φ = 360°, since φ = ε and in △ABS:
180° − α = β + ε and 180° − β = α + ε.


The sum of the interior angles of a quadrilateral is 360°. 
Classification of quadrilaterals. Certain convex quadrilaterals have special names (Fig.). 


7.5-3 Special types of quadrilaterals: trapezium, trapezoid, parallelograms (rhomboid, rhombus, rectangle, square)


Classification by lengths of sides:

general quadrilateral    all four sides of different lengths,
kite                     two pairs of equal adjacent sides,
parallelogram            two pairs of equal opposite sides,
rhombus                  all four sides equal.

Classification by relative positions of sides:

general quadrilateral    no pairs of parallel sides,
trapezium                one pair of parallel sides,
parallelogram            two pairs of parallel sides.


Parallelograms


General remarks. Every parallelogram has two pairs of equal opposite sides and two pairs of 
equal opposite interior angles. If all four sides are equal, the parallelogram is called a rhombus. 
Parallelograms with four equal angles are called rectangles, since then each angle is a quarter of 
360° or a right angle. If all four sides and all four angles are equal, the parallelogram is a square. 
A square is a rhombus with equal angles and a rectangle with equal sides. A parallelogram that is 
neither a rhombus nor a rectangle is sometimes called a rhomboid. The following theorem is a 
consequence of the congruence theorems for triangles (Fig.). 


In every parallelogram the diagonals bisect each other. 
Neighbouring interior angles are supplementary and opposite 
interior angles are equal. 


Symmetry properties. A rectangle has two axes of symmetry through the mid-points of opposite
sides, a rhombus has two through opposite vertices. Thus, a square, which is a rectangular rhombus,
has four axes of symmetry. In every parallelogram the point of intersection of the diagonals is a
centre of symmetry.

7.5-4 Parallelogram (rhomboid)


The diagonals of a parallelogram divide it into two pairs of congruent triangles.


The diagonals of a rectangle (and a square) have the same length. The diagonals of a rhombus
(and a square) intersect at right angles and bisect the interior angles. They divide the rhombus into
four congruent triangles.


Construction of parallelograms. Since each diagonal divides a parallelogram into two congruent 
triangles, three independent data are needed to determine it, just as for a triangle. For a general 
quadrilateral two further data are needed, that is, five altogether, because the two triangles are no 
longer congruent but have a side in common (the diagonal). For a rectangle or a rhombus only two 
data are needed, and for a square only one. As a rule, a square is given by the length of its side or 
diagonal, a rectangle by two different sides or a side and a diagonal. On the other hand, with the 
rhombus one of the data can be an angle, the other being a side or a diagonal. All constructions 
proceed by triangles in the figure, hence do not differ in principle from constructions of triangles. 


Trapezia 


A trapezium is a convex quadrilateral with (at least) one pair of parallel sides. If the non-parallel 
sides of a trapezium have the same length, it is called isosceles. A trapezium with two pairs of parallel 
sides is a parallelogram. The two parallel sides of a trapezium are called its base lines and the longer 
of the two is called its base. The segment connecting the mid-points E and F of the non-parallel 
sides is called the mid-line m of the trapezium (Fig.). The mid-line is parallel to the base lines 
|AB| = a and to |CD| = c, for otherwise the parallel to AB through E would intersect BC in 
a point F’. By the intercept theorems (with S as the carrier of the pencil) the equation 
|DE|: |EA| = |CF’|: |F’B| would then hold. But by the hypothesis, |DE| : |EA| = |CF|: |FB| = 1, 
and so F = F’, hence EF ∥ AB. If G is the intersection of the line through D and F with the
extension of AB, the triangles BGF and CDF are congruent. Therefore, |AG| = a + c, and by 
considering the triangle AGD one obtains m = (a + c)/2. 


The mid-line of a trapezium is parallel to the base lines and half as long as their sum; m = (a + c)/2
(arithmetic mean). 


7.5-5 Distance between the mid-points of the diagonals of a trapezium

7.5-6 Trapezium


The mid-line goes through the mid-points of the diagonals of a trapezium. 


The distance between the mid-points of the diagonals of a trapezium is half the difference of the 
lengths of the base lines; m’ = (a — c)/2. 


Let the base-lines of the trapezium be |AB| = a and |CD| = c, and let M_1 and M_2 be the
mid-points of the diagonals AC and BD. The parallels through each mid-point to the nearer side of the
trapezium intersect AB in F and G and CD in a single point E, because |DE| = c/2 = |CE| (Fig.).
Since |DE| = |AF| = c/2 = |CE| = |BG|, it follows that |FG| = a − c. Now M_1M_2 bisects the
sides of the triangle EFG and is therefore half as long as the base, so that |M_1M_2| = m’ = (a − c)/2.
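
Both mid-line formulas can be checked on a concrete trapezium. The coordinates below are an arbitrary example chosen for illustration, and the function name trapezium_midlines is hypothetical.

```python
import math


def midpoint(P, Q):
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)


def trapezium_midlines(A, B, C, D):
    """Return (m, m') for a trapezium ABCD with AB parallel to CD."""
    E, F = midpoint(D, A), midpoint(B, C)    # mid-points of the non-parallel sides
    M1, M2 = midpoint(A, C), midpoint(B, D)  # mid-points of the diagonals
    return math.dist(E, F), math.dist(M1, M2)


A, B, C, D = (0, 0), (10, 0), (7, 4), (2, 4)
a, c = math.dist(A, B), math.dist(C, D)
m, m_prime = trapezium_midlines(A, B, C, D)
print(m, (a + c) / 2)
print(m_prime, (a - c) / 2)
```

Each printed pair agrees, in line with m = (a + c)/2 and m’ = (a − c)/2.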


From the theorem on parallel lines it follows that the interior angles on each side of the trapezium 
are supplementary. 


The sum of the interior angles of a trapezium on each of the two non-parallel sides is 180°. 


To construct a trapezium it is sufficient to have four independent data, for an isosceles trapezium 
only three are needed. 


In an isosceles trapezium the diagonals have the same length and the angles at the base are equal. 


Kites and deltoids 
A convex quadrilateral with two pairs of equal adjacent sides is called a kite. 




The diagonals of a kite are perpendicular to one another; one 
of them is an axis of symmetry and divides the kite into two 
congruent triangles, the other divides it into two isosceles tri- 
angles (Fig.). A kite with four equal sides is a rhombus. 


A non-convex quadrilateral with two pairs of adjacent equal sides is a deltoid (Fig.). Just as with a
kite, the diagonals are perpendicular to one another and one is an axis of symmetry and divides the
deltoid into two congruent triangles. The other diagonal lies outside the deltoid and is the base of
two isosceles triangles whose other sides are the pairs of equal sides of the deltoid. The deltoid can
be used to construct the mid-point of a segment lying near the edge of a restricted area of the plane,
without going outside that area.
Three independent data are needed to determine a kite or a deltoid.

7.5-7 Kite (left) and deltoid (right)


7.6. Polygons 


General polygons 


Closed plane figures with straight edges are called polygons. They can be classified by the number 
of their vertices. Triangles and quadrilaterals are special types of polygons. If two non-adjacent 
sides of a polygon intersect, it is called reflex. If segments connecting any two points inside the 
polygon always lie inside that polygon, it is convex. In that case the interior angle at any vertex is 
less than 180°. Otherwise the polygon is called concave and has one or more inverted corners
where the interior angle exceeds 180° (Fig.).

7.6-1 Various types of polygon: convex, reflex, concave

The segments connecting neighbouring vertices of a polygon are called its sides; the other segments
connecting vertices are diagonals. The number of sides is equal to the number of vertices. In a polygon
with n vertices, an n-gon, each vertex is connected to n − 3 other vertices by diagonals, but the
diagonal from the kth vertex to the mth is the same as the one from the mth to the kth. It follows that
the number of diagonals of an n-gon is n(n − 3)/2. For n = 3 the formula states that a triangle has no
diagonals and for n = 4 it states that a quadrilateral has two. The diagonals from one vertex of an
n-gon divide it into n − 2 triangles, so that the sum of the interior angles of a convex n-gon is
(n − 2) · 180°. This formula gives 180° as the sum of the interior angles of a triangle and
2 · 180° = 360° as the sum of the interior angles of a quadrilateral.
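
The two counting formulas are easy to tabulate. In the following sketch the function names are illustrative; the enumeration simply counts all pairs of vertices that are not neighbours and serves as an independent check of n(n − 3)/2.

```python
from itertools import combinations


def diagonal_count(n: int) -> int:
    """Number of diagonals of an n-gon, n(n - 3)/2."""
    return n * (n - 3) // 2


def interior_angle_sum(n: int) -> int:
    """Sum of the interior angles of a convex n-gon in degrees, (n - 2) * 180."""
    return (n - 2) * 180


def count_diagonals_by_enumeration(n: int) -> int:
    """Count the pairs of vertices that are not joined by a side."""
    return sum(1 for i, j in combinations(range(n), 2)
               if (j - i) % n not in (1, n - 1))


for n in range(3, 9):
    print(n, diagonal_count(n), count_diagonals_by_enumeration(n), interior_angle_sum(n))
```

For every n listed the formula and the enumeration give the same number of diagonals.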


Regular convex n-gons 


A regular convex n-gon has n equal sides and n equal angles. A circle can be inscribed in it in
such a way that the sides of the regular n-gon are tangents to the circle, and another circle can be
circumscribed around the regular n-gon in such a way that the sides of the n-gon are chords of the
circle. Since the sum of the interior angles of a convex n-gon is (n − 2) · 180° and the interior angles
of a regular n-gon are equal, it follows that α = (n − 2) · 180°/n = 180° − 360°/n for each interior
angle. The radii r from the centre M of the circumscribed circle to the vertices of the regular
n-gon divide it into n congruent isosceles triangles. The angle at the apex of each of these is 360°/n,
hence this angle and r together determine the regular n-gon completely.


The regular hexagon. In this case the six congruent triangles are equilateral, and the lengths of 
all their sides are r. If three vertices, no two of which are neighbours, are connected, the result is 
an equilateral triangle of side r√3.


Sequences of regular n-gons. Conversely, given an equilateral triangle inscribed in a circle of
radius r, it is possible to find the vertices of the inscribed regular hexagon by intersecting the
perpendicular bisectors of the sides of the triangle with the circle. This method works for any number
of sides and makes it possible to construct the regular 2n-gon from the regular n-gon. Thus, beginning
with an equilateral triangle one can construct regular 6-, 12-, 24-, ..., 3 · 2^n-gons.

7.6-2 a) Regular hexagon, b) regular triangle, and c) regular tetragon


The regular tetragon. The regular tetragon or 4-gon 
is a square. Its corners are the intersections with the 
circumference of two mutually perpendicular dia- 
meters of a circle (Fig.). The method of bisecting 
sides yields the regular 8-, 16-, ..., 2^n-gons.


The regular decagon (10-gon). If two neighbouring vertices P and Q of the regular decagon inscribed
in a circle are connected to the centre M, then the angle at the centre is 36° and hence for the base
angles β of the resulting isosceles triangle one has β = (180° − 36°)/2 = 72° (Fig.). The bisector QR
of the angle ∢PQM divides the triangle into two isosceles triangles, PQR and QMR. For these
|PQ| = |QR| = |MR| = s_10, the side of the regular decagon. Further, |RP| = r − s_10 and the triangles
PQR and QMP are equiangular and thus similar. Hence r : s_10 = s_10 : (r − s_10). The positive solution
of the quadratic equation s_10² + r·s_10 = r² is s_10 = (r√5 − r)/2. A point X is said to divide a segment
AB in the golden section if |AB| : |AX| = |AX| : |XB|.

7.6-3 Regular decagon


If a regular decagon is inscribed in a circle of radius r, and r is divided in the golden section, then 
the larger of the two resulting segments is the side of the decagon. 


The regular decagon can be constructed by the following method. One draws the diameter AB of the
circle of radius r and constructs perpendicular to it the radius MD. Let H be the midpoint of AM.
By Pythagoras’ theorem |HD|² = |DM|² + |HM|² = r² + (r/2)² = (5/4) r². Thus, |HD| = r√5/2.
The arc with radius |HD| about H intersects MB in E. Then |ME| = |HE| − |HM| = (r√5 − r)/2
= s_10. Taking alternate vertices of the regular decagon one obtains the regular pentagon (5-gon)
and thus the whole sequence of regular 5-, 10-, 20-, ..., 5 · 2^n-gons.
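
The value of s_10 can be confirmed numerically in three ways. The sketch below compares the closed form with the chord subtending 36° (computed with the sine function, which this section does not use), checks the golden-section proportion, and repeats the construction with H, D and E as lengths; the function name decagon_side is hypothetical.

```python
import math


def decagon_side(r: float) -> float:
    """Side of the regular decagon inscribed in a circle of radius r: (r*sqrt(5) - r)/2."""
    return (r * math.sqrt(5) - r) / 2


r = 1.0
s10 = decagon_side(r)

chord = 2 * r * math.sin(math.radians(18))  # chord subtending the central angle 36°
HD = math.hypot(r, r / 2)                   # |HD| from Pythagoras' theorem
ME = HD - r / 2                             # |ME| = |HE| - |HM| with |HE| = |HD|

print(s10, chord, ME)
print(r / s10, s10 / (r - s10))             # golden section: r : s_10 = s_10 : (r - s_10)
```

All three lengths coincide, and the two ratios in the last line are equal.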


The regular 17-gon. The constructions of the regular polygons discussed above and also of the 
regular 15-gon (whose central angle of 24° is constructed as (60/4)° + (72/8)° = 15° + 9°) were 
known to the mathematicians of ancient Greece. It was not until 1796 that any further constructions 
of regular polygons were discovered. In that year Gauss, who had just turned nineteen, proved that it 
is possible to construct the regular 17-gon with ruler and compass. He writes in the introduction 
to his first scientific publication (1 June 1796): 

“It is well known to every beginner in geometry that certain regular polygons, namely those with 
three, five, and fifteen sides and those that can be obtained from these by doubling the number of 
sides, may be constructed by geometrical means. Knowledge had advanced thus far in Euclid’s 
time, and it seems that since then men have been of the conviction that the domain of elementary 
geometry extends no further; at least I know of no successful attempt to expand its boundaries in 
this direction. All the more worthy of notice, it seems to me, is the discovery that many other regular 
polygons, notably that of seventeen sides, may be constructed by geometrical means. 

This discovery is merely a corollary to a far-reaching theory which is not yet quite complete and 
on completion will be presented to the public.— C. F. Gauss of Brunswick, student of mathematics 
at Göttingen.”

The theory of which Gauss speaks is his theory of the cyclotomic (Greek: dividing the circle)
equation x^n − 1 = 0 (n a natural number). The nth roots of unity, the roots of this equation, lie
at regular intervals along the circumference of the unit circle of the Gaussian number plane, whose
centre is the intersection of the real and imaginary axes and whose radius is 1. Gauss showed that
a circle can be divided into n equal parts with ruler and compass alone if n is a prime number
of the form


2^(2^k) + 1 (k = 0, 1, 2, 3, ...).


And indeed the prime numbers 3, 5 and 17 can be obtained for k = 0, 1 and 2.
Gauss gave to his pupil GERLING the following real representation of cos φ for φ = (360/17)°,
which contains only rational numbers and square roots:


cos φ = −1/16 + (1/16)√17 + (1/16)√(34 − 2√17)
+ (1/8)√[17 + 3√17 − √(34 − 2√17) − 2√(34 + 2√17)].
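
The representation can be compared with a direct numerical evaluation of the cosine. This is only a floating-point check, not a proof, and the function name gauss_cos_17 is an illustrative choice.

```python
import math


def gauss_cos_17() -> float:
    """cos(360/17 degrees) evaluated from rational numbers and square roots only."""
    s17 = math.sqrt(17)
    inner_minus = math.sqrt(34 - 2 * s17)
    inner_plus = math.sqrt(34 + 2 * s17)
    return ((-1 + s17 + inner_minus) / 16
            + math.sqrt(17 + 3 * s17 - inner_minus - 2 * inner_plus) / 8)


print(gauss_cos_17())
print(math.cos(2 * math.pi / 17))

# The numbers 2^(2^k) + 1 for k = 0, 1, 2 mentioned in the text
print([2 ** (2 ** k) + 1 for k in range(3)])
```

The first two printed values agree to the accuracy of double-precision arithmetic.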




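Survey of regular convex polygons ($r$ = radius of circumscribed circle)


n  | central angle | side $s_n$ | perimeter | area
3  | 120° | $r\sqrt{3}$ | $2r \cdot 2.59807621\ldots$ | $\tfrac{3}{4}\sqrt{3}\,r^2 \approx 1.2990\,r^2$
4  | 90° | $r\sqrt{2}$ | $2r \cdot 2.82842712\ldots$ | $2r^2$
5  | 72° | $\tfrac{r}{2}\sqrt{10 - 2\sqrt{5}}$ | $2r \cdot 2.93892626\ldots$ | $\tfrac{5}{8}\sqrt{10 + 2\sqrt{5}}\,r^2 \approx 2.3776\,r^2$
6  | 60° | $r$ | $2r \cdot 3$ | $\tfrac{3}{2}\sqrt{3}\,r^2 \approx 2.5981\,r^2$
8  | 45° | $r\sqrt{2 - \sqrt{2}}$ | $2r \cdot 3.06146746\ldots$ | $2\sqrt{2}\,r^2 \approx 2.8284\,r^2$
10 | 36° | $\tfrac{r}{2}(\sqrt{5} - 1)$ | $2r \cdot 3.090169994\ldots$ | $\tfrac{5}{4}\sqrt{10 - 2\sqrt{5}}\,r^2 \approx 2.9389\,r^2$
12 | 30° | $r\sqrt{2 - \sqrt{3}}$ | $2r \cdot 3.10582854\ldots$ | $3r^2$
15 | 24° | $\tfrac{r}{2}\sqrt{7 - \sqrt{5} - \sqrt{30 - 6\sqrt{5}}}$ | $2r \cdot 3.11867536\ldots$ | $\tfrac{15}{8}\sqrt{7 + \sqrt{5} - \sqrt{30 + 6\sqrt{5}}}\,r^2 \approx 3.0505\,r^2$
16 | 22° 30′ | $r\sqrt{2 - \sqrt{2 + \sqrt{2}}}$ | $2r \cdot 3.12144515\ldots$ | $4\sqrt{2 - \sqrt{2}}\,r^2 \approx 3.0615\,r^2$
17 | 21° 10′ 35 5/17″ | $\approx 0.36749904\,r$ | $2r \cdot 3.12374180\ldots$ | $\approx 3.0706\,r^2$
20 | 18° | $r\sqrt{2 - \sqrt{(5 + \sqrt{5})/2}}$ | $2r \cdot 3.12868930\ldots$ | $\tfrac{5}{2}\sqrt{6 - 2\sqrt{5}}\,r^2 \approx 3.0902\,r^2$
24 | 15° | $r\sqrt{2 - \sqrt{2 + \sqrt{3}}}$ | $2r \cdot 3.13262861\ldots$ | $6\sqrt{2 - \sqrt{3}}\,r^2 \approx 3.1058\,r^2$


in general: $s_{2n} = \sqrt{2r^2 - r\sqrt{4r^2 - s_n^2}}$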
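
The general doubling formula can be applied repeatedly, starting from the hexagon, whose side equals the radius. The following Python sketch does this for $r = 1$. The function name `double_side` is an illustrative choice, and the printed last column is the perimeter divided by $2r$, as in the survey.

```python
import math


def double_side(s_n: float, r: float = 1.0) -> float:
    """Side of the regular 2n-gon from the side s_n of the regular n-gon."""
    return math.sqrt(2 * r * r - r * math.sqrt(4 * r * r - s_n * s_n))


n, s = 6, 1.0  # regular hexagon in the unit circle: side = radius
while n < 24:
    s = double_side(s)
    n *= 2
    print(n, s, n * s / 2)  # n, side, perimeter / (2r)
```

The rows for $n = 12$ and $n = 24$ of the survey are reproduced in this way.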


7.7. Mensuration of figures bounded by straight lines 


Measurement of area 


The basic unit of area is the square metre, m². It is defined as the area of
a square of side 1 metre. Further units derived from the square metre are:


In the Anglo-Saxon countries areas are still based on the square yard, 1 sq. yd. ≈ 0.8361 m² or
1 m² ≈ 1.196 sq. yd. From this one obtains:


1 square mile ≈ 2.59 km² or 1 km² ≈ 0.3861 sq. miles = 0.3861 mi²,
1 square foot ≈ 929 cm² = 0.0929 m² or 1 m² ≈ 10.76 sq. ft. = 10.76 ft²,
1 square inch ≈ 6.452 cm² or 1 cm² ≈ 0.155 sq. in. = 0.155 in².


The most common derived unit is the acre: 1 acre = 4840 sq. yd. ≈ 4046.86 m².


Mensuration of simple figures 


Squares. A square of side $a$ can be completely covered by unit squares. One obtains $a$ strips, each
containing $a$ unit squares, hence $a \cdot a = a^2$ unit squares (Fig.).


Figure: a square of side a = 6 cm covered by unit squares of 1 cm².


Rectangles. A rectangle of sides $a$ and $b$ can be covered by $a$ strips, each containing $b$ unit
squares (Fig.). Therefore its area is $a \cdot b$ units.

In both cases it is assumed that there exist unit squares whose sides are commensurable with the
sides of the square or rectangle to be measured. But the formulae also hold if the side lengths are
arbitrary real numbers. If $a$ and $b$ are rational multiples $a = (p_1/q_1) \cdot e$ and $b = (p_2/q_2) \cdot e$ of the
unit length $e$, then the area can be covered by squares of side length $e'$, where $e = [q_1, q_2] \cdot e'$ and
$[q_1, q_2]$ is the least common multiple of $q_1$ and $q_2$.

In Chapter 3. it was shown that every real number can be approximated to any required accuracy 
by rational numbers. Computation of areas of figures whose boundaries can be described by conti- 
nuous functions in a suitable coordinate system is possible by means of the integral calculus. 


Parallelograms. A general parallelogram can be transformed into a rectangle of the same area 
by removing a right-angled triangle from one side and replacing it on the other (Fig.). If the height 
of the parallelogram is defined as the length of the perpendicular from one side to the opposite 
side, then the area is given by the product of the length of a side and the corresponding height: 


$A = a \cdot h_a = b \cdot h_b.$


The area of a parallelogram is the product of the length 
of one side and the corresponding height. 


7.7-3 Area of a parallelogram 


Triangles. Any triangle can be regarded as half of a parallelogram (Fig.). Therefore the area of a triangle
is half the product of the base and the height: $A = c \cdot h_c/2$. If in a right-angled triangle the base
is chosen as one of the two shorter sides, the height is the other. Thus, if $a$ and $b$ are the two shorter
sides, the area is $a \cdot b/2$. The height of an equilateral triangle of side $a$ is $a\sqrt{3}/2$ (this is a corollary
to Pythagoras’ theorem); hence $A = a^2\sqrt{3}/4$.


The area of a triangle is half the 
product of base and height. 


7.7-4 Area of a triangle


The formula of HERON gives the area of a triangle in terms of its sides alone: 
$A = \sqrt{s(s - a)(s - b)(s - c)}$, where $2s = a + b + c$.


Probably this formula was already known to ARCHIMEDES. A Heron triangle is one in which all the
sides and the area can be expressed as rational numbers. For instance, if $a = 13$, $b = 14$ and $c = 15$,
then $s = 21$ and $A = \sqrt{21 \cdot (21 - 13)(21 - 14)(21 - 15)} = \sqrt{21 \cdot 8 \cdot 7 \cdot 6} = \sqrt{4^2 \cdot 3^2 \cdot 7^2} = 84$ units of
area. The area of a triangle is also dependent on its angles and the radii of the circumcircle or the 
incircle (see Chapter 11. — Further theorems and Applications). 
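
Heron's formula translates directly into code. The sketch below is a minimal Python version; the function name `heron_area` and the guard against impossible side lengths are additions made for illustration, not part of the text.

```python
import math


def heron_area(a: float, b: float, c: float) -> float:
    """Area of a triangle from its three sides, by Heron's formula."""
    s = (a + b + c) / 2  # semi-perimeter, so that 2s = a + b + c
    radicand = s * (s - a) * (s - b) * (s - c)
    if radicand < 0:
        raise ValueError("side lengths violate the triangle inequality")
    return math.sqrt(radicand)


print(heron_area(13, 14, 15))  # the Heron triangle of the text
```

For the sides 13, 14, 15 the function returns 84, in agreement with the worked example.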


Trapezia. To find the area of a trapezium ABCD it is reflected in the mid-point of one of its 
(non-parallel) sides (or rotated 180° about that point, which is the same thing) (Fig.). 

The resulting parallelogram AD′A′D has twice the area of the trapezium. The length of one side
is $a + c$, and the corresponding height $h$ is the height of the trapezium. Therefore $A = \tfrac{1}{2}(a + c)\,h$.

Since the length of the mid-line of the trapezium is $m = (a + c)/2$, one can also write $A = m \cdot h$.

The area of a trapezium is the product of its mid-line and its height.


7.7-5 Area of a trapezium 




If in a trapezium $a = c$, then it is a parallelogram and the formula reduces to $A = (a + a)\,h/2
= a \cdot h$, which is the formula for the area of a parallelogram. If $c = 0$, the trapezium degenerates
to a triangle, and $A = (a + 0)\,h/2 = a \cdot h/2$ becomes the formula for the triangle.


Kites. Kites are divided by their diagonals $e$ and $f$ into four right-angled triangles (Fig.). This
gives the formula $A = e \cdot f/2$, in which only the diagonals occur. The formula is also valid for a
rhombus, which is a special case of a kite, and for a square with diagonal $d$ it becomes $A = d^2/2$.


The area of a kite is half the product of the lengths of the diagonals. 


7.7-7 Area of a general multilateral


General polygons. It is not customary to set up formulae for the areas of general polygons except 
for regular polygons. The area can be calculated by subdividing the polygon into triangles or trapezia. 
Frequently it becomes necessary to calculate further auxiliary data for the heights and sides of the 
triangles. 

It is preferable to divide the polygons in such a way that the formulae given above can be used. 
What is decisive in practice is not so much the ease of calculation as the accuracy with which the 
required data can be measured. 

The figure shows an irregular polygon subdivided into triangles and trapezia. 


Area theorems for a right-angled triangle 


The theorem of Pythagoras. Owing to its central role in both calculations and proofs in plane 
geometry this theorem rightly ranks as one of the most famous of the subject. Its discovery is usually 
attributed to PYTHAGORAS of Samos (about 580-496 B.C.), but this is certainly not strictly true.
Historically verifiable details of Pythagoras’ life are very sparse. There are, however, many myths 
and legends about him and his life, and his theorem has even 
inspired poets. 


Theorem of Pythagoras. In a right-angled triangle the 
square on the hypotenuse is equal to the sum of the squares 
on the other two sides (Fig.). 


More than 100 different proofs of this theorem are known, of which the shortest is probably the
following. From the figure it is obvious that the area of the large square $(a + b)^2$ is the area of
the yellow square $c^2$ plus the area of the four red triangles $4ab/2 = 2ab$. Thus,

$(a + b)^2 = c^2 + 2ab$ or $a^2 + 2ab + b^2 = c^2 + 2ab$ and hence $a^2 + b^2 = c^2$.


7.7-8 The theorem of PYTHAGORAS

7.7-9 A proof by dissection


The theorem of Euclid. The classical proof of Pythagoras’ theorem is based on the following 
theorem of Euclid: 


In a right-angled triangle the square on one of the two shorter sides has the same area as the rect- 
angle whose sides are the projection of that side onto the hypotenuse and the hypotenuse itself (Fig.). 


7.7-10 Equal areas by theorem of EUCLID

7.7-11 Proof of the theorems of EUCLID and PYTHAGORAS


The triangle ABD has the same base and the same height |AC| as the square ACED so that it has 
half the area of the square (Fig.). If it is rotated by 90° about A, then D becomes C and B becomes F. 
The new triangle AFC is congruent to ABD. It has the same base |AF| and the same height |AH| 
as the rectangle AFGH. Thus, it has half the area of the rectangle, and the areas of the rectangle 
and the square must be equal, in other words, $b^2 = q \cdot c$, and similarly $a^2 = p \cdot c$. If these two
expressions are added, one obtains Pythagoras’ theorem:

$a^2 + b^2 = cp + cq = c(p + q) = c \cdot c = c^2$.

Some of the many applications of Pythagoras’ theorem are: 
the computation of the areas of regular polygons, of the 
distance between points in analytic geometry, of the height of 
an equilateral triangle, or of the height of a tetrahedron. 


The altitude theorem. A further interesting theorem on 
right-angled triangles is the altitude theorem: 
In a right-angled triangle the square of the altitude on the 
hypotenuse is equal in area to the rectangle on the two seg- 
ments of the hypotenuse. 


The proof uses the theorems on similarity (Fig.). Since the
altitude $|CD|$ divides $ABC$ into two equiangular, hence similar,
triangles $\triangle ADC \sim \triangle CDB$, it follows that


$q : h = h : p$ or $h^2 = p \cdot q$.

7.7-12 The altitude theorem
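
The theorem of Euclid and the altitude theorem can be checked together on a concrete right-angled triangle. The sketch below uses the 3-4-5 triangle as an arbitrary example; the variable names follow the text ($p$ and $q$ are the projections of $a$ and $b$ onto the hypotenuse), and the altitude is obtained from the area, which is an assumption of this sketch rather than a step of the proof.

```python
import math

a, b = 3.0, 4.0
c = math.hypot(a, b)  # hypotenuse, by Pythagoras

# Theorem of Euclid: a^2 = p*c and b^2 = q*c
p = a * a / c
q = b * b / c

# Altitude on the hypotenuse, from the area a*b/2 = c*h/2
h = a * b / c

print(p + q, c)      # the two segments make up the hypotenuse
print(h * h, p * q)  # altitude theorem: h^2 = p*q
```

Here $p = 1.8$, $q = 3.2$ and $h = 2.4$, so that $h^2 = p \cdot q = 5.76$.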


Transformation of areas 


Every convex polygon can be transformed into a square of equal area. Since triangles with the 
same base and height have the same area, the vertex $S_1$ of the polygon can be moved parallel to
the diagonal $d_1 = |S_nS_2|$ to a point $S_1'$ on the line $S_{n-1}S_n$. Now the area of $\triangle S_nS_1S_2$ is equal to that
of $\triangle S_nS_1'S_2$, so that the polygon has been transformed into an $(n - 1)$-gon of equal area (Fig.).
This process can be continued until the result is a triangle ABC (Fig.) of area equal to that of the 
original polygon. Its altitude is hp = |B′C|. The rectangle CDEF with the sides |FC| = hp/2 and
|CD| = |AB| has the same area as the triangle. If one takes |FC| = p and |EF| = q as the segments
of the hypotenuse (intersected by the altitude) of a right-angled triangle FGK, then the square
CIHK has the area $h^2 = p \cdot q$ (by the altitude theorem), and thus the area of the original polygon.
To find the triangle FGK construct |CG| = |CD| on the line through F and C and draw Thales’
circle on the diameter |FG|.

Transformations of other areas can be obtained from the following theorem on supplementary 
parallelograms. Like the preceding theorems, this is also contained in the first book of EUCLID;
it states: 


If from an arbitrary point on the diagonal of a parallelogram the parallels to the sides of the paral- 
lelogram are drawn, then of the four resulting parallelograms those two that are not intersected by 
the diagonal have equal area. (They are called supplementary parallelograms.) 


7.7-13a) Transformation of the $n$-gon $S_1S_2 \ldots S_n$ into an $(n - 1)$-gon of equal area

7.7-13b) Transformation of a pentagon into a square of equal area

7.7-14 The theorem on supplementary parallelograms


The proof is as follows (Fig.). The diagonal AC divides the parallelogram into congruent triangles
ABC and ACD. Since JG ∥ AD and AB ∥ EH, it is clear that |AJ| = |EF| and |JF| = |AE|. Therefore
the triangles AJF and AEF are congruent. Similarly FCG and FHC are congruent. If these
congruent triangles are taken away from the congruent triangles ABC and ACD, the two
parallelograms in question remain. Hence they must have the same area.


7.8. Similarity 


The concept of similarity 


The ratio of similarity. Similarity is the geometric relationship between figures having the same 
shape, but not necessarily the same dimensions (Fig.). Similar figures can be transformed into one 
another by a one-to-one geometric transformation 
that preserves the angles of the figures. An 
equivalent definition is the following: 


In similar figures segments of one are in a fixed 
ratio to the corresponding segments of the other. 


7.8-1 Similar triangles


For example, if a triangle ABC is mapped to a triangle A′B′C′ in such a way that $\alpha = \alpha'$, $\beta = \beta'$
and $\gamma = \gamma'$, then these triangles are similar (notation $\triangle ABC \sim \triangle A'B'C'$) (Fig.). The equality of
the angles also implies that $a' : a = b' : b = c' : c = k$. This constant ratio $k$ between the corresponding
segments of similar figures is called the ratio of similarity. If $k > 1$, the image is larger than the
original and the transformation may be called an amplification. If k = 1, the image is congruent 
to the original and the transformation is a congruence transformation. For 0 < k < 1, the image 
is smaller than the original and the transformation is a contraction. 




Perspective. Two similar figures can be moved by congruence transformations into special relative 
positions in which corresponding segments are parallel. In this position they are said to be in per- 
spective. If two figures are in perspective, the transformation mapping one onto the other may be 
obtained from a pencil of (possibly parallel) lines. The carrier of the pencil is called a centre of 
perspective. 

From the ratios given above for similar triangles it is easy to derive the following
proportions: $a : b = a' : b'$; $a : c = a' : c'$; $b : c = b' : c'$. These relations are often
expressed by the continued proportion $a : b : c : \ldots : n = a' : b' : c' : \ldots : n'$, from
which one reads off that in similar figures the ratio between two segments of one is
equal to the ratio between the corresponding segments of the other.


The intercept theorems 
The following theorems are immediate consequences of the definition of similarity.

7.8-2 Figures in perspective

If the lines of a pencil are intersected by two parallel lines, then
a) the segments on any one line of the pencil are in the same ratio as the segments on any other;
b) the segments on the two parallel lines between any two fixed lines of the pencil are in the same
ratio as the segments on any line of the pencil between each of the parallels and the carrier of the pencil;
c) the segments on the two parallel lines between any two lines of the pencil are in the same ratio as
the segments between any two other lines of the pencil.
Thus, in the figure the following proportions hold:
by a) $|SA| : |AB| = |SC| : |CD|$, $|SA| : |SB| = |SC| : |SD|$, $|SA'| : |SB'| = |SC'| : |SD'|$,
$|SA'| : |A'B'| = |SC'| : |C'D'|$;
by b) $|AC| : |BD| = |SA| : |SB|$, $|A'C'| : |B'D'| = |SA'| : |SB'|$;
by c) $|A_1A_2| : |B_1B_2| = |A_2A_3| : |B_2B_3| = |A_3A_4| : |B_3B_4|$.


7.8-3 The intercept theorems


These theorems are concerned with the possibility of comparing the lengths of two segments 
and are closely related to the classical problem of commensurability. Two segments are commensurable 
if by choosing a suitable unit of measurement they can both be made to have rational lengths. The 
concept of commensurability is historically important because in classical mathematics only rational, 
not arbitrary real numbers were admitted for the computations. The theorems are widely used in
many proofs, constructions and measuring processes. An example is the triangular gauge. In the
example of the figure, the theorems yield the following proportions:


case 1: x: 1 = 5.2: 10 
or x = 0.52 cm, 
case 2: x: 1 = 6.3: 10 
or x = 0.63 cm. 


7.8-4 Wedge gauges 




The theorems can also be used to measure heights (say of a tree) or distances (such as the
width of a river) (Fig.).


7.8-5 Measurement of width and height by the intercept theorems,
$|FA| : |BE| = |AB| : |ED|$ or $|AB| : |AD| = |BC| : |DE|$, respectively


Theorems on similarity 


In the definition of similarity it is required that either all the angles of the figures or all the ratios 
between corresponding segments are equal. The similarity theorems state that if certain angles and 
ratios satisfying certain conditions are equal, then so are all the others. 

There are four principal theorems. 


Two triangles are similar if
1. two angles of one are equal to the corresponding angles of the other,
2. the ratio between two sides of one is equal to the ratio between the corresponding sides of the
other and the enclosed angles are equal,
3. the ratio of two sides of one is equal to the ratio of the corresponding sides of the other and the
angles opposite the larger sides are equal,
4. the ratios between two pairs of sides of one are equal to the ratios of the corresponding pairs of the
other.


Since actual lengths are not important in questions of similarity, but only the ratios between 
them, the hypotheses of the theorems require one datum less than the corresponding congruence 
theorems. 

The similarity theorems are of great importance in the proofs of other theorems of plane geometry,
for instance, that the medians of a triangle intersect in a point which divides them in the ratio
$2 : 1$.


In a right-angled triangle the length of the altitude on the hypotenuse is the geometric mean of the 
segments into which it divides the hypotenuse. 


In the figure the altitude CH divides the triangle ABC into the smaller
triangles AHC and BHC. $\triangle AHC$ has $\angle CAH$ in common with $\triangle ABC$ and
they both have right angles, so that they are similar by the first similarity
theorem. Analogously $\triangle BCH \sim \triangle ABC$. Consequently $\triangle AHC \sim \triangle BHC$.


In a right-angled triangle the length of one of the shorter sides is the geometric mean between the
lengths of the hypotenuse and of its projection onto the hypotenuse.


7.8-6 Theorems on right-angled triangles


From $\triangle AHC \sim \triangle ABC$ it follows that $|AH| : |AC| = |AC| : |AB|$ and from $\triangle BHC \sim \triangle ABC$ it
follows that $|BH| : |BC| = |BC| : |AB|$.

If this is rewritten as $q : b = b : c$ or $p : a = a : c$, and then transformed to $b^2 = qc$ and $a^2 = pc$,
it becomes evident that this theorem is equivalent to the theorem of Euclid. Similarly it follows
immediately that $|AH| : |CH| = |CH| : |HB|$ or $q : h = h : p$. The analogue to the altitude
theorem is therefore:


The altitude on the hypotenuse divides a right-angled triangle into two triangles that are both 
similar to the original triangle and hence to each other. 


Division of a segment 


Internal and external division. The theory of similarity can be used to divide any segment in any
rational proportion $\lambda = m : n$. It is the convention to make $\lambda > 0$ if the dividing point is between
the end-points of the segment. This is called an internal division. One speaks of external division if
the point is outside the segment and then takes $\lambda < 0$. Thus, $\lambda$ is always taken as the ratio
$|AT| : |TB| = m : n$, where $A$ and $B$ are the end-points of the segment and $T$ is the dividing point.

$T$ is constructed by drawing a line through $A$ (not containing the segment) and marking off on
it a segment $|AC| = mu$, where $u$ is any suitable unit of length. On the parallel to $AC$ through $B$ the
two segments $|BD| = nu$ and $|BE| = nu$ are marked off on both sides of $B$ using the same unit of length
$u$ (Fig.). The intersections $T_i$ and $T_e$ of the lines $CD$ and $CE$ with $AB$ divide the segment internally
and externally in the ratio $m : n$. The points $T_i$ and $T_e$ are said to divide the segment harmonically,
because the ratios of the partial segments are equal, and on an oriented line have opposite signs.

The ratio $\lambda$ can also be an arbitrary real number if arbitrary, possibly incommensurable, segments
$|AC|$ and $|BD| = |BE|$ are admitted. If $m$ and $n$ are integers, the construction can also be used to
divide $AB$ into $m + n$ equal intervals.
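
In coordinates, the two dividing points can be computed directly instead of being constructed. The sketch below is an illustration under that assumption: the function name `divide_segment` and the coordinate formulae are not part of the text, but they produce the points that divide $AB$ internally and externally in the ratio $m : n$.

```python
def divide_segment(a, b, m, n):
    """Return the internal and external division points of AB in the ratio m : n.

    a and b are (x, y) pairs; m and n are positive numbers.
    For m == n the external point lies at infinity and None is returned for it.
    """
    (ax, ay), (bx, by) = a, b
    internal = ((n * ax + m * bx) / (m + n), (n * ay + m * by) / (m + n))
    if m == n:
        external = None
    else:
        external = ((m * bx - n * ax) / (m - n), (m * by - n * ay) / (m - n))
    return internal, external


t_i, t_e = divide_segment((0.0, 0.0), (6.0, 0.0), 2, 1)
print(t_i, t_e)
```

For $A = (0, 0)$, $B = (6, 0)$ and the ratio $2 : 1$ the internal point is $(4, 0)$ and the external point is $(12, 0)$; in both cases the distances to $A$ and $B$ are in the ratio $2 : 1$.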


7.8-7 Division of a segment 


7.8-8 The golden section, |BM| = |AB|/2, 


Golden section. It is also possible to divide a segment in such a way that the ratio of the whole 
segment to the larger part is equal to the ratio of the larger part to the smaller part, that is, |AB| : |AT| 
= |AT|: |TB|. The figure shows the construction. By the secant-tangent theorem |AB|? = |AS| - |AQ| 
or |AB|: |AQ| = |AS|: |AB|. Since |AS| = |AT| and |AB| = |SQ|, subtraction of corresponding 
quantities gives |AB|:(|AQ| — |SQ|) = |AS|:(|AB| — |AT|) or |AB|: |AT| = |AT|:|TB|. This 
division is called the golden section. It is of historical importance in aesthetics, because it has fre- 
quently been held that a condition for ideal beauty of figures (including the human form) is that 
the various parts should have the proportions of the golden section. 
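
The construction can be followed numerically. The sketch below assumes, as the caption and the use of the secant-tangent theorem suggest, that $BM$ is perpendicular to $AB$ with $|BM| = |AB|/2$ and that $S$ is the nearer intersection of the line $AM$ with the circle about $M$ through $B$; the variable names are illustrative.

```python
import math

ab = 1.0                 # length of the segment AB
bm = ab / 2              # radius of the auxiliary circle, |BM| = |AB|/2
am = math.hypot(ab, bm)  # |AM| by Pythagoras, since BM is perpendicular to AB
at = am - bm             # |AT| = |AS| = |AM| - |MS|
tb = ab - at             # remaining part |TB|

print(ab / at, at / tb)  # both ratios equal the golden ratio
```

Both ratios come out as $(1 + \sqrt{5})/2 \approx 1.618$.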


7.9. Circles 


Notation 


A circle is the set of all points in a plane whose distance from a given point is a fixed constant
length. To distinguish between the disc bounded by the circle and the circle itself, this boundary
is called the circumference of the circle. The point equidistant from all the points on the circle is
called its centre. A straight-line segment (or interval) from the centre of a circle to a point on the
circumference is called a radius. Any interval between two points on the circumference lies entirely
inside the circle, so that the circle is a convex figure. A line through two points on the circumference
is called a secant and the interval between them a chord. Chords containing the centre of the circle,
the so-called diameters, are the longest chords in any circle. Lines that have only one point in common
with a circle are called tangents. The segment of the circumference between two points on it is called
an arc. Angles whose arms are secants and whose vertex lies on the circumference are said to be
subtended by the arc (or chord) between their arms. Angles whose vertex is the centre and whose arms
are radii are said to be subtended at the centre by the arc (or chord) between their arms. The portion
of the disc between two radii is called a sector, and the portion of a sector between the chord
connecting the two end-points of the radii and the circumference of the circle is called a segment of
the circle. (To avoid confusion straight-line segments will be called intervals in this section.) (Fig.)


7.9-1 The parts of a circle


Any angle subtended at the circumference of a circle is half the angle subtended at the centre by 
the same arc. 


The proof distinguishes three cases (Fig.), depending on whether the centre $M$ lies on one of the
arms of the angle (left), between its arms (middle), or outside them (right). In the first case $\triangle AMS_1$
is isosceles, because $|AM| = |MS_1| = r$. Thus $\angle AS_1M = \angle MAS_1 = \beta$. Since the central angle $\alpha$
is an external angle of $\triangle AMS_1$, it follows that $\alpha = 2\beta$. The two other cases are reduced to the first
by drawing the diameters $S_2D$ and $S_3E$.


7.9-2 Angles at the circumference and the centre 


Every central angle has a unique arc and vice versa. An angle 
at the circumference has a unique arc, but an arc subtends 
infinitely many angles at the circumference. 

Angles at the circumference subtended by the same arc are 
equal. Angles subtended at the circumference by a semicircle (or 
diameter) are right-angles. 


If the vertex of an angle at the circumference is moved from 
A to B, then as it approaches B, one of its arms approaches 
the tangent at B and the other the secant through A and B 
(Fig.). 

The angle between a chord and the tangent at one of its end-points is equal to the angle the chord
subtends at the circumference.

7.9-3 Angles at the circumference and the angle between a chord and a tangent


If the perpendicular $ML$ is drawn to the chord $AB$, then
$\angle ABT = 90^\circ - [90^\circ - \alpha/2] = \alpha/2 = \beta$.


Theorems on tangents to a circle 


A radius of a circle and the tangent through its end-point are perpendicular to each other. Con- 
versely, the perpendicular to a radius in its end-point is a tangent. 


The figure consisting of a circle and a tangent is symmetrical about an axis. The axis of symmetry 
is the line through the centre M of the circle and the point B at which the tangent touches the circle. 
The figure consisting of a circle, a point P outside the circle and the tangents from P to the circle is 
also symmetrical about the line connecting P to the centre M of the circle. This line is called the 
central line of the figure (Fig.). The following statements are consequences of this symmetry. 


1. The central line bisects the angle between the two tangents from a point to a circle.

2. On the two tangents from P to the circle the intercepts between P and the points at which they 
touch the circle are equal. 

3. The central line is the perpendicular bisector of the chord connecting the two points at which 
the tangents touch the circle. 


7.9-4 Tangents to a circle

7.9-5 The tangents from P to a circle about M


Constructions. The above theorems on circles and tangents are the basis of all constructions 
involving tangents. 

To construct a tangent from a point outside a circle to that circle, another circle of radius $|HP|$ is
drawn with centre in the mid-point $H$ of the central line $MP$. This intersects the circle in $B_1$ and
$B_2$. Then $PB_1$ and $PB_2$ are the required tangents (theorem of Thales) (Fig.).
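
The same construction can be carried out in coordinates by intersecting the given circle with the Thales circle on $MP$. The following Python sketch is one possible way to do this; the function name `tangent_points` and the closed-form expressions for the intersection are assumptions of the sketch, not part of the text.

```python
import math


def tangent_points(m, r, p):
    """Points of contact of the two tangents from p to the circle (centre m, radius r)."""
    (mx, my), (px, py) = m, p
    d = math.hypot(px - mx, py - my)  # |MP|
    if d <= r:
        raise ValueError("P must lie outside the circle")
    # The Thales circle has centre H (mid-point of MP) and radius d/2.
    # Its common chord with the given circle meets MP at distance r^2/d from M.
    ux, uy = (px - mx) / d, (py - my) / d      # unit vector from M towards P
    along = r * r / d                          # distance from M to the common chord
    across = r * math.sqrt(d * d - r * r) / d  # half the length of the common chord
    fx, fy = mx + along * ux, my + along * uy  # foot of the common chord on MP
    b1 = (fx - across * uy, fy + across * ux)
    b2 = (fx + across * uy, fy - across * ux)
    return b1, b2


b1, b2 = tangent_points((0.0, 0.0), 3.0, (5.0, 0.0))
print(b1, b2)
```

For a circle of radius 3 about the origin and $P = (5, 0)$ the points of contact are approximately $(1.8, 2.4)$ and $(1.8, -2.4)$; the radius to each of them is perpendicular to the corresponding tangent.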

To construct a tangent at a point $B$ on the circle. The radius $BM$ is produced through $B$ to $C$ with
$|BC| = |BM|$. With $C$ and $M$ as centres arcs of radius greater than $|MB|$ are drawn. The line
connecting their intersections $D_1$ and $D_2$ is the required tangent (Fig.).




7.9-6 The tangent at B to a circle about M 


7.9-7 External tangents to two circles 


To construct the external common tangents to two circles. Let $M_1$ and $M_2$ be the centres of the
circles, and their radii $r_1$ and $r_2$ ($r_1 < r_2$). With $M_2$ as centre a circle of radius $r_2 - r_1$ and the
tangents from $M_1$ to this circle are drawn. The parallels $B_1B_2$ and $B_3B_4$ to these tangents at a distance
$r_1$ are the required tangents (Fig.).

These tangents give the shape of a belt drive around two wheels of radii $r_1$ and $r_2$, respectively.

To construct the internal common tangents to two circles. With $M_2$ as centre a circle of radius
$r_1 + r_2$ and the tangents from $M_1$ to this circle are drawn. The parallels $B_1B_2$ and $B_3B_4$ to these
tangents at a distance $r_1$ are the internal tangents (Fig.).

In this case the shape is that of a reversing belt drive. 


7.9-8 Internal tangents to two circles 


7.9-9 A circle with 
inscribed and circumscribed hexagons 


Computations for a circle 


The circumference. It is possible to give bounds for the circumference of a circle of diameter $d$
by inscribing and circumscribing polygons (Fig.); for example, the circumference $c_i = 3d$ of the
inscribed regular hexagon is a lower bound for the circumference $c$ of the circle, and the circumference
$c_e = 2\sqrt{3}\,d < 3.47d$ of the circumscribed hexagon is an upper bound, that is: $3d < c < 3.47d$.
The factor by which $d$ must be multiplied to obtain $c$ is denoted by the Greek letter $\pi$: $c = \pi \cdot d$.


This number is one of the most important and interesting mathematical constants. One can find
arbitrarily accurate approximations of $\pi$ by increasing the number of sides of the polygons used.
ARCHIMEDES used a 96-gon and found bounds that are still frequently used today. His values are
$3\tfrac{10}{71} < \pi < 3\tfrac{10}{70}$ or $3.14084507 < \pi < 3.14285714$.
The first 40 places after the decimal point are given by

$\pi = 3.14159\,26535\,89793\,23846\,26433\,83279\,50288\,41971\ldots$
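
The bounds of Archimedes can be reproduced by doubling the number of sides four times, from the hexagon to the 96-gon. The sketch below uses a standard doubling recurrence for the perimeters of the circumscribed and inscribed polygons (harmonic mean followed by geometric mean); this recurrence is an assumption of the sketch and is not taken from the text. All perimeters are expressed as multiples of the diameter $d$.

```python
import math

upper = 2 * math.sqrt(3)  # circumscribed hexagon: perimeter / d
lower = 3.0               # inscribed hexagon: perimeter / d
n = 6
while n < 96:
    upper = 2 * upper * lower / (upper + lower)  # circumscribed 2n-gon
    lower = math.sqrt(upper * lower)             # inscribed 2n-gon
    n *= 2

print(n, lower, upper)
```

The two perimeters of the 96-gon lie strictly between $3\tfrac{10}{71}$ and $3\tfrac{10}{70}$, and they enclose $\pi$.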


The following rough calculation shows what an accuracy of “only” 30 places of decimals means.
A system of stars that astronomers can just make visible by hour-long exposures on photographic
plates using the most powerful telescope emitted the light that is trapped by the plate
about 2000 million years ago. Since light travels about $9.5 \cdot 10^{12}$ km per year, these stars are about
$2 \cdot 10^9 \cdot 9.5 \cdot 10^{12}$ km $= 1.9 \cdot 10^{22}$ km away from the earth. The circumference of a circle with this
enormous distance as radius is $c = 2\pi r = 3.8\pi \cdot 10^{22}$ km. If in calculating this circumference in
kilometres only the first 30 decimal places of $\pi$ are used, the error occurs in the eighth place after
the decimal point and is of the order of about 2 units. That is, the error caused by disregarding
further places of $\pi$ is about 20 micrometres or 0.02 mm. It is obvious that this kind of accuracy is
never required in practice. The usual approximations are $\pi \approx 3.14$ or $\pi \approx 3\tfrac{1}{7}$ for two places of
decimals, or $\pi \approx 3.1416$ for four places (Fig.).

7.9-10 The number $\pi$ to two places of decimals
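
The size of this error can be reproduced with exact decimal arithmetic. The sketch below is only an illustration: it uses Python's `decimal` module, takes the 40-place value of $\pi$ as the reference, and truncates it after 30 places; the variable names are illustrative.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # enough working digits for this estimate

PI_40 = Decimal("3.1415926535897932384626433832795028841971")
pi_30 = Decimal("3.141592653589793238462643383279")  # truncated after 30 decimals

radius_km = Decimal("1.9e22")
error_km = 2 * radius_km * (PI_40 - pi_30)  # error of the circumference 2*pi*r
error_mm = error_km * Decimal("1e6")        # 1 km = 10^6 mm

print(error_km, error_mm)
```

The error is a little under $2 \cdot 10^{-8}$ km, that is, about 0.02 mm.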

Since $\pi$ is a transcendental number, no square can be constructed by ruler and compass whose
area is equal to that of a given circle (the problem of squaring the circle). 


Area. The area of a circle can also be approximated by the areas of inscribed and circumscribed
polygons, with $\pi$ occurring in the formula. The area of a circle $A = \pi r^2 = \pi(d/2)^2$ is proportional
to the square of the radius (Fig.).


Area of an annulus. The difference of the areas of two concentric circles of
diameters $d_1$ and $d_2 > d_1$ is (Fig.):

$A = \pi d_2^2/4 - \pi d_1^2/4 = (\pi/4)(d_2 + d_1)(d_2 - d_1)$.


7.9-11 Circumference and area 
of a circle 


7.9-12 Annulus 


Area of a sector of a circle. Since the area $A$ of the sector depends on the
central angle (Fig.) and since $\alpha = 360^\circ$, respectively $\hat{\alpha} = 2\pi$, gives the full
circle, if $\alpha$ is measured in degrees and $\hat{\alpha}$ in rad, one can set up the proportions:

$A : \pi r^2 = \alpha : 360^\circ = \hat{\alpha} : 2\pi \;\Rightarrow\; A = (\alpha/360^\circ)\,\pi r^2 = \hat{\alpha} r^2/2$.

The length $b$ of the arc bounding the sector is calculated by $b : \hat{\alpha} = r$, and
this can be substituted in the formula for the area.


7.9-13 
Sector of a circle 


Area of a segment of a circle. The area of a segment is calculated as the difference of the areas of 
the sector and the triangle AMB (Fig. 7.9-14). 
$A = br/2 - s(r - h)/2$, where $s$ is the length of the chord and $h$ is the height of the segment.
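
The formulae for sector, arc and segment can be combined in a few small functions. The Python sketch below is an illustration only; the function names are invented, and the chord $s$ and the height $h$ of the segment are computed from the central angle by elementary trigonometry, which the text does not spell out.

```python
import math


def sector_area(r: float, alpha_deg: float) -> float:
    """Area of a sector with central angle alpha (in degrees)."""
    return alpha_deg / 360.0 * math.pi * r * r


def arc_length(r: float, alpha_deg: float) -> float:
    """Length b of the arc: b = alpha (in rad) times r."""
    return math.radians(alpha_deg) * r


def segment_area(r: float, alpha_deg: float) -> float:
    """Sector minus triangle AMB, using A = b*r/2 - s*(r - h)/2."""
    half = math.radians(alpha_deg) / 2
    b = arc_length(r, alpha_deg)
    s = 2 * r * math.sin(half)    # chord AB
    h = r * (1 - math.cos(half))  # height of the segment
    return b * r / 2 - s * (r - h) / 2


print(sector_area(1.0, 90), arc_length(1.0, 90), segment_area(1.0, 90))
```

For $r = 1$ and a right angle at the centre the sector has area $\pi/4$ and the segment has area $\pi/4 - 1/2$.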


Area of an arbelos. There are many figures bounded by circular arcs. The properties of the two
shown here, the arbelos (cobbler’s knife) and the salinon, were investigated by ARCHIMEDES. An
arbelos consists of two small semicircles inside a large one, such that the sum of the diameters
of the small semicircles is the diameter of the large one (Fig.). If the common tangent $DC$ of the
two small semicircles is drawn, then the circle $K$ on the semichord $|DC| = h$ as diameter has the
same area as the arbelos. This semichord $|DC|$ can be taken as the altitude of a right-angled
triangle $ABC$ with $|AB| = d$ as the hypotenuse. Thus, by the altitude theorem, $h^2 = q(d - q)$. Hence,
if $A_K$ is the area of the circle of diameter $h$, the following equation holds:
$A_K = \pi h^2/4 = \pi q(d - q)/4$. The area of the arbelos is


$A_{arb} = (1/2)(A_{AB} - A_{AD} - A_{DB}) = (\pi/8)(d^2 - q^2 - (d - q)^2) = \pi q(d - q)/4$.
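
A numerical check of this equality is sketched below for arbitrarily chosen values $d = 10$ and $q = 3$; the helper `circle_area` and the variable names are illustrative.

```python
import math


def circle_area(diameter: float) -> float:
    """Area of a full circle with the given diameter."""
    return math.pi * diameter ** 2 / 4


d, q = 10.0, 3.0  # large diameter and one of the small diameters

# Half of (large circle minus the two small circles)
arbelos = (circle_area(d) - circle_area(q) - circle_area(d - q)) / 2

h = math.sqrt(q * (d - q))  # semichord, by the altitude theorem
print(arbelos, circle_area(h))
```

Both areas equal $\pi q(d - q)/4 = 5.25\pi$.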


7.9-14 Segment of a circle 
7.9-15  Arbelos 


7.9-16 Salinon 


Area of a salinon. A salinon consists of two small semicircles of equal diameter $e$ inside a large
semicircle of diameter $d$ and a further semicircle of diameter $d - 2e$ between them on the opposite
side of $d$ (Fig.). The area of a salinon is equal to that of a circle of diameter $d - e$, which is the
sum of the radii of the large semicircle and the semicircle on the other side of its diameter. By the
figure, the areas of this circle $K$ and the salinon $S$ are given by


$A_K = \pi(d - e)^2/4$;
$A_S = (A_{AB} - 2A_{AC} + A_{CD})/2 = \pi[d^2 - 2e^2 + (d - 2e)^2]/8 = \pi(d^2 - 2de + e^2)/4 = \pi(d - e)^2/4$.


7.9-17 The lunulae of Hippocrates

The lunulae of Hippocrates. There are also some famous moon-shaped figures. The best
known of these are the crescents (or lunulae) of Hippocrates. By the theorem of Thales the
triangle $ABC$ in the left-hand figure is right-angled (Fig.); thus $c^2 = a^2 + b^2$. The semicircle
on $|AB| = c$ has the area $A_{AB} = \pi c^2/8$; the sum of the areas of the semicircles on $|AC|$ and
$|BC|$ is $A_{AC} + A_{BC} = \pi(b^2 + a^2)/8$ and is thus equal to $A_{AB}$. From this it follows that:


The sum of the areas of the two crescents is the area of the triangle. 
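
This statement can be verified numerically. In the sketch below the two crescents are obtained by adding the triangle to the semicircles on the two shorter sides and removing the semicircle on the hypotenuse; the 3-4-5 triangle and the helper `semicircle_area` are arbitrary illustrative choices.

```python
import math


def semicircle_area(diameter: float) -> float:
    """Area of a semicircle with the given diameter."""
    return math.pi * diameter ** 2 / 8


a, b = 3.0, 4.0
c = math.hypot(a, b)  # hypotenuse
triangle = a * b / 2

# Small semicircles plus triangle, minus the large semicircle on the hypotenuse
crescents = semicircle_area(a) + semicircle_area(b) + triangle - semicircle_area(c)
print(crescents, triangle)
```

The terms containing $\pi$ cancel, and the crescents together have the area 6 of the triangle.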


Similarly the sum of the areas of the crescents in the right-hand figure is the area of the square. 
Misled by the remarkable fact that π does not occur in these formulae, many mathematicians of
ancient times (including Hippocrates himself) continued the hopeless search for a method of
squaring the circle.
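The area bookkeeping behind the crescent theorem can be written out as a short sketch. The function name and the 3-4-5 sample triangle are assumptions chosen for illustration.

```python
import math

def lune_area_sum(a, b):
    """Sum of the two crescents over the legs a and b of a right-angled triangle."""
    c = math.hypot(a, b)                      # hypotenuse, c^2 = a^2 + b^2
    semicircle = lambda diameter: math.pi * diameter**2 / 8
    triangle = a * b / 2
    # Small semicircles plus triangle cover the crescents and the large semicircle.
    return semicircle(a) + semicircle(b) + triangle - semicircle(c)

print(lune_area_sum(3.0, 4.0))   # compare with the triangle area 3 * 4 / 2
```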




Theorems on chords, secants and tangents 
By rotating a circle about its centre one sees immediately that: 


Chords of equal length are equidistant from the centre; conversely, equidistant chords have the 
same length. 


If in a circle a chord c₁ is longer than a chord c₂, it is closer to the centre than c₂; hence the diameters
are the longest chords.

If the lines of a pencil intersect a circle, then the products of the two intercepts on each line between 
the carrier of the pencil and the circumference of the circle are equal to a fixed constant. This is a 
summary and generalization of the statements of the three following theorems. 


The chord theorem: If two chords of a circle intersect, then the product of the intercepts on one 
chord is equal to the product of the intercepts on the other. 




The proof is as follows (Fig.): the angles ∠B₁A₁B₂ and ∠B₁A₂B₂ are equal, because they subtend
equal arcs, and so are ∠A₁B₁A₂ and ∠A₁B₂A₂. Hence △A₁SB₂ ~ △A₂SB₁ and |SA₁| : |SB₂|
= |SA₂| : |SB₁| or |SA₁| · |SB₁| = |SA₂| · |SB₂|.


7.9-18 The chord theorem 


7.9-19 The secant theorem 


The secant theorem: If two secants intersect outside a circle, 
then the product of the intercepts between the intersection and 
the circle on one is equal to the product of the two intercepts 
on the other (Fig.). 




The proof of the relation |SA₁| · |SB₁| = |SA₂| · |SB₂| is completely analogous to the proof of
the chord theorem.


The secant-tangent theorem: If a secant intersects a 
tangent to a circle, then the length of the intercept on 
the tangent between the point of intersection and the 
point of contact is the geometric mean of the lengths of 
the intercepts of the secant. 


The proof is obtained by taking this as the extreme case of the secant theorem. As one secant
tends to a tangent, the lengths of the two intercepts on it become equal (in the limit) and the
theorem reduces to |SA|² = |SA₁| · |SB₁| or t² = a · b (Fig.). The constant product is called the
power of the carrier of the pencil with respect to the circle.


7.9-20 The secant-tangent theorem
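The constancy of the product of the intercepts can be illustrated numerically. The sketch below assumes Cartesian coordinates and uses signed distances along each line, so the product is positive for a carrier outside the circle and negative for one inside; the function names are hypothetical.

```python
import math

def intercepts(S, direction, centre, radius):
    """Signed distances from S to the two points where the line through S meets the circle."""
    dx, dy = direction
    norm = math.hypot(dx, dy)
    dx, dy = dx / norm, dy / norm
    fx, fy = S[0] - centre[0], S[1] - centre[1]
    b = fx * dx + fy * dy
    c = fx * fx + fy * fy - radius * radius
    disc = b * b - c
    if disc < 0:
        raise ValueError("line misses the circle")
    root = math.sqrt(disc)
    return -b - root, -b + root

def power_of_point(S, centre, radius):
    """Squared distance of S from the centre minus the squared radius."""
    return (S[0] - centre[0])**2 + (S[1] - centre[1])**2 - radius**2

S, centre, radius = (5.0, 0.0), (0.0, 0.0), 3.0
for direction in ((-1.0, 0.1), (-1.0, 0.3)):
    t1, t2 = intercepts(S, direction, centre, radius)
    print(t1 * t2, power_of_point(S, centre, radius))
```

For the exterior point above, the square root of the power is the tangent length t of the secant-tangent theorem.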


Quadrilaterals of chords and tangents 


Quadrilaterals of chords. If all the sides of a quadrilateral are chords of a circle, it is called a 
cyclic quadrilateral. Then the theorems on angles subtended by an arc at the centre and circum- 
ference yield the following result: 


In a cyclic quadrilateral the sum of opposite interior angles is equal to 180° (Fig.). 


Conversely, if the sum of opposite interior angles of a quadrilateral is 180°, then it is cyclic. 




7.9-22 A quadrilateral of tangents 


Quadrilaterals of tangents. If the sides of a quadrilateral touch a circle, then (Fig.)
|AE| = |AH|, |BE| = |BF|, |CF| = |CG| and |DG| = |DH|, because the intercepts of
two tangents from a single point to a circle are equal. From this it follows that
|AE| + |EB| + |CG| + |GD| = |BF| + |FC| + |DH| + |HA|,
hence |AB| + |CD| = |BC| + |DA| or a + c = b + d.


7.9-21 A cyclic quadrilateral


In a quadrilateral of tangents the sums of the lengths of opposite sides are equal. 


The validity of the converse can also be seen from the figure. 

These theorems have many corollaries; for instance, it follows that a square has an inscribed 
and a circumscribed circle, that a rectangle has a circumscribed but no inscribed circle and that a 
rhombus has an inscribed but no circumscribed circle. A general parallelogram has neither an 
inscribed nor a circumscribed circle. 




7.10. Geometric loci 


A locus is a set of points defined by a rule that makes it possible to decide for any given point 
whether or not it belongs to the set. The locus then contains exactly the points of that set and no 
others. Thus, the locus of all points in space at a fixed distance r from a given point M is the sphere 
of radius r with centre at M. Similarly, the locus of all points in space having constant distance 
from a line is the curved surface of a cylinder with the given line as axis. 

In the plane, loci can be curves such as circles, parabolas or straight lines. Frequently, a point is 
determined as an intersection of two loci. For instance, the position of the point C whose distance
from the end-points A and B of a segment c is a or b, respectively, is determined up to a reflection 
in the straight line AB by the condition that it must be in the locus of points of distance a from B 
and also in the locus of points of distance b from A (Fig.). 
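Finding C as the intersection of two loci amounts to intersecting two circles. The following is a minimal sketch in Cartesian coordinates; the function name and the sample triangle with sides 3, 4, 5 are assumptions made for illustration.

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Return the 0, 1 or 2 common points of two circles in the plane."""
    (x1, y1), (x2, y2) = c1, c2
    dist = math.hypot(x2 - x1, y2 - y1)
    if dist == 0 or dist > r1 + r2 or dist < abs(r1 - r2):
        return []
    along = (r1**2 - r2**2 + dist**2) / (2 * dist)   # distance from c1 to the common chord
    half = math.sqrt(max(r1**2 - along**2, 0.0))     # half the length of the common chord
    ux, uy = (x2 - x1) / dist, (y2 - y1) / dist
    mx, my = x1 + along * ux, y1 + along * uy
    if half == 0:
        return [(mx, my)]
    return [(mx - half * uy, my + half * ux), (mx + half * uy, my - half * ux)]

A, B = (0.0, 0.0), (5.0, 0.0)        # end-points of the segment c
a, b = 3.0, 4.0                      # distance of C from B and from A
print(circle_intersections(A, b, B, a))
```

The two points returned are mirror images of each other in the line AB, which is the ambiguity mentioned in the text.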


7.10-2 Geometric locus: perpendicular 
bisector 




7.10-1 Geometric locus: position 
of the point C 




7.10-3 Geometric locus: pair of 
parallels and mid-parallel 


Certain elementary loci. 


1. The locus of points equidistant from two given points is the perpendicular bisector of the segment
between those points (Fig.). 

2. The locus of points at a fixed distance from a given line is the pair of parallels at that distance 
from the line (Fig.). 

3. The locus of points equidistant from two parallels is the parallel midway between them (Fig.). 

4. The locus of points equidistant from two intersecting lines is the pair of bisectors of the vertical 
angles at the intersection (Fig.). 

5. The locus of all points at a fixed distance from a given point is the circle with that point as centre 
and the distance as radius. 

6. The locus of all points at which a given segment subtends a fixed angle is a circular arc with 
the segment as a chord (Fig. 7.10-6). 


7.10-4 Geometric locus: mid-parallel 


7.10-5 Geometric 
locus: angle 
bisector 


If |AB| = s and α is the given angle, then the centre of the circle is the apex M of an isosceles
triangle ABM with base AB and an angle 2α at the apex.

These six elementary loci are used to solve many further problems in plane geometry, involving 
loci. 


Applications. Constructions can frequently be reduced to the loci given above, but this may require
several steps. For instance, the set of centres of circles with two given lines l₁ and l₂ as tangents is
the parallel halfway between them if they are parallel (by 3.), or is the pair of angle bisectors w₁
and w₂ if they intersect (by 4.) (see Fig. 7.10-5 and 7.10-7).

New loci result if further conditions are added. For instance, the locus of the centres of all circles
that touch a given straight line l at a point P is the perpendicular to l at P (Fig.), and the
locus of the centres of all circles of radius r that touch a given circle of radius ρ is the circle of
radius r + ρ concentric to the given circle if they touch externally, and the concentric circle of
radius ρ − r if they touch internally and r < ρ (Fig.).


7.10-8 Geometric locus: perpendicular


7.10-9 Geometric locus: concentric circles

Points are frequently constructed by intersecting two loci. For instance, a gear wheel of
radius R can be made to move a rack l at distance a > R from its centre by interposing
a pinion of radius r. The axis of the pinion is obtained as the intersection of a circle of
radius R + r about M, the centre of the gear, and a parallel to the rack l at a distance r (Fig.).
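The pinion position can be computed directly from the two loci. The sketch below assumes a coordinate system that is not in the text: the gear centre M is placed at the origin and the rack runs along the line y = −a; the function name is hypothetical.

```python
import math

def pinion_centres(R, r, a):
    """Centres of a pinion of radius r touching a gear (radius R, centre at the origin)
    and a rack along the line y = -a."""
    y = -(a - r)                      # parallel to the rack at distance r, on the gear side
    disc = (R + r)**2 - y**2          # circle of radius R + r about M
    if disc < 0:
        return []                     # pinion too small to bridge the gap
    x = math.sqrt(disc)
    return [(x, y)] if x == 0 else [(-x, y), (x, y)]

print(pinion_centres(3.0, 2.0, 4.0))
```

Two symmetric positions result in general; an empty list means that r < (a − R)/2, so the pinion cannot touch both the gear and the rack.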


7.10-10 Geometric locus: straight line and circle 


7.11. Planimetric treatment of conic sections 


The ellipse 
The ellipse is the locus of points in the plane for which the sum of their distances from two fixed 
points is a constant 2a. 


The fixed points F₁ and F₂ are called the foci of the ellipse and the distances r₁ and r₂ of P from
F₁ and F₂ its radial distances. The constant 2a must be larger than the distance between the foci.
Then the points of intersection P₁ and P₂ of two circles of radius r₁ and r₂ about F₁ and F₂,
respectively, where r₁ + r₂ = 2a, both lie on the ellipse. If the roles of the foci are interchanged,
the same process yields two further points P₃ and P₄ on the ellipse (Fig.). The points P₁ and P₂
and P₃ and P₄, and indeed the whole ellipse are symmetrical about the line through the foci.
But they are also symmetrical




about the perpendicular bisector of F₁F₂. The intersection C of the two axes of symmetry is a
centre of symmetry of the ellipse and is called its centre. The distance of each focus from the
centre of the ellipse is called the linear eccentricity of the ellipse, |CF₁| = |CF₂| = e. On any
line through the centre of the ellipse the interval inside the ellipse is called a diameter. The
largest diameter is called the major axis and its end-points, the principal vertices V₁ and V₂,
have distance a + e and a − e from the focal points and a from the centre C. The smallest
diameter is called the minor axis and its end-points, the subsidiary vertices W₁ and W₂, have
distance a from both foci. Their distance b from the centre of the ellipse can be computed from
the semi-major axis a and the linear eccentricity e by the theorem of Pythagoras: b² = a² − e².


The shape of the ellipse is determined by any two of the quantities a, b and e, for instance, by the
rectangle of sides 2a and 2b. If b is small, the ellipse is very flat and elongated; in the limit it
degenerates into the interval |F₁F₂| = |V₁V₂| on which every point is counted twice. As b increases
in relation to a, the ellipse becomes more and more like a circle, which is reached for b = a, when
the rectangle is a square. The circle can thus be regarded as an ellipse of linear eccentricity 0
for which a = b = r₁ = r₂ = r.

Thread construction. (This construction is used for laying out elliptical flower beds.) Mark the
foci with two drawing pins (or sticks) and attach a string of length 2a > 2e to them. If a pencil
or a third stick is hooked into the string and moved in such a way that the string stays taut, then it
describes an ellipse, for r₁ + r₂ = 2a always holds (Fig.).
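The defining property r₁ + r₂ = 2a can be turned into a point-by-point construction. The sketch below assumes coordinates not given in the text (foci at (−e, 0) and (e, 0), with e > 0) and uses b² = a² − e² only for the final check; the function name is hypothetical.

```python
import math

def ellipse_points(a, e, r1):
    """Points at distance r1 from F1 = (-e, 0) and 2a - r1 from F2 = (e, 0)."""
    r2 = 2 * a - r1
    x = (r1**2 - r2**2) / (4 * e)     # from (x + e)^2 - (x - e)^2 = r1^2 - r2^2
    y_sq = r1**2 - (x + e)**2
    if y_sq < 0:
        return []                     # the circles do not meet: r1 outside [a - e, a + e]
    y = math.sqrt(y_sq)
    return [(x, y), (x, -y)]

a, e = 5.0, 3.0
b = math.sqrt(a**2 - e**2)
for x, y in ellipse_points(a, e, 4.0):
    print(x, y, x**2 / a**2 + y**2 / b**2)
```

The last printed value should be 1 up to rounding, which is the central equation of the ellipse from analytic geometry.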




7.11-2 Thread construction of the ellipse 


7.11-3 The principle of an ellipse-drawing device 


The ellipse as a perspective-affine image of a circle. The two-circle construction of descriptive
geometry obtains a point P₁ on the ellipse as the intersection of parallels to the axes through the
two points A₁ and B₁ on a common radius of the concentric circles about C of radius a and b,
respectively (Fig.). A parallel to CA₁ through P₁ intersects the axes in M₁ and N₁. The parallelogram
CB₁P₁M₁ shows that |M₁P₁| = b, and the parallelogram CA₁P₁N₁ shows that |P₁N₁| = a. Thus, one
can construct an ellipse in the following fashion. On the edge of a piece of paper mark off P₁, M₁
and N₁ in such a way that |M₁P₁| = b and |N₁P₁| = a. If M₁ and N₁ are kept on two mutually orthogonal
lines and moved to and fro, then P₁ describes an ellipse. This 'paper-strip' construction illustrates
the principle on which most devices for drawing ellipses are based: two points M₁ and N₁ of fixed
distance move on mutually orthogonal lines.
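The paper-strip construction can be simulated in a few lines. This is a sketch under assumed coordinates (axes of the ellipse as coordinate axes, strip inclined at an angle phi to the major axis, a > b); the function name is hypothetical.

```python
import math

def strip_point(a, b, phi):
    """Return (P, M, N): M slides on the x-axis, N on the y-axis, |MP| = b, |NP| = a."""
    m = ((a - b) * math.cos(phi), 0.0)
    n = (0.0, -(a - b) * math.sin(phi))
    ux, uy = math.cos(phi), math.sin(phi)     # unit direction of the strip, from N towards M
    p = (n[0] + a * ux, n[1] + a * uy)        # mark P at distance a from N
    return p, m, n

a, b = 5.0, 3.0
for phi in (0.3, 1.2, 2.5):
    p, m, n = strip_point(a, b, phi)
    print(p, p[0]**2 / a**2 + p[1]**2 / b**2)
```

Each printed quotient should equal 1 up to rounding, so P stays on the ellipse with semiaxes a and b while M and N slide along the axes.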




Two diameters of an ellipse are said to be conjugate if they are the images of two orthogonal
diameters of the circles in the two-circle construction. For instance, CP and CQ are the images
of the orthogonal radii CP₁ ⊥ CQ₁ (Fig.). A rotation of 90° therefore takes △CPP₁ into △CP*Q₁,
that is, P₁ goes to Q₁, P to P*, and P₂ to Q₂. The quadrilateral QQ₁P*Q₂ is a rectangle. The extension
of its diagonal QP* in both directions intersects the axes of the ellipse in X and V, and the extensions
of the sides Q₁Q, P*Q₂, and QQ₂ intersect them in R, S and T, respectively. The figure now contains
right-angled triangles that have two sides of equal length and equal angles. The congruence
△XRQ ≅ △CSQ₂ gives |CQ₂| = |XQ| = b, and the congruence △RCQ₁ ≅ △QTV gives |CQ₁|
= |QV| = a. The Rytz construction relies on a consequence of these equalities, namely the fact that
|UX| = |UC| = |UV|, where U is the mid-point of QP*.

A second paper-strip construction. If the end-points X and V of a segment of length a + b move along
two mutually orthogonal axes, then the point Q on the segment for which |XQ| = b and |QV| = a
describes an ellipse.


7.11-4 The Rytz construction


7.11-5 Approximate construction using circles of 
curvature 


The Rytz construction. If the only data are the centre C and two conjugate semidiameters CP and
CQ of the ellipse, then the axes can be constructed in the following manner. Rotate CP by 90°
about C to obtain CP*. The circle of radius |UC| with centre at U, the mid-point of QP*, intersects
QP* in X and V. Now CX and CV are the axes and |QV| = a and |QX| = b are the lengths of the
semiaxes.
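The steps of the Rytz construction translate directly into coordinates. The sketch below makes two assumptions beyond the text: the rotation of CP is taken in the direction towards CQ, and the near and far intersection with the circle are identified by their distance from Q; the function name is hypothetical.

```python
import math

def rytz(C, P, Q):
    """Return (a, b, X, V): semiaxes and the points X, V with CX the major and CV the minor axis."""
    px, py = P[0] - C[0], P[1] - C[1]
    qx, qy = Q[0] - C[0], Q[1] - C[1]
    sx, sy = -py, px                           # CP rotated by 90 degrees gives CP*
    if sx * qx + sy * qy < 0:                  # rotate towards Q (assumption, see lead-in)
        sx, sy = -sx, -sy
    ux, uy = (qx + sx) / 2, (qy + sy) / 2      # U, mid-point of QP*, relative to C
    rho = math.hypot(ux, uy)                   # |UC|
    dx, dy = sx - qx, sy - qy
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("P* coincides with Q: the ellipse is a circle")
    dx, dy = dx / length, dy / length
    e1 = (ux + rho * dx, uy + rho * dy)
    e2 = (ux - rho * dx, uy - rho * dy)
    d1 = math.hypot(e1[0] - qx, e1[1] - qy)
    d2 = math.hypot(e2[0] - qx, e2[1] - qy)
    # V is the intersection farther from Q (|QV| = a), X the nearer one (|QX| = b).
    (v, a), (x, b) = ((e1, d1), (e2, d2)) if d1 >= d2 else ((e2, d2), (e1, d1))
    X = (x[0] + C[0], x[1] + C[1])
    V = (v[0] + C[0], v[1] + C[1])
    return a, b, X, V

# Conjugate semidiameters of the ellipse with a = 5, b = 3 in axis position
phi = 0.7
P = (5 * math.cos(phi), 3 * math.sin(phi))
Q = (-5 * math.sin(phi), 3 * math.cos(phi))
print(rytz((0.0, 0.0), P, Q))
```

In this example X lies on the x-axis and V on the y-axis, so the recovered axes coincide with the coordinate axes used to generate P and Q.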

Principal radii of curvature. It is frequently sufficiently accurate to approximate an ellipse by
circles whose radii are the radii of curvature at the end-points of the major and minor axes (Fig.).

Their centres M_A and M_B and radii r = |M_B B| and ρ = |M_A A| can be constructed using only
very few auxiliary lines (they can also easily be calculated by the methods of analytic geometry
if the axes of the coordinate system are chosen as the axes of the ellipse). Draw the rectangle
with vertices M(0, 0); A(a, 0); R(a, b); and B(0, b) and its diagonal AB (y = −(b/a)x + b).
The perpendicular to this line through R (y = (a/b)x − (a² − b²)/b) intersects it in
S (x_S = a³/(a² + b²), y_S = b³/(a² + b²)), where |AS| = b²/√(a² + b²) and
|SB| = a²/√(a² + b²), and the axes in the required centres M_A and M_B.


7.11-6 Tangents to an ellipse




The radii are calculated by similar triangles. The lines RSM_A and ASB are intersected by the
parallels RB ∥ AM and BM_B ∥ RA. Thus, ρ : a = |AS| : |SB| = b² : a², or ρ = b²/a, and
r : b = |SB| : |SA| = a² : b², or r = a²/b.
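The centres and radii of the two vertex circles follow from a and b alone. The sketch below assumes the coordinate system of the construction (M at the origin, A on the x-axis, B on the y-axis); the function name is hypothetical.

```python
def curvature_circles(a, b):
    """Centres and radii of the circles of curvature at the vertices A(a, 0) and B(0, b)."""
    rho = b**2 / a            # radius at the principal vertex A
    r = a**2 / b              # radius at the subsidiary vertex B
    m_a = (a - rho, 0.0)      # centre on the major axis
    m_b = (0.0, b - r)        # centre on the minor axis (below M when a > b)
    return m_a, rho, m_b, r

a, b = 5.0, 3.0
m_a, rho, m_b, r = curvature_circles(a, b)
# Both centres should lie on the perpendicular through R: y = (a/b) x - (a^2 - b^2)/b
for x, y in (m_a, m_b):
    print((a / b) * x - (a**2 - b**2) / b - y)
```

Both printed residuals should vanish up to rounding, confirming that the perpendicular from R to AB meets the axes in the two centres.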

Tangents to an ellipse. If P₁ is an arbitrary point on an ellipse with foci F₁ and F₂ (Fig.), then
the continuation of the radius |F₂P₁| = r₂ to L₁ with |P₁L₁| = |P₁F₁| = r₁ gives a point L₁ whose
distance from F₂ is always equal to r₁ + r₂ = 2a. The circle of radius 2a about F₂ is sometimes
called the leading circle. The perpendicular bisector P₁N₁ of the segment F₁L₁ bisects the angle at
the apex of the isosceles triangle F₁P₁L₁ and is the tangent t₁ to the ellipse in P₁, since for every
other point Q₁ on this line the triangle F₂Q₁L₁ gives |F₂Q₁| + |Q₁L₁| > |F₂L₁| = 2a, and therefore
Q₁ is not on the ellipse. This also provides a new definition for the ellipse.


The ellipse is the locus of points in the plane that are centres of circles touching internally a given
circle, the leading circle (of radius 2a and centre F₂), and passing through a fixed interior point F₁.

From the figure it can be seen that the radii r₁ and r₂ intersect the tangent t₁ in equal angles. Hence,
sound or light waves emitted from one focus of the ellipse are reflected to the other focus. In elliptical
whispering galleries soft sounds made at F₁ can be clearly heard at F₂, but nowhere else.

If C is the centre of the ellipse, then N₁C bisects the sides F₁L₁ and F₁F₂ of the triangle F₁F₂L₁
and is hence parallel to the third side and half as long. Hence all the points N₁ that are the
intersections of tangents with the perpendiculars from F₁ onto them lie on the circle of radius a about C.


The feet of perpendiculars from the foci of an ellipse to its tangents lie on the circumscribed circle 
whose radius is the semi-major axis a. 
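This theorem is easy to test numerically. The sketch below assumes the parametric form P = (a cos φ, b sin φ) and the tangent equation x cos φ / a + y sin φ / b = 1 from analytic geometry, neither of which appears in this section; the function name is hypothetical.

```python
import math

def tangent_foot(a, b, phi, focus):
    """Foot of the perpendicular from a focus to the tangent at (a cos phi, b sin phi)."""
    nx, ny = math.cos(phi) / a, math.sin(phi) / b      # tangent: nx * x + ny * y = 1
    t = (1 - (nx * focus[0] + ny * focus[1])) / (nx * nx + ny * ny)
    return focus[0] + t * nx, focus[1] + t * ny

a, b = 5.0, 3.0
e = math.sqrt(a**2 - b**2)
for phi in (0.2, 1.0, 2.4, 4.0):
    foot = tangent_foot(a, b, phi, (e, 0.0))
    print(math.hypot(*foot))      # distance of the foot from the centre C
```

Every printed distance should equal a up to rounding.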


If one regards the ellipse as the affine image of the circumscribed circle (Fig. 7.11-6), then P₁
is the image of the point A₁ on that circle for which A₁P₁ is perpendicular to the major axis. The major
axis is called the axis of affinity, and the tangent t′₁ to the circle at A₁ intersects the tangent t₁ to
the ellipse in P₁ in a point T₁ on the extended major axis. This point, found from t′₁, can be used to
construct t₁.


The area of an ellipse. An affine map x = x′, y = (b/a) y′ transforms the circle of radius a into
an ellipse with semi-major axis a and semi-minor axis b. The area formula πa² of the circle
becomes πab.

Area of ellipse: A = πab

The hyperbola 


The hyperbola is the locus of points in the plane for which the difference between their distances 
from two fixed points is a constant 2a. 


The given points F₁ and F₂ are the foci of the hyperbola. The intervals r₁ and r₂ from a point
on the hyperbola to the foci are the radial distances of that point (Fig.). The constant 2a must be
smaller than the distance between the foci. If two arcs of radii r₁ and r₂ with r₁ − r₂ = ±2a are
drawn about F₁ and F₂, respectively, their intersections P₁ and P₂ are points of the hyperbola.
Interchange of the radii gives two further points P₃ and P₄. The line through F₁ and F₂ is an axis
of symmetry for P₁ and P₂ and also P₃ and P₄, and indeed for the whole hyperbola. The perpendicular
bisector of F₁F₂ is also an axis of symmetry. The intersection C of the two axes is a centre of
symmetry of the hyperbola and is called its centre. The distance of the foci from the centre is called
the focal distance or linear eccentricity e = |CF₁| = |CF₂|. The intersections of the hyperbola with
its principal axis F₁F₂ are called its vertices V₁ and V₂; their distance from the centre is a and from
the foci e − a and e + a, respectively.


It is shown in analytic geometry that the hyperbola has two asymptotes, which intersect
the perpendiculars at V₁ and V₂ to the principal axis at a distance b from V₁ and V₂, respectively.
Then b can be computed by the formula

Hyperbola: e² = a² + b²

The hyperbola lies completely between the arms of two vertically opposite angles of the asymptotes.

For the limiting case b = 0, e = a, the two half-lines on the principal axis outside |V₁V₂| = |F₁F₂|
each counted with multiplicity 2 form a degenerate hyperbola. On the other hand as b becomes large,
the curvature of the hyperbola decreases; in the limit b → ∞ the two perpendiculars to the principal
axis in V₁ and V₂ are taken to be the degenerate hyperbola. If a = b, the asymptotes are
perpendicular to one another and the hyperbola is called equilateral.


7.11-7 Hyperbola with r₁ − r₂ = 2a
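Points of the hyperbola can be generated from the two arcs exactly as described. The sketch below assumes coordinates not given in the text (foci at (−e, 0) and (e, 0)) and checks the result against the central equation x²/a² − y²/b² = 1 with b² = e² − a²; the function name is hypothetical.

```python
import math

def hyperbola_points(a, e, r2):
    """Points with r1 - r2 = 2a, r1 measured from F1 = (-e, 0) and r2 from F2 = (e, 0)."""
    r1 = r2 + 2 * a
    x = (r1**2 - r2**2) / (4 * e)
    y_sq = r2**2 - (x - e)**2
    if y_sq < 0:
        return []                 # arcs do not meet: r2 is smaller than e - a
    y = math.sqrt(y_sq)
    return [(x, y), (x, -y)]

a, e = 3.0, 5.0
b = math.sqrt(e**2 - a**2)
for x, y in hyperbola_points(a, e, 4.0):
    print(x, y, x**2 / a**2 - y**2 / b**2)
```

Choosing r₁ − r₂ = −2a instead would give the mirror-image points on the other branch.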


Thread construction. Suppose that the foci F₁ and F₂ of the hyperbola and the segment 2a are
given. A ruler of length l is fixed at one end to F₁ in such a way that it is free to rotate. A thread
of length k = l − 2a is attached to the other end of the ruler and its free end is fastened at F₂.
A pencil is hooked behind the thread and pressed against the ruler in such a way that the thread is
taut. If the ruler is now rotated and the pencil moved so that it stays against the ruler and the
thread stays taut, the pencil describes an arc of a hyperbola (Fig.). This follows from the relations


l₁ + r₁ = l, l₁ + r₂ = k and
r₁ − r₂ = l − k = 2a,

where l₁ is the length of the ruler between the pencil and its free end.


Tangents to a hyperbola. If P₁ is an arbitrary point on the hyperbola with foci F₁ and F₂ and the
radius |P₁F₂| = r₂ is shortened by a segment equal to |P₁F₁| = r₁, then the point of division L₁ has
constant distance |F₂L₁| = 2a from F₂. The perpendicular bisector P₁N₁ of F₁L₁ bisects the angle at
the apex P₁ of the isosceles triangle F₁L₁P₁ and is the tangent t₁ to the hyperbola in P₁, because for
every other point Q₁ on the line an inspection of the triangle F₂Q₁L₁ shows that |F₂Q₁| − |Q₁L₁| is less
than |F₂L₁| = 2a, so that none of these points can lie on the hyperbola. This also provides a new
definition for the hyperbola.


7.11-8 Thread construction of the hyperbola


The hyperbola is the locus of the centres of all circles touching externally a given circle, the leading
circle (about F₂ of radius 2a), and passing through a given exterior point F₁.

From the figure it can be seen that:

A tangent to the hyperbola at a point P₁ bisects the angle between the radii through that point.

If C is the centre of the hyperbola, then the line CN₁ bisects the sides F₁F₂ and F₁L₁ of the triangle
F₁F₂L₁ and hence is parallel to the third side and half as long. Therefore the feet of the perpendiculars
from a focus to the tangents of a hyperbola lie on a circle about C of radius a.


The feet of perpendiculars from the foci to the tangents of a hyperbola lie on a circle touching the 
hyperbola in its vertices and having the same centre C as the hyperbola. 


7.11-9 Tangents to a hyperbola

7.11-10 Parabola




The parabola 
The parabola is the locus of points in the plane that are equidistant from a given point and a given line. 


The point is called the focus F and the line is called the directrix. The distance from the focus to 
the directrix is the semi-parameter p of the parabola (Fig.). Every parallel to the directrix at a distance 
d greater than p/2 is intersected by a circle of radius d about F in two points P₁ and P₂ on the
parabola. The points, and hence the parabola, are symmetrical about the perpendicular from the focus
to the directrix. This line is called the axis of the parabola and intersects it in its vertex V, whose 
distance from the focus and the directrix is p/2. It is clear from the definition of the parabola that 
the chord through the focus parallel to the directrix has length 2p. 
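The point-by-point construction of the parabola can be sketched as follows. The coordinates are an assumption not made in the text: the directrix is the line x = −p/2 and the focus is F = (p/2, 0), so that the parabola has the vertex equation y² = 2px; the function name is hypothetical.

```python
import math

def parabola_points(p, d):
    """Points at distance d from both the directrix x = -p/2 and the focus (p/2, 0)."""
    x = d - p / 2                       # parallel to the directrix at distance d
    y_sq = d**2 - (x - p / 2)**2        # circle of radius d about the focus
    if y_sq < 0:
        return []                       # d smaller than p/2: no intersection
    y = math.sqrt(y_sq)
    return [(x, y)] if y == 0 else [(x, y), (x, -y)]

p = 2.0
print(parabola_points(p, 3.0))
print(parabola_points(p, p))            # end-points of the chord through the focus
```

For d = p the two points are the ends of the chord through the focus parallel to the directrix, whose length is 2p.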


Thread construction. The parabola also has a construction using a thread. A set square (or ruler) 
with its shortest side AB on the directrix is free to move up and down; a thread of length |BC| is 
attached to C and its free end is fastened at the focus. If the thread is pushed against the set square 
with a pencil and held taut, and the set square is moved up and down, the pencil describes an arc of a 
parabola, because its distance from the directrix is always equal to its distance from the focus (Fig.). 


7.11-11 Thread 
construction of the 
parabola 


7.11-12 Tangents to 
a parabola 


Tangents to a parabola. If P₁ is a point on the parabola, its distance |P₁L₁| from the directrix l
is equal to |P₁F|. The triangle FL₁P₁ is hence isosceles, and the perpendicular bisector P₁N₁ of FL₁
bisects the angle L₁P₁F and is the tangent t₁ to the parabola in P₁. For if Q₁ is any other point on
t₁, the distance |Q₁Q₁′| from Q₁ to the directrix is less than |Q₁L₁| = |Q₁F| (Fig.). This again
provides a new definition for the parabola.

The parabola is the locus of the centres P₁ of all circles
having a given line l (the directrix) as tangent and passing
through a given point F.

From the figure it can be seen that the angle between the tangent at P₁ and the radius P₁F is the
same as the angle between the tangent and the parallel to the axis through P₁. All rays from the
focus of the parabola are reflected in such a way that they become parallel to the axis; vice versa,
parallels to the axis are reflected into the focus.

Since the point N₁ lies on the vertical tangent t_V through the
vertex V and this is perpendicular to the axis, it follows that:

The parabola is the envelope of the free arms of all right 
angles whose vertices are on the vertical tangent and whose 
fixed arms go through the focus. 


If a tangent t₂ is perpendicular to t₁ and they intersect in T, then FN₁TN₂ is a rectangle. Its
diagonal N₁N₂ is a segment of the vertical tangent t_V and so it must be parallel to the directrix.
The triangles N₁N₂F and N₁N₂T are congruent and therefore have the same altitude on the base N₁N₂.
Therefore T has the distance p/2 from the vertical tangent and lies on the directrix (Fig.).


Pairs of orthogonal tangents to a parabola intersect on the directrix.


7.11-13 A pair of orthogonal tangents to a parabola
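The theorem can be checked with the slope form of the tangent. The sketch below assumes the vertex equation y² = 2px and the tangent y = mx + p/(2m) of slope m from analytic geometry, neither of which is derived in this section; the function name is hypothetical.

```python
def tangent_intersection(p, m):
    """Intersection of the tangents of slope m and -1/m to the parabola y^2 = 2 p x."""
    c1 = p / (2 * m)              # first tangent:  y = m x + c1
    m2 = -1 / m
    c2 = p / (2 * m2)             # second tangent: y = m2 x + c2
    x = (c2 - c1) / (m - m2)
    return x, m * x + c1

p = 2.0
for m in (0.5, 2.0, -3.0):
    print(tangent_intersection(p, m))   # the x-coordinate should always be -p/2
```

The x-coordinate −p/2 is the position of the directrix in this coordinate system, whatever the slope m.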


8. Solid geometry 


8.1. Fundamental concepts
     Lines and planes in space
     Solids
     Units of measurement
8.2. Cube and cuboid
     Surface area
     Volume
     Special relations
8.3. Prism and cylinder
     General
     Surface area
     Cavalieri's principle
8.4. Pyramid and cone
     General
     Surface area
     Volume
     Frustum of a pyramid and cone
8.5. Polyhedra
     Euler's polyhedron theorem
     Regular polyhedra
     Crystals
8.6. Sphere
     General
     Volume
     Surface area
8.7. Further solids


Solid geometry is a branch of Euclidean geometry. Its subject matter consists of the form, relative 
position, size, and other metric properties of geometric figures that do not lie in one plane. As the 
geometry of three-dimensional space, solid geometry gives a deep insight into the spatial properties 
of objective reality. 

In certain parts of solid geometry a restriction to a single plane is possible. It therefore has strong 
connections with plane geometry. Furthermore, in solid geometry the methods of descriptive geometry 
are often used. Finally, in numerical solid geometry, arithmetical and algebraic operations are applied. 


8.1. Fundamental concepts 


Lines and planes in space 


Points, lines, and planes are the foundation stones of elementary geometry in three-dimensional 
space; in particular, the bounding surfaces of geometric solids are often parts of planes. The in- 
tuitive interpretation of the fundamental concepts of point and line, given in plane geometry, must 
be supplemented by the fundamental concept of plane and the relative position of lines and planes 
in space. 


The plane. The family of lines through a fixed point A that intersect a line l₁ not passing
through A, or are parallel to l₁, form a plane (Fig.). A plane in space can also be regarded as
generated by a line l that is given a parallel displacement along a line l₁ intersecting l. Hence
the position of a plane in space is uniquely determined by the following subsets: 1. a line l₁
and a point A not lying on l₁; 2. two intersecting lines l and l₁; 3. two parallel lines; 4. three
points not lying on one line, for example, A and two points that fix the position of l₁; 5. a point A
and a vector (the normal vector n of the plane).


8.1-1 Formation of a plane in space 


Relative position of a line and a plane in space. A line l lies entirely in a plane E if it has two
points A and B in common with it; it is parallel to E if it has no point in common with E or lies
entirely in E. A line l cuts the plane if it has exactly one point in common with it, the point of
intersection L. It is perpendicular to the plane at L if it is perpendicular to two distinct lines
l₁ and l₂ of E through L. If a line l that cuts E at L is projected perpendicular to E, the normal
projection l′ of l on E is obtained. The angle of inclination α of l to E is defined by
α = ∠(l, l′) (Fig.); if l ⊥ E, then α = 90° and if l ∥ E, then α = 0°.
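In coordinates the angle of inclination is usually computed from a direction vector of the line and a normal vector of the plane. This vector formulation is an assumption brought in from analytic geometry and is not the definition used above, although it yields the same angle; the function name is hypothetical.

```python
import math

def inclination_deg(direction, normal):
    """Angle between a line (direction vector) and a plane (normal vector), in degrees."""
    dot = sum(d * n for d, n in zip(direction, normal))
    len_d = math.sqrt(sum(d * d for d in direction))
    len_n = math.sqrt(sum(n * n for n in normal))
    sine = min(1.0, abs(dot) / (len_d * len_n))   # sin(alpha) = |cos| of the angle to the normal
    return math.degrees(math.asin(sine))

print(inclination_deg((1.0, 0.0, 1.0), (0.0, 0.0, 1.0)))
```

A line along the normal gives 90° and a line lying in the plane gives 0°, matching the two limiting cases in the text.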


Relative position of lines in space. If two lines l₁ and l₂ are parallel or intersect in a point,
then a plane can be drawn through them, and the distance or angle between them can be determined
by the methods of plane geometry.


8.1-2 Angle of inclination of a line l to a plane E




Two skew lines l₁ and l₂ (Fig.) are not parallel and have no common point. Their angle of
inclination α is defined as the angle between one of the lines and the line through one of its points
parallel to the other, for example, l₃ ∥ l₂ through N on l₁. The line n₁₂ perpendicular to l₁ and l₃
at N is perpendicular to l₂. The plane E spanned by l₁ and n₁₂ cuts l₂ in D₂, and the line through
D₂ parallel to n₁₂ cuts l₁ at D₁. The line D₁D₂ is the common perpendicular of l₁ and l₂, and the
length d of D₁D₂ is the distance between the skew lines. It is the shortest distance between any point
of l₁ and any point of l₂.
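With vectors, the direction of the common perpendicular is the cross product of the two direction vectors, and d is the projection of any connecting vector onto it. The sketch below relies on this standard vector formulation, which is an assumption not made in the text; all names are hypothetical.

```python
import math

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def skew_distance(p1, d1, p2, d2):
    """Distance between the lines p1 + s*d1 and p2 + t*d2, which must not be parallel."""
    n = cross(d1, d2)                       # direction of the common perpendicular
    norm = math.sqrt(dot(n, n))
    if norm == 0:
        raise ValueError("lines are parallel")
    w = tuple(b - a for a, b in zip(p1, p2))
    return abs(dot(w, n)) / norm

def skew_angle_deg(d1, d2):
    """Angle of inclination between two lines given by direction vectors, in degrees."""
    cosine = abs(dot(d1, d2)) / math.sqrt(dot(d1, d1) * dot(d2, d2))
    return math.degrees(math.acos(min(1.0, cosine)))

print(skew_distance((0, 0, 0), (1, 0, 0), (0, 0, 2), (0, 1, 0)))
print(skew_angle_deg((1, 0, 0), (0, 1, 0)))
```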


8.1-4 One-sheet 
hyperboloid of 
rotation 


8.1-3 Distance between skew lines 


For any line a in space there are arbitrarily many lines l that make a given angle α with a and have
a given distance d from a. If the choice is restricted to those lines l whose common perpendicular
with a lies in a plane E perpendicular to a, two families of lines are obtained. They are the generators
of a one-sheeted hyperboloid of rotation with a as axis of rotation (Fig.). The plane E cuts the
hyperboloid in its smallest circle and the axis in the centre of this circle. Furthermore, each line of one
system of generators cuts each line of the other system (except the one that is parallel to it), while
any two lines of one system are skew. In particular, if α = 0, then the surface is a cylinder of rotation,
which has just one system of generators, parallel to one another.

The set of those lines l that cut a fixed line a at a point Z of a and at a fixed angle α generates a
cone of rotation. Z is the vertex, a the axis, and the lines l are the generators of the cone.

The set of lines through a point P in space form a bundle of lines. An arbitrary plane through P 
cuts this bundle in a pencil of lines. If P is a point at infinity (improper point), then the corresponding 
systems of lines are a parallel bundle of lines in space and a parallel pencil of lines in a plane (see 
Chapter 25.). 


Relative position of planes in space. Two planes in space have at most one line in common if they 
do not coincide. Two planes that have no common point or coincide are said to be parallel. 

The set of planes that contain a line s form a pencil of planes, whose carrier is s (Fig.). A plane Π
perpendicular to s cuts the planes of the pencil in lines. The angle of intersection of lines in Π is
by definition equal to the angle of intersection of the corresponding planes, for example,
β = ∠(l₁, l₂) = ∠(E₁, E₂). Hence the angle of intersection of two planes is reduced to the angle
of intersection of two lines.
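In coordinates the angle of intersection of two planes is commonly obtained from their normal vectors, because the normals make the same angle as the lines cut out in a plane perpendicular to s. This use of normals is an assumption from analytic geometry and the function name is hypothetical.

```python
import math

def plane_angle_deg(n1, n2):
    """Angle of intersection of two planes with normal vectors n1 and n2, in degrees (0 to 90)."""
    dot = sum(x * y for x, y in zip(n1, n2))
    len1 = math.sqrt(sum(x * x for x in n1))
    len2 = math.sqrt(sum(x * x for x in n2))
    cosine = min(1.0, abs(dot) / (len1 * len2))
    return math.degrees(math.acos(cosine))

print(plane_angle_deg((0.0, 0.0, 1.0), (0.0, 1.0, 1.0)))
```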


8.1-5 Pencil of planes with 
a line s as carrier 


8.1-6 Parallelism of the lines
of intersection c₁₂, c₁₃, c₂₃
of three planes E₁, E₂, E₃


If β = 90°, the planes are perpendicular. A plane E₂ parallel to the line of intersection c₁₃ of
two planes E₁ and E₃ cuts these in parallel lines, c₁₂ ∥ c₁₃ ∥ c₂₃ (Fig.). A parallel pencil of planes
consists of planes such that none of them has a point in common with any other (Fig.). A line
perpendicular to one plane of a parallel pencil is perpendicular to each of the planes. The segments
cut on the perpendicular line determine the distances of the corresponding planes from one another;
for example, |P₂P₃| is the distance between the planes E₂ and E₃.




8.1-7 Pencil of parallel 
planes 


8.1-8 Bundle of planes 
and solid angle 


A bundle of planes is represented by the set of all planes of space that have a point S in common.
S is the carrier of the bundle. Any two of these planes intersect in a line that passes through S,
for example, E₁ and E₂ in c₁₂ (Fig.). Three planes E₁, E₂, E₃ of a bundle whose lines of intersection
c₁₂, c₁₃, c₂₃ have only the carrier S in common divide the space into eight three-edged or
three-faced solid angles with the common vertex S. The vertex divides each line of intersection into two
half-lines, and any three half-lines of different lines form the edges of a solid angle.

The angle between two edges, the so-called edge-angle, is measured in the plane spanned by these
edges, for example, α₁ = ∠(c₁₂, c₁₃). The face-angle is measured in a plane perpendicular to the line
of intersection of the two planes, for example, β₁ = ∠(l₂, l₃) between E₂ and E₃, where l₂ ⊥ c₂₃
and l₃ ⊥ c₂₃ (Fig.). In general, the edge-angles α and the face-angles β are different. Three-edged
solid angles whose edge-angles and face-angles are all right angles occur at the vertices of a cuboid.
Solid angles of this kind occur in descriptive geometry: they are formed by the three planes of
projection.

If perpendiculars are drawn to the faces of a solid angle from an interior point, another solid 
angle is formed with the point as vertex and the perpendiculars as edges. This new angle is called 
the polar angle of the original angle, which can, in turn, be regarded as the polar angle of the new 
angle (see Chapter 12.). 

Every solid angle is the polar angle of its polar angle. 

The following theorems hold for solid angles: 


1. The sum of all the edge-angles of an n-edged solid angle is less than 360°.
2. The sum of all the face-angles of an n-edged solid angle is greater than $n \cdot 180^\circ - 360^\circ$ and less
than $n \cdot 180^\circ$.
3. An edge-angle of the polar angle and the corresponding face-angle of the original angle add up
to 180°.
4. A face-angle of the polar angle and the corresponding edge-angle of the original angle likewise
add up to 180°.

Solids

Fundamental concepts. A solid in the sense of solid geometry is the set of all points, lines and 
planes of three-dimensional space that lie inside a completely closed part of the space, that is, inside
the bounding surfaces of the solid, including those points, lines and planes that belong to the bounding 
surfaces. The sum of the areas of the bounding surfaces is called the surface area, and the measure 
of that part of space completely enclosed by it is called the volume of the solid. 

If a solid is bounded entirely by planes, it is called a planar body or polyhedron (Greek polys, 
many, hedron, face); for example, the cube, cuboid, prism, pyramid. The polygons that bound a 
polyhedron are called faces. The segments in which two faces come together are called edges, and 
their end-points the vertices of the solid. The angle between two half-planes that meet at an edge 
is the face-angle between the two faces. In a wider sense one speaks of edges even of a curved solid, 
bounded wholly or partly by curved surfaces, when two of its surfaces meet at an angle along a curve. 
The angle can be measured between the perpendiculars to the two tangent planes at the point 
concerned; it is agreed that these perpendiculars are to lie in the half-spaces containing the solids 
(assumed to be convex). 

If a plane is regarded as a surface of zero curvature coinciding with its tangent planes, then the 
angle between the curved surface of a right circular cylinder and its base at each point of the base 
circle is 90°; for a circular cone, this is the angle of elevation. Examples of curved surfaces without 
edges are the sphere, ellipsoid, and torus. 

The surface area of a solid can be determined in principle as the sum of the areas of the individual 
bounding surfaces. By deriving certain formulae, the process of summing the partial areas, which 
may be troublesome in practice, can sometimes be avoided. 

The volume of a solid can be determined with the help of the following tables of cubic content and 
capacity. 




Units of measurement 


Cubic content. The cubic metre (denoted by m³) is the volume of a cube of edge-length 1 m.
Larger and smaller units of cubic content are derived from the cubic metre.

In shipping various units for measuring content are in use. Weight is measured in long tons of 
2240 lb. The displacement tonnage is equal (by Archimedes’ principle) to the weight of water dis- 
placed by a floating vessel. In cargo vessels the lightweight displacement tonnage corresponds to the 
weight of the hull, machinery, equipment, plus the weight of the crew and their effects; the full 
displacement tonnage takes account of the additional maximum weight of bunkers and cargo, and 
the difference between the two is the deadweight tonnage. 

Harbour and canal dues are based on the volumetric ton, which measures the capacity of the
enclosed space, with one ton equal to 100 cu. ft. or 2.83 m³. Finally, for freight rates on cargo the
freight ton corresponds to only 40 cu. ft. or 1.13 m³.


Capacity. Whereas in scientific measurements capacity is expressed in the units of cubic content,
in everyday life it is expressed in litres. The litre, denoted by l, is defined by 1 l = 1 dm³. The multiple
1 hl = 100 l and the parts of one litre 1 cl = 0.01 l and 1 ml = 0.001 l are also in use.

In the Anglo-Saxon countries cubic contents and capacities are still based on the cubic yard 
(see table). 


                    | cubic yard | ft³     | in³    | gallon (Imp.) | gallon (USA) | l = dm³ | m³
1 cubic yard        | 1          | 27      | 46 656 | 168.2         | 201.97       | 764.553 | 0.765
1 cubic foot        | 0.037      | 1       | 1728   | 6.229         | 7.481        | 28.32   | 0.028
1 cubic inch        | 0.000 02   | 0.00058 | 1      | 0.003 6       | 0.004329     | 0.0164  | 0.000 016
1 gallon (Imperial) | 0.005 945  | 0.1605  | 277.42 | 1             | ≈ 6/5        | 4.546   | 0.00455
1 gallon (USA)      | 0.004 951  | 0.1337  | 231    | ≈ 5/6         | 1            | 3.785   | 0.003 79
1 l = 1 dm³         | 0.001 31   | 0.0353  | 61.02  | 0.22          | 0.264        | 1       | 0.001
1 m³                | 1.308      | 35.314  | 61 020 | 220           | 264.2        | 1000    | 1
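The table can be reproduced by converting every unit through a single common unit. The following is a minimal sketch, assuming the standard values of each unit in litres (the dictionary keys, the factor values to six figures and the function name `convert` are choices made here, not taken from the text).

```python
# Litres per unit. These are standard conversion factors, assumed for this sketch.
LITRES_PER_UNIT = {
    "cubic_yard": 764.555,
    "cubic_foot": 28.3168,
    "cubic_inch": 0.0163871,
    "gallon_imp": 4.54609,
    "gallon_us": 3.78541,
    "litre": 1.0,
    "cubic_metre": 1000.0,
}


def convert(value, from_unit, to_unit):
    """Convert a volume by passing through litres as the common unit."""
    return value * LITRES_PER_UNIT[from_unit] / LITRES_PER_UNIT[to_unit]


# One row of the table: 1 cubic yard expressed in every other unit.
for unit in LITRES_PER_UNIT:
    print(unit, round(convert(1, "cubic_yard", unit), 3))
```

Each table entry is then a ratio of two factors, so a row and its mirror-image column are automatically reciprocal.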


8.2. Cube and cuboid 


Cubes and cuboids are polyhedra. The cube has eight rectangular solid angles, twelve edges of 
equal length and is bounded by six equal squares. 


The cuboid has, like the cube, eight rectangular solid angles,
and twelve edges, equal and parallel in fours. It is bounded by
three pairs of congruent rectangles lying in parallel planes.
The cube is a special case of the cuboid (Fig.).


8.2-1 The cube is a special case of a cuboid

Surface area

If a model of the surface of a polyhedron is cut along suffi- 
ciently many edges, it can be placed in one plane to form a connected system of bounding surfaces. 
This is called a net of the polyhedron. 

Conversely, the net of the polyhedron can be bent along certain edges and stuck together to form
the model of the polyhedron. 

The net of the cube consists of a connected system of six equal squares (Fig.). There are different
ways of arranging the squares, one of which is shown in the figure.

8.2-2 Two nets of a cube

The net of the cuboid consists of a connected system of three pairs of congruent rectangles. Here
too there are different ways of arranging them. The figure shows two cuboids; the right-hand one has
one pair of square faces; the remaining four faces are then congruent.

If the lengths of the edges of the cuboid are a, b, c, then the areas of the three rectangles
are ab, ac and bc, so that the surface area S is given by

$S = 2ab + 2ac + 2bc = 2(ab + ac + bc)$.


A cuboid with one pair of square faces ($a = c$) has surface area $S = 2a^2 + 4ab$. Finally, for a
cube, where $a = b = c$, $S = 6a^2$.


Volume 


In plane geometry the measure of the area of a figure, for example, a square or rectangle, is 
defined by covering the figure with unit squares. Similarly, the measure of a volume, for example, 
of a cube or a cuboid, can be defined by filling the space with unit cubes. 

The volume of a cube with edge-length a (say 10 units) can be completely filled in this way (Fig.).

There are a (= 10) layers, each having a (= 10) rows of a (= 10) unit cubes, and so altogether
$a \cdot a \cdot a = a^3$ ($10 \times 10 \times 10 = 1000$) unit cubes.


The volume of a cuboid with edge-lengths a, b, c can be filled by c layers each containing b rows
of a unit cubes. The volume therefore amounts to $a \cdot b \cdot c$ unit cubes (Fig.).

For the special case of a cuboid with one pair of square faces ($a = c$), $V = a^2 b$.


8.2-5 Volume of the decimetre cube
8.2-6 Volume of a cuboid


Example: What is the volume of a brick of average size 22.0 × 11.6 × 7.0 cm? Here $a = 22.0$ cm,
$b = 11.6$ cm, $c = 7.0$ cm, so $V = abc = 22.0\,\mathrm{cm} \times 11.6\,\mathrm{cm} \times 7.0\,\mathrm{cm} = 1786.4\,\mathrm{cm}^3$.
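The surface-area and volume formulae for the cuboid are easy to check numerically. The sketch below applies them to the brick of the example; the function names are illustrative only.

```python
def cuboid_surface_area(a, b, c):
    """S = 2(ab + ac + bc): three pairs of congruent rectangles."""
    return 2 * (a * b + a * c + b * c)


def cuboid_volume(a, b, c):
    """V = abc: c layers, each of b rows of a unit cubes."""
    return a * b * c


brick = (22.0, 11.6, 7.0)  # edge-lengths in cm
print("volume in cm^3:", round(cuboid_volume(*brick), 1))
print("surface area in cm^2:", round(cuboid_surface_area(*brick), 1))
```

Setting a = b = c in both functions gives the cube formulae S = 6a² and V = a³.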




These formulae also hold if the length of one or more edges is not an integral multiple of the
edge-length e of the unit cube. If the edge-lengths a, b, c are rational multiples of e, for example,
$a = (p_1/q_1)e$, $b = (p_2/q_2)e$, $c = (p_3/q_3)e$, the argument holds for a smaller unit cube whose
edge-length e' is the kth part of e, where k is the least common multiple of $q_1$, $q_2$, $q_3$. An irrational
multiple of e can be approximated arbitrarily closely by a sequence of rational numbers (see
Chapter 3.). In general, the calculation of the volume of a solid whose surface can be defined
mathematically is a topic of the integral calculus.
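The counting argument for rational edge-lengths can be carried out exactly with fractions. This is a minimal sketch, assuming the edge-lengths are given as multiples of e; the function name is hypothetical and `math.lcm` with several arguments needs Python 3.9 or later.

```python
from fractions import Fraction
from math import lcm  # accepts several arguments from Python 3.9


def volume_by_counting(a, b, c):
    """Volume in units of e^3 for rational edge-lengths a, b, c (multiples of e).

    The cuboid is filled with small cubes of edge e' = e/k, where k is the
    least common multiple of the denominators, and the cubes are counted.
    """
    k = lcm(a.denominator, b.denominator, c.denominator)
    # Every edge is now a whole number of small edges e'.
    na, nb, nc = a * k, b * k, c * k
    assert na.denominator == nb.denominator == nc.denominator == 1
    small_cubes = int(na) * int(nb) * int(nc)
    # k^3 small cubes fill one unit cube of edge e.
    return Fraction(small_cubes, k**3)


print(volume_by_counting(Fraction(3, 2), Fraction(2, 3), Fraction(5, 4)))
```

The result agrees with the product abc, which is what the text claims for rational multiples.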


Special relations 

Diagonals of a cuboid. One distinguishes between face-diagonals and space-diagonals, according 
as the two non-neighbouring vertices that are joined by the diagonal lie in a face of the solid or not. 
The cuboid has 12 face-diagonals, four each of equal length, and four space-diagonals all of equal 
length. 

The lengths of all the diagonals can be calculated as the hypotenuses of right-angled triangles
by means of the theorem of Pythagoras (Fig.). If a, b, c are the three edge-lengths, then the lengths
$f_1$, $f_2$, $f_3$ of the face-diagonals are given by

$f_1 = \sqrt{a^2 + b^2}$, $f_2 = \sqrt{a^2 + c^2}$, $f_3 = \sqrt{b^2 + c^2}$.


8.2-7 Length of the space diagonals of a cuboid 
8.2-8 Three pairs of diagonal planes of a cuboid 


The length d of the space-diagonals can be calculated as the hypotenuse of a right-angled triangle
whose sides are a face-diagonal and the third edge:
$d = \sqrt{f_1^2 + c^2} = \sqrt{f_2^2 + b^2} = \sqrt{f_3^2 + a^2} = \sqrt{a^2 + b^2 + c^2}$.

The four space-diagonals of the cuboid form six diagonal planes, which cut the cuboid in
rectangles (Fig.). These rectangles are congruent in pairs and are bounded by the face-diagonals and
edges of the cuboid. Their areas $D_1$, $D_2$, $D_3$ are given by:

$D_1 = cf_1 = c\sqrt{a^2 + b^2}$; $D_2 = bf_2 = b\sqrt{a^2 + c^2}$; $D_3 = af_3 = a\sqrt{b^2 + c^2}$.
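The diagonal formulae follow from repeated use of the theorem of Pythagoras and can be collected in one small routine. The sketch below is illustrative; the function name and the return layout are choices made here.

```python
from math import sqrt


def cuboid_diagonals(a, b, c):
    """Return (face-diagonals, space-diagonal, diagonal-section areas)."""
    f1 = sqrt(a * a + b * b)
    f2 = sqrt(a * a + c * c)
    f3 = sqrt(b * b + c * c)
    # Space-diagonal: hypotenuse over a face-diagonal and the third edge.
    d = sqrt(f1 * f1 + c * c)
    # Each diagonal section is a rectangle: face-diagonal times third edge.
    sections = (c * f1, b * f2, a * f3)
    return (f1, f2, f3), d, sections


faces, d, sections = cuboid_diagonals(3, 4, 12)
print(faces, d, sections)
```

For a cube (a = b = c) the three face-diagonals coincide and the routine returns a√2, a√3 and a²√2.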




Diagonals of a cube. Since $a = b = c$, the face-diagonals, space-diagonals and area of the diagonal
sections are given by $f_1 = f_2 = f_3 = f = \sqrt{a^2 + a^2} = a\sqrt{2}$; $d = \sqrt{a^2 + a^2 + a^2} = a\sqrt{3}$ and
$D_1 = D_2 = D_3 = D = af = a\sqrt{a^2 + a^2} = a^2\sqrt{2}$.

The edge-length a, the face-diagonal f and the space-diagonal d are in the ratios
$a : f : d = a : a\sqrt{2} : a\sqrt{3} = \sqrt{1} : \sqrt{2} : \sqrt{3}$.


Centre. The cube and cuboid both have the property that all the space-diagonals meet in a point C
and are bisected there. C is called the centre of the solid and is also the centre of gravity of a uniform
solid of the appropriate type. Finally, C is the common centre of a cube and the circumscribed and
inscribed spheres. The radius of the circumscribed sphere is half the length of a space-diagonal,
that is, $r = \tfrac{1}{2}a\sqrt{3}$; the radius of the inscribed sphere is equal to half the length of an edge,
that is, $\varrho = a/2$.

Section through a cube. A plane section of a cube can be arranged so that 
it is a regular hexagon. Then the centre C of the cube coincides with the 
centre of the hexagon and the vertices of the hexagon with the midpoints of 
six edges of the cube, which can be described in one circuit and are such 
that no three lie in one plane (Fig.). 


8.2-9 Regular hexagon as plane section of a cube 




This regular hexagon consists of six equal equilateral triangles with side-lengths $s = \tfrac{1}{2}a\sqrt{2}$
(half a face-diagonal of the cube) and area $A = \tfrac{1}{4}s^2\sqrt{3}$. The area S of the hexagon is given by
$S = 6A = 6 \times \tfrac{1}{4}s^2\sqrt{3} = \tfrac{3}{2}s^2\sqrt{3} = \tfrac{3}{4}a^2\sqrt{3}$.
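The hexagonal section can be checked with coordinates. The sketch below assumes a cube with one vertex at the origin and edges along the axes, so that the six edge-midpoints lie in the plane x + y + z = 3a/2; this coordinate choice and the function names are assumptions made here, not part of the text.

```python
from math import dist, sqrt


def hexagon_section(a):
    """Side-lengths, centre distances and coordinate sums of the six midpoints."""
    centre = (a / 2, a / 2, a / 2)
    # Midpoints of six cube edges, listed in one circuit.
    vertices = [
        (a, a / 2, 0), (a, 0, a / 2), (a / 2, 0, a),
        (0, a / 2, a), (0, a, a / 2), (a / 2, a, 0),
    ]
    sides = [dist(vertices[i], vertices[(i + 1) % 6]) for i in range(6)]
    radii = [dist(v, centre) for v in vertices]
    plane_sums = [sum(v) for v in vertices]  # all equal 3a/2 if coplanar in that plane
    return sides, radii, plane_sums


def hexagon_area(a):
    """Six equilateral triangles of side s = a*sqrt(2)/2."""
    s = a * sqrt(2) / 2
    return 6 * (s * s * sqrt(3) / 4)


print(hexagon_section(2.0))
print(hexagon_area(2.0))
```

All six sides and all six distances to the centre come out equal to half a face-diagonal, which is what makes the hexagon regular.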


8.3. Prism and cylinder 


General 


Prism. If a line moves in space, without altering its direction, along the perimeter of a plane 
n-gon (n = 3, 4, ...), then it describes a prismatic surface; if it passes through a vertex of the n-gon, 
it is an edge of this surface. 

An n-gon can be interpreted as the section of the prismatic surface by a plane that cuts all its 
edges. If a second plane, parallel to the first, cuts the prismatic surface, then the section is a second 
n-gon congruent to the first; the two sections and the prismatic surface completely enclose a part 
of space. This solid is called a prism (Greek, sawn off), the two n-gons are its bases, or base and top,
and the part of the prismatic surface belonging to the prism is called its surface. The segments of
the edges of the prismatic surface that join corresponding vertices of the bases are called side-edges,
to distinguish them from the base-edges, which correspond to the sides of the bases. An n-sided
prism has n side-edges and 2n base-edges, and therefore 3n edges altogether. All the side-edges are
of equal length and two corresponding base-edges are parallel and of equal length. The lines inside 
the side-faces parallel to the side-edges are called generators of the prism. The height of the prism 
is the distance between the planes of the base and the top. 

If one of the side-edges is perpendicular to one of the bases, then all the side-edges are
perpendicular to both bases. Such a prism is called right, and all others are oblique.

The side-faces of a right prism are rectangles. If the bases of a right prism
are regular n-gons, the prism is also called regular. The side-faces are then
congruent rectangles. The line joining the centres of the bases of a regular
prism (the points where the perpendicular bisectors of the sides meet) is
called the axis (axis of rotation), and a section of the prism by a plane
containing the axis is called an axial section. A skew 4-sided prism with a
parallelogram as base (Fig.) is called a parallelepiped.

8.3-1 Parallelepiped


Cylinder. If a line, the generator, moves in space, without altering its direction, along a curve, 
the guide curve, then it describes a cylindrical surface. A cylinder is a solid bounded by a cylindrical 
surface with a closed guide curve and two planes that are parallel to each other but not to the 
generators. The segments of the generators of the cylindrical surface between the parallel planes are 
also called the generators of the cylinder, and they are of equal length. The part of the cylindrical 
surface between the parallel planes is called the curved surface of the cylinder. The base and the 
top, which are cut on the cylindrical surface by the parallel planes, are congruent. Their perpendicular 
distance is the height of the cylinder. Every cylinder has at least two edges in the wider sense, the 
boundary of the base and the top. If at each point of these edges the angle between the base or the 
top and the curved surface is 90°, the cylinder is called right, otherwise oblique. 

According to the type of base there are different types of cylinder. In particular, if the base is a 
circle, one speaks of a circular cylinder. A right circular cylinder is also called a cylinder of rotation. 

If one thinks of the cylinder as made of solid material, and a smaller cylinder as bored out in 
such a way that the bases of the cylinders are concentric circles, then the remainder is a hollow 
cylinder. Hollow cylinders are often used in technology, for example as gas-holders, boilers, petrol- 
tanks, tar-barrels, and so on. Pipes are very long hollow cylinders; they are used, for example, to 
transport gases (natural gas or steam) or liquids (water or petrol). 


Surface area 


Prism. From a model of the surface of a prism
one can obtain its net (or conversely, the model 
can be built from the net), just as for a cuboid 
and a cube. The figure shows the net of a 6-sided 
regular prism. 


8.3-2 Net of a six-sided regular prism

Cylinder. A model of the surface of a cylinder can be cut along a generator and both bases. The
curved surface of a right circular cylinder, for example, can be developed into a plane, as the figure
shows in three steps. Its net consists of a rectangle with the generator s as height h and the
circumference $2\pi r$ of the base as side, and the two circular bases (Fig.). The developed surface of an oblique
circular cylinder cut by two parallel planes is bounded by two parallel lines and two sine curves
of the same phase and amplitude (Fig.). In principle, every cylindrical surface is developable. In
practice, the development even of the right circular cylinder assumes the rectification of the
circumference of a circle. An approximation to within 0.002% is given by the construction of A. KOCHANSKY
(1685), which is easily carried out by ruler and compass; it gives $\pi \approx \sqrt{13\tfrac{1}{3} - 2\sqrt{3}}$ (Fig.).

8.3-3 Curved surface of a right circular cylinder
8.3-4 Net of a right circular cylinder
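The quoted accuracy of the approximation can be checked directly. The following short sketch evaluates the square-root expression and compares it with the value of π from the standard library; the variable names are illustrative.

```python
from math import pi, sqrt

# Kochansky's value: square root of (13 1/3 - 2*sqrt(3)).
approx = sqrt(40 / 3 - 2 * sqrt(3))
relative_error = abs(approx - pi) / pi

print("approximation:", approx)
print("relative error in percent:", 100 * relative_error)
```

The relative error stays just below 0.002%, in line with the bound stated in the text.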


8.3-5 Oblique sections of a right circular cylinder
8.3-6 Kochansky’s approximate construction for the circumference of a circle


The surface area S of any prism or cylinder can be obtained by adding the area C of the side-faces
or curved surface and the areas (each B) of the bases: $S = 2B + C$. The formula
can be specialized as required.


Example 1: To find the surface area of a regular 6-sided prism with base-edge $a = 3u$ and height
$h = 4u$, where u is the unit of length, one has
$B = \tfrac{3}{2}a^2\sqrt{3}$ and $C = 6ah$, and so $S = 2 \times \tfrac{3}{2}a^2\sqrt{3} + 6ah = 3a^2\sqrt{3} + 6ah = 3a(a\sqrt{3} + 2h)$.
By substituting the given values one obtains $S \approx 119u^2$ (square units, say square inches).

Example 2: For the surface area of a steel bolt of circular cross-section with diameter $d = 50u$
and height $h = 60u$, one obtains $B = \pi d^2/4$ and $C = \pi dh$, and so $S = 2\pi d^2/4 + \pi dh
= \pi d(d/2 + h)$ and, by substituting the given values, $S = 4250\pi u^2 \approx 13352u^2$.
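Both examples are instances of adding the two bases to the side-faces or curved surface. The sketch below recomputes them; the function names are hypothetical.

```python
from math import pi, sqrt


def regular_hexagonal_prism_surface(a, h):
    """Two regular hexagons of side a plus six rectangles a by h."""
    base = 1.5 * a * a * sqrt(3)
    lateral = 6 * a * h
    return 2 * base + lateral


def right_circular_cylinder_surface(d, h):
    """Two circles of diameter d plus the developed rectangle (pi*d) by h."""
    return pi * d * (d / 2 + h)


print(round(regular_hexagonal_prism_surface(3, 4)))   # Example 1, in u^2
print(round(right_circular_cylinder_surface(50, 60)))  # Example 2, in u^2
```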


Cavalieri’s principle 


To calculate the volume of a prism or cylinder one uses a principle published in 1629 by
CAVALIERI, a pupil of GALILEI.

Cavalieri’s principle: Solids with the same 
height and with cross-sections of equal area have 
the same volume; in particular, prisms or cylin- 
ders with equal bases and heights have the same 
volume. 

The theorem can be made plausible by elementary methods, by constructing a solid from prismatic
sheets of very small height and then, by displacing the sheets, giving it a different form with evidently
the same volume (Fig.). The bases of the sheets are the cross-sections, and therefore of equal area at
the same height. The smaller the height of the sheets, the more nearly the surface of the ‘staircase’ approaches
the form of a surface described by a continuous function; for example, a pile of equal circular
sheets of very thin paper can represent an oblique circular cylinder very closely (Fig.). By means
of limiting arguments the theorem can be proved by the methods of the integral calculus.

8.3-7 Illustration of Cavalieri’s principle




If the base of a prism or cylinder is denoted by B
and the height by h, then the volume of any prismatic or
cylindrical solid is given by $V = Bh$.


For example, the base of a regular triangular prism
with base-edge a and height h is given by $B = \tfrac{1}{4}a^2\sqrt{3}$
and the volume is given by $V = \tfrac{1}{4}a^2h\sqrt{3}$. For a
regular five-sided prism,
$B = (5a^2/4)\cot 36^\circ$ and $V = (5a^2h/4)\cot 36^\circ$.

8.3-8 Calculation of volume by Cavalieri’s principle
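The triangular and five-sided cases are both covered by V = Bh once the base area is known. The sketch below uses the standard area formula for a regular n-gon of side a, B = (n a²/4) cot(180°/n), which is an assumption brought in here rather than a formula stated in the text; the function names are illustrative.

```python
from math import pi, tan


def regular_polygon_area(n, a):
    """Area of a regular n-gon with side a: (n * a^2 / 4) * cot(180 degrees / n)."""
    return n * a * a / (4 * tan(pi / n))


def regular_prism_volume(n, a, h):
    """V = B * h for a right prism over a regular n-gon."""
    return regular_polygon_area(n, a) * h


print(regular_prism_volume(3, 2.0, 5.0))  # triangular prism
print(regular_prism_volume(5, 2.0, 5.0))  # five-sided prism
```

For n = 3 the area reduces to a²√3/4 and for n = 5 to (5a²/4) cot 36°, matching the two cases above.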


The base and top of a hollow cylinder (Fig.) are congruent circular rings
of area $B = \pi r_1^2 - \pi r_2^2$; the inner curved surface $C_i$ and the outer curved
surface $C_o$ are developed into rectangles of area $C_i = 2\pi r_2 h$ and
$C_o = 2\pi r_1 h$; the surface area S is therefore given by $S = 2B + C_o + C_i
= 2\pi(r_1 + r_2)(r_1 - r_2) + 2\pi r_1 h + 2\pi r_2 h = 2\pi(r_1 + r_2)(r_1 - r_2 + h)$.


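The volume of a hollow cylinder is obtained by taking the difference
between the volume $V_o$ of the outer cylinder and the volume $V_i$ of the
inner cylinder: $V = V_o - V_i = B_1 h - B_2 h = Bh$, where $B_1 = \pi r_1^2$ and $B_2 = \pi r_2^2$
are the areas of the outer and inner circles.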
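The hollow-cylinder formulae can be checked against their pieces. This is a small sketch with an illustrative function name, taking the outer radius r1, the inner radius r2 and the height h.

```python
from math import pi


def hollow_cylinder(r1, r2, h):
    """Return (surface area, volume) for outer radius r1 and inner radius r2."""
    ring = pi * (r1**2 - r2**2)                    # base and top are congruent rings
    surface = 2 * pi * (r1 + r2) * (r1 - r2 + h)   # factored form of 2B + C_o + C_i
    volume = ring * h                              # outer cylinder minus inner cylinder
    return surface, volume


surface, volume = hollow_cylinder(5.0, 3.0, 10.0)
print(surface, volume)
```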


8.4. Pyramid and cone 


General 


Pyramid. If a ray emanating from a fixed point Z of space moves round the perimeter of a plane 
n-gon (n = 3, 4, ...) whose plane does not pass through Z, then the ray describes a pyramidal 
surface. The rays through the vertices of the n-gon are the edges of the pyramidal surface. The 
n-gon and the part of the pyramidal surface between it and the point Z enclose a completely bounded 
space; this geometric solid is called a pyramid (Fig.). 

The n-gon is called the base, the point Z the vertex, and the part of the pyramidal surface belonging 
to the solid is called the surface of the pyramid. The segments of the edges of the pyramidal surface 
that lie between the vertices of the base and the vertex Z are called the side-edges of the pyramid, 
to distinguish them from the base-edges, which correspond to the sides of the base. An n-sided pyramid 
has n side-edges and n base-edges and therefore 2n edges altogether, and n triangles as side-faces.
The line-segments in the side-faces that join any point of a base-edge to the vertex are called generators 
of the pyramid. 

The height of a pyramid is understood to be the distance from the vertex to the plane of the base, 
measured along the perpendicular. This perpendicular meets the plane of the base in the foot Z’ of 
the altitude. This point, and therefore the altitude, may lie outside the base of the pyramid (Fig.).


8.3-9 Hollow cylinder 


8.4-1 Pyramid
8.4-2 Right and oblique square pyramids




The base of a regular pyramid is a regular n-gon; if the foot of the altitude coincides with the centre 
of the base, the pyramid is called right, otherwise oblique. The side-faces of a right regular pyramid 
are congruent isosceles triangles. The height of a right pyramid is its axis, and any section by a plane 
containing the axis is an axial section. The (regular) tetrahedron is a pyramid whose base- and side- 
faces are equilateral triangles. 

The famous tombs of the ancient Egyptian kings are right square pyramids; the best-known are 
the pyramids in the southern outskirts of Cairo near Giza. The largest pyramid has a base-edge 
of approximately 227 m and a height of approximately 137 m. 


Cone. A line (generator) that passes through a fixed point Z of space and moves along a curve, 
the guide-curve, describes a conical surface (Fig.). A cone is a solid bounded by a conical surface 
with closed guide-curve and a plane that does not pass through Z. The conical surface cuts the plane 
in the base of the cone. The point Z is called the vertex of the cone and its distance from the base is 
the height. The curved surface of the cone is the part of the conical surface between the vertex Z and 
the base. The parts of the generators of the conical surface that belong to the cone are called the 
generators of the cone. One speaks of a double cone if a conical surface with closed guide-curve 
is cut by two parallel planes on opposite sides of Z. The two bases are similar, and the sum of the 
heights is the distance between the parallel 
planes. According to the type of base one 
distinguishes the circular cone, elliptic cone and 
other types of cone. If the base has a centre Z’ 
and if the line ZZ’ is perpendicular to the 
base, the cone is called right, otherwise oblique. 
A right cone for which all plane sections 
through ZZ’ are congruent can be regarded 
as a solid of rotation, for example, the right 
circular cone by rotation of a right-angled 
triangle about one of the sides of the right 
angle (Fig.). 




8.4-3 Conical surface and double cone
8.4-4 Right circular cone


Examples of conical shapes in technology are the roof of a tower, parts of containers (for example 
the lower part of a cement silo), parts of machinery, conical valves and conical couplings in powered
vehicles. However, conical shapes also occur in nature: many mountains of volcanic origin have the 
shape of a cone, as well as a heap formed by slowly running sand or soil. 


Surface area 

The net of a pyramid can be obtained from a model of its surface 
by cutting it along one side-edge and all but one of the base-edges 
or along all the side-edges and opening out the side-faces into the 
plane of the base. The figure shows the net of a square pyramid. 

The model of the surface of a cone has to be cut along one gene- 
rator and the base-edge. For example, the curved surface of a right 
circular cone can be developed into a plane as for a right circular 
cylinder. It consists of a circular sector (Fig.). The net of a right circular
cone consists of the circular sector, whose radius $\varrho$ is the length
s of a generator and whose arc b is the length $2\pi r$ of the circumference
of the base, and the circular base of the cone itself. If C is the area
of the curved surface, then $C : \pi\varrho^2 = b : 2\pi\varrho$, so $C = (b\varrho^2)/(2\varrho) = (2\pi r s)/2 = \pi r s$.
If h is the height of the cone, then $s = \sqrt{r^2 + h^2}$.

8.4-5 Net of a square pyramid
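The development of the cone can be turned into a short calculation. The sketch below computes the generator, the curved surface and the central angle of the sector; the angle follows from the arc-length condition (arc 2πr on a circle of radius s) and is a step added here, not a formula quoted from the text. The function name is illustrative.

```python
from math import pi, sqrt


def cone_net(r, h):
    """Generator s, curved surface C and sector angle (degrees) of a right circular cone."""
    s = sqrt(r * r + h * h)           # generator, by Pythagoras in the axial section
    curved = pi * r * s               # C = pi * r * s
    sector_angle = 360.0 * r / s      # arc 2*pi*r on a circle of radius s
    return s, curved, sector_angle


print(cone_net(3.0, 4.0))
```

Adding the base πr² to the curved surface gives the full surface area of the cone.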




If the surface area of a pyramid or cone is denoted by S, that of the
base by B, and that of the side-faces or curved surface by C, then $S = B + C$.


According to the type of solid, these formulae can be specialized. For the surface area of a regular
tetrahedron with edge a, $C = 3B$, and since $B = \tfrac{1}{4}a^2\sqrt{3}$, $S = 4B = \tfrac{1}{4} \cdot 4a^2\sqrt{3} = a^2\sqrt{3}$.


8.4-6 Developing the curved surface of a right circular cone
8.4-7 Cusanus’ construction of corresponding circular arcs


For the problem of developing the arc s of a circle of radius r and angle $\varphi$ subtended at the centre
onto a circle of radius $\varrho$, the construction of Nicolaus Cusanus (1401-1464) gives a good graphical
approximation if the angles $\varphi$ and $\psi$ subtended by the two arcs are less than 45° (Fig.).


Volume

In order to apply Cavalieri’s principle to a pyramid, a plane section of the pyramid is taken
parallel to the base. The section is similar to the base. Its distance h' from the vertex is less than the
height h of the pyramid. All corresponding parallel segments of this section and the base are in the
ratio of the heights, $s' : s = h' : h$, and the areas A' and A are in the ratio $A' : A = h'^2 : h^2$.


The areas of the base of a pyramid and any section parallel to the base are in the ratio of the squares 
of the corresponding distances from the vertex (heights). 


By Cavalieri’s principle this implies: 
Pyramids with equal bases and equal heights have equal volumes. 


Since the base can be transformed into a triangle of the same area, or split up into triangles, it is 
sufficient to calculate the volume of a triangular pyramid. 


The volume of a triangular pyramid is one third of the volume of a prism with the same base and 
height. 


As the figure shows, the triangular prism can be divided by two plane sections into three pyramids
$V_1$, $V_2$, $V_3$ of equal volume. $V_1$ and $V_2$ have the base and top of the prism as bases, $\triangle DEF \cong \triangle ABC$,
and the height of the prism as height, $|BE| = |CF|$. However, since $\triangle ACF \cong \triangle AFD$, $V_2$ and
$V_3$ have equal bases, and for each of them the distance of the point B from the side-face ACFD is
the height. Hence, if B is the base and h the height of a pyramid, then the volume is $V = Bh/3$.
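The factor one third can also be seen numerically by stacking thin sheets, in the spirit of Cavalieri’s principle. The sketch below sums sheets whose area is B(h'/h)² at distance h' from the vertex; the midpoint rule and the number of sheets are choices made here.

```python
def pyramid_volume_by_slices(base_area, height, slices=10_000):
    """Approximate the volume of a pyramid by summing thin parallel sheets."""
    dz = height / slices
    total = 0.0
    for i in range(slices):
        h_prime = (i + 0.5) * dz                      # distance of the sheet from the vertex
        area = base_area * (h_prime / height) ** 2    # A' : A = h'^2 : h^2
        total += area * dz
    return total


print(pyramid_volume_by_slices(9.0, 4.0))  # compare with 9 * 4 / 3
```

Only the areas of the sheets enter the sum, not their shape, so the same routine applies to a cone with base area πr².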


Cavalieri’s principle is concerned only with the equality of area of parallel sections, and not
with the form of these sections. A cone can be regarded as a special pyramid with base $B = \pi r^2$
and so:


Cones with equal bases and equal heights 
have equal volumes. 

The volume of a cone is one third of the 
volume of a cylinder with equal base and height. 


Volume of a cone: $V = \pi r^2 h/3 = \pi d^2 h/12$

8.4-8 Splitting a triangular prism into three triangular pyramids

Frustum of a pyramid and cone

A frustum of a pyramid is a solid bounded by planes, whose base and top are parallel and whose
side-edges meet in a point outside the solid (Fig.). A frustum of a pyramid can always be made into
a pyramid by adding another pyramid on top.




8.4-9 Frustum of a pyramid
8.4-10 Frustum of a cone


If $B_1$ is the base and $h_1$ the height of the completed pyramid, and $B_2$ the base and $h_2$ the height
of the completing pyramid, then $h = h_1 - h_2$ is the height and $B_1$ and $B_2$ the areas of the base and
top of the frustum. This is called oblique, right, or regular according as the completing pyramid
is oblique, right, or regular. Its surface area S is composed of the bases $B_1$ and $B_2$ and the
side-surface C, which consists of n trapezia, if $B_1$ is an n-gon. The frustum therefore has 2n base-edges
and n side-edges. A right square frustum of a pyramid with base-edges a and b has isosceles trapezia
as side-faces; the height of each trapezium is $\eta = \sqrt{h^2 + (a - b)^2/4}$, and so its surface area is given by
$S = a^2 + b^2 + 4[(a + b)/2]\eta = a^2 + b^2 + 2\eta(a + b)$.

In the same way a frustum of a cone can be obtained from a cone by a section parallel to the base
(Fig.). Its curved surface C can be developed. If $r_1$ is the radius of the base-circle, $h_1$ the height and
$s_1$ the length of generator of a right circular cone and $r_2$, $h_2$ and $s_2$ the corresponding lengths of
the cone that is cut off, then for the frustum that is left the height is $h = h_1 - h_2$ and the length
of a generator is $s = s_1 - s_2$. In an axial section, $r_1 : r_2 = s_1 : s_2$ and so $(r_1 - r_2) : r_1 = s : s_1$ and
$(r_1 - r_2) : r_2 = s : s_2$. Therefore $s_1 = sr_1/(r_1 - r_2)$ and $s_2 = sr_2/(r_1 - r_2)$, hence
$C = \pi s_1 r_1 - \pi s_2 r_2 = \pi s(r_1 + r_2)$ and $S = \pi r_1^2 + \pi r_2^2 + \pi s(r_1 + r_2)$ for the surface area of a right frustum
of a cone.
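The curved surface of the frustum is the difference of two cone surfaces, which the sketch below verifies for one set of values. Computing the generator from the height by s² = h² + (r1 − r2)² is a step added here for a right circular frustum; the function names are illustrative.

```python
from math import pi, sqrt


def cone_frustum_surface(r1, r2, h):
    """Return (curved surface, total surface) of a right circular frustum."""
    s = sqrt(h * h + (r1 - r2) ** 2)   # generator, from the axial section
    curved = pi * s * (r1 + r2)
    total = pi * r1**2 + pi * r2**2 + curved
    return curved, total


def curved_by_difference(r1, r2, h):
    """Curved surface as completed cone minus the cone that is cut off."""
    s = sqrt(h * h + (r1 - r2) ** 2)
    s1 = s * r1 / (r1 - r2)
    s2 = s * r2 / (r1 - r2)
    return pi * s1 * r1 - pi * s2 * r2


print(cone_frustum_surface(6.0, 3.0, 4.0))
print(curved_by_difference(6.0, 3.0, 4.0))
```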


Many articles in daily use are like a frustum of a pyramid, for example a laundry basket, a baking
tin and a wheelbarrow; the form of a frustum of a cone is illustrated by a lampshade, a flower vase
and a drinking glass.


Volumes. If $B_1$ and $h_1$ are the base and height of a completed pyramid and $B_2$ and $h_2$ the base
and height of the completing pyramid, then the volume of the frustum of a pyramid is given by
$V = \tfrac{1}{3}(B_1 h_1 - B_2 h_2)$. Since the areas of parallel sections are proportional to the squares of
their distances from the vertex, $h_1 : h_2 = \sqrt{B_1} : \sqrt{B_2}$. Hence $h_1 = h\sqrt{B_1}/(\sqrt{B_1} - \sqrt{B_2})$ and
$h_2 = h\sqrt{B_2}/(\sqrt{B_1} - \sqrt{B_2})$. The volume V of a frustum of a pyramid is therefore given by

$V = \frac{h}{3} \cdot \frac{B_1\sqrt{B_1} - B_2\sqrt{B_2}}{\sqrt{B_1} - \sqrt{B_2}} = \frac{h}{3} \cdot \frac{B_1^2 - B_2\sqrt{B_1 B_2} + B_1\sqrt{B_1 B_2} - B_2^2}{B_1 - B_2}$

$= \frac{h}{3}\left(B_1 + \sqrt{B_1 B_2} + B_2\right)$.


A corresponding relation holds for a frustum of a cone with $B_1 = \pi r_1^2$ and $B_2 = \pi r_2^2$.
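The closed formula for the frustum can be compared with the difference of the two pyramids from which it was derived. This is a minimal sketch with illustrative function names; it assumes the base is larger than the top, so that the square roots differ.

```python
from math import pi, sqrt


def frustum_volume(b1, b2, h):
    """V = (h/3) * (B1 + sqrt(B1*B2) + B2)."""
    return h / 3 * (b1 + sqrt(b1 * b2) + b2)


def frustum_volume_by_difference(b1, b2, h):
    """Completed pyramid minus completing pyramid (requires b1 > b2)."""
    h1 = h * sqrt(b1) / (sqrt(b1) - sqrt(b2))
    h2 = h * sqrt(b2) / (sqrt(b1) - sqrt(b2))
    return (b1 * h1 - b2 * h2) / 3


print(frustum_volume(36.0, 9.0, 4.0))
print(frustum_volume_by_difference(36.0, 9.0, 4.0))
# Frustum of a cone: the bases are circles of radii 6 and 3.
print(frustum_volume(pi * 36.0, pi * 9.0, 4.0))
```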


In practice, sufficiently accurate results can often be obtained from the approximate formulae.
These results are the more accurate the more the form of the frustum of a pyramid approximates
to that of a prism ($B_1 \approx B_2$) or that of the frustum of a cone approximates to that of a cylinder
($r_1 \approx r_2$). The first two approximation formulae always give results that are too large, and the other
too small.




8.5. Polyhedra 


Euler’s polyhedron theorem 
If a solid is bounded by planar faces only, it is called a planar solid or polyhedron; cubes, cuboids, 
prisms, pyramids and frustums of pyramids are polyhedra. 
A polyhedron is called convex or an Euler polyhedron if the line-segment joining any two of its 
points contains only points inside the polyhedron. Euler’s theorem for polyhedra was probably 
known to ARCHIMEDES and certainly to DESCARTES. 


Euler’s polyhedron theorem: If v is the number of vertices, f the number of faces and e the number 
of edges of a convex polyhedron, then v + f — e = 2. 
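The theorem can be tried out on the prisms and pyramids of the preceding sections, using the edge counts given there (3n edges for an n-sided prism, 2n for an n-sided pyramid). The vertex and face counts in the sketch follow from the definitions; the function names are illustrative.

```python
def prism_counts(n):
    """(v, f, e) for an n-sided prism: two n-gons and n side-faces."""
    return 2 * n, n + 2, 3 * n


def pyramid_counts(n):
    """(v, f, e) for an n-sided pyramid: one n-gon, one vertex Z, n triangles."""
    return n + 1, n + 1, 2 * n


def euler_number(v, f, e):
    return v + f - e


for n in range(3, 8):
    print(n, euler_number(*prism_counts(n)), euler_number(*pyramid_counts(n)))
```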


To determine the number $E = v + f - e$ in Euler’s theorem, one imagines a model of the
polyhedron covered with a rubber sheet of which one face has been cut out. The number $\varphi$ of faces
remaining is $f - 1$, so that $E = v + \varphi + 1 - e$. If the remaining surface is spread out into one plane,
the edge-lengths and angles are altered, but not v, $\varphi$ and e. In the Schlegel diagram of the polyhedron
so obtained (Fig.) each of the $\varphi$ faces can be divided into triangles by means of diagonals. Each
diagonal increases e and $\varphi$ by 1, that is, E remains constant. If then one edge belonging to exactly
one triangle is removed from the boundary, then e and $\varphi$ are decreased by 1, and again E remains
constant. If one edge and one of its vertices, which no longer belong to any face, are removed, then
v and e are decreased by 1 and E remains constant. By repeatedly applying these steps, one triangle
is finally left, for which $v = 3$, $e = 3$ and $\varphi = 1$, so that $E = v + \varphi + 1 - e = 2$. Hence
$v + f - e = 2$ is generally true.

8.5-1 Plane net of a cube to prove Euler’s polyhedron theorem

Regular polyhedra

The five regular polyhedra. A convex polyhedron is called regular if it is bounded by regular 
congruent polygons and the same number of edges meet at each vertex. The five solids of this kind 
(Fig.) are also called Platonic, after PLATO. 


8.5-2 The five regular bodies and their nets: a) tetrahedron; b) hexahedron (cube); c) octahedron; d) icosa- 
hedron; e) dodecahedron 




By the theorem that the sum of the edge-angles at a vertex is less than 360°, there can only be
five regular solids. 1. If the polyhedron is bounded by equilateral triangles, then, since the
edge-angles are 60°, only three, four or five faces can meet at a vertex; if there were six faces, the
sum of the edge-angles would be 6 · 60° = 360°. 2. If the polyhedron is bounded by squares (each
edge-angle 90°) or 3. by regular pentagons (each edge-angle 108°), then only three faces can meet at a
vertex.

Regular hexagons (each edge-angle 120°) are impossible, since 3 · 120° is not less than 360°.

These arguments lead to five possible types of regular solid, and the numbers v of vertices, f of 
faces and e of edges are summarized in the following table, n denoting the number of faces at a 
vertex: 


Bounding faces          n   Regular solid         v    f    e

equilateral triangles   3   tetrahedron           4    4    6
equilateral triangles   4   octahedron            6    8   12
equilateral triangles   5   icosahedron          12   20   30
squares                 3   hexahedron or cube    8    6   12
regular pentagons       3   dodecahedron         20   12   30
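
The counting argument can be replayed mechanically. The following Python sketch is an illustration only: the function name `regular_solids` and the variables `sides` and `at_vertex` are assumptions made here, not notation from the text. It keeps every pair (sides of a face, faces at a vertex) whose edge-angles at a vertex sum to less than 360°, and then obtains v, f and e from Euler’s theorem together with the incidence relations sides · f = 2e = at_vertex · v.

```python
def regular_solids(max_sides=12):
    """Return (sides, at_vertex, v, f, e) for every admissible regular solid."""
    solids = []
    for sides in range(3, max_sides + 1):
        for at_vertex in range(3, max_sides + 1):
            # Interior angle of a regular polygon: 180*(sides-2)/sides degrees.
            # at_vertex such angles must sum to less than 360 degrees,
            # i.e. at_vertex*(sides-2) < 2*sides in integer arithmetic.
            if at_vertex * (sides - 2) >= 2 * sides:
                continue
            # sides*f = 2e = at_vertex*v and v + f - e = 2 determine e.
            e = 2 * sides * at_vertex // (2 * sides + 2 * at_vertex - sides * at_vertex)
            v = 2 * e // at_vertex
            f = 2 * e // sides
            solids.append((sides, at_vertex, v, f, e))
    return solids


for sides, at_vertex, v, f, e in regular_solids():
    print(f"{sides}-gons, {at_vertex} at a vertex: v={v}, f={f}, e={e}")
```

Only five pairs survive the angle test, which is the content of the table.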


The inscribed and circumscribed spheres are an important characteristic of this class of solids, 
since the centre of a regular polyhedron is also the common centre of these spheres. The surface 
of the circumscribed sphere passes through the vertices of the polyhedron, and the surface of the 
inscribed sphere touches each face at its centre. Hence: the perpendiculars to the faces at their 
centres meet at the centre of the polyhedron. 

If n is the number of sides of a face, m the number of edges that meet at a vertex, v the number 
of vertices of the polyhedron, f the number of faces and e the number of edges, then if a is the length 
of an edge, S the surface area and V the volume, the following results are obtained: 


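Regular solid   n   m    v    f    e   S                    V

tetrahedron     3   3    4    4    6   a²√3                 a³√2/12
hexahedron      4   3    8    6   12   6a²                  a³
octahedron      3   4    6    8   12   2a²√3                a³√2/3
dodecahedron    5   3   20   12   30   3a²√[5(5 + 2√5)]     a³(15 + 7√5)/4
icosahedron     3   5   12   20   30   5a²√3                5a³(3 + √5)/12

Dual pairs (linked by crosses between the v and f entries): hexahedron and octahedron; dodecahedron and icosahedron.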
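
For numerical work the closed forms for S and V can be collected in one place. This is a minimal sketch using the standard formulae for edge-length a; the function name `platonic_measures` is hypothetical.

```python
import math


def platonic_measures(a):
    """Map each regular solid to (surface area S, volume V) for edge-length a."""
    r3, r2, r5 = math.sqrt(3), math.sqrt(2), math.sqrt(5)
    return {
        "tetrahedron": (a**2 * r3, a**3 * r2 / 12),
        "hexahedron": (6 * a**2, a**3),
        "octahedron": (2 * a**2 * r3, a**3 * r2 / 3),
        "dodecahedron": (3 * a**2 * math.sqrt(5 * (5 + 2 * r5)),
                         a**3 * (15 + 7 * r5) / 4),
        "icosahedron": (5 * a**2 * r3, 5 * a**3 * (3 + r5) / 12),
    }


for name, (S, V) in platonic_measures(1.0).items():
    print(f"{name:13s} S = {S:.4f}  V = {V:.4f}")
```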


Duality. The crosses in the table mean that the solids connected in pairs are dual. The numbers
of vertices and faces are interchanged; the figure shows the example of the cube and octahedron.
The number of edges remains the same, by Euler’s theorem: v₁ + f₁ = f₂ + v₂ = e + 2.

The tetrahedron is self-dual. 


8.5-3 Duality between cube and octahedron


8.5-4 Truncated polyhedra. The tetrahedron becomes an octahedron (left) and the cube gives a
middle crystal (right)


Truncated polyhedra. If the vertices of a regular polyhedron are cut off in such a way that the 
plane sections are regular and congruent, then the remaining solid is again a regular polyhedron 
or a semiregular (Archimedean) solid, according as all the faces of the truncated solid are congruent, 
or regular n-gons with different numbers of vertices meet at each vertex of the solid (Fig.). The 
truncated cube in the figure is called a middle crystal, since it can be generated equally well from 
an octahedron by taking sections through the mid-points of the edges. However, a cube can be so 
truncated that a regular octagon is obtained on each face. 


Crystals 


While most solids that occur naturally are irregular, crystals arise directly as mathematical solids. 
Niels STENSEN (1638-1686) discovered the law of constancy of angle, according to which the angle
between corresponding faces of all crystals of the same material
has the same value. To describe the symmetry properties of 
crystals one needs the ideas of centre of symmetry, axis of sym- 
metry, and plane of symmetry. For the cube there are three axes 
of symmetry joining the centres of opposite faces, and nine planes 
of symmetry. The three principal planes of symmetry or briefly 
principal planes pass through the centre of the cube and are 
parallel to pairs of opposite faces. Perpendicular to each of these 
are two planes of symmetry, each of which contains two space- 
diagonals of the cube (Fig.). 


8.5-5 Principal planes of the cube 


8.6. Sphere 


General 
If a semicircle is rotated about its diameter, then its circumference describes a spherical surface 
or sphere. The part of space completely enclosed by a spherical surface is called a sphere. A spherical 
surface is the locus of all points of space that have a constant distance from a fixed point of space. 
The fixed point is the centre of the sphere. The spherical form plays an important part in everyday 
life; one can think of a ball bearing, a ball-and-socket joint, a toy ball, and the celestial sphere. 


Relative position of a line or plane and a sphere. A line has 0, 1 or 2 points in common with a 
spherical surface. 


Secants, chords. A secant cuts the sphere in two points. The chord is the segment of it that contains 
no points outside the sphere. The longest chord is a diameter of the sphere; it is bisected at the 
centre C of the sphere. Every segment from C to a point of the spherical surface is a radius. 

Tangents, tangent planes. A tangent t to the sphere has exactly one point in common with the
sphere, the point of contact B of t (Fig.). The pencil of planes through t cuts the sphere in circles,
which all touch t at B. That plane which contains the centre cuts the sphere in a great circle. The 
plane of the pencil perpendicular to this is the tangent plane to the sphere with B as point of contact. 

A plane that is not a tangent plane either misses the sphere or cuts it, in general, in a small circle 
(Fig.), but a great circle if the plane passes through C. 


8.6-1 Sphere and line 


8.6-2 The intersection of a spherical surface and a plane is a circle


Spherical cap, spherical segment. A plane that cuts the sphere divides it into two spherical segments 
and its surface into two spherical caps, which are equal if the section is a great circle (Fig.). 

Spherical zone, spherical layer. Two parallel planes meeting the sphere cut from it a spherical 
layer, and from its surface a spherical zone. One of the sections can be a great circle. Two planes 
having a diameter in common divide the sphere into four spherical wedges and the surface into four 
spherical digons. Two opposite wedges or digons are congruent. 




8.6-3 The sphere and its parts

Spherical sector. If a radius of the sphere moves along a small circle of the sphere as guide-curve, 
then it describes a conical surface and divides the sphere into two spherical sectors. If the guide- 
curve is a great circle, then the surface separating the two spherical sectors degenerates into the 
plane of the great circle, and the spherical sectors are hemispheres. 


Volume 


Volume of a sphere. By Cavalieri’s principle a hemisphere of radius r has the same volume as a 
right circular cylinder of radius r and height r from which a right circular cone with the same base- 
radius r and the same height r has been bored (Fig.). 

A plane at an arbitrary distance r₁ (r₁ < r) from the base and parallel to it cuts the hemisphere in
a circle of radius ρ₁ = √(r² − r₁²) and the remaining solid in a circular ring with radii r and r₁.
The cross-sections therefore have the areas A₁ = πρ₁² = π(r² − r₁²) and A₂ = πr² − πr₁²,
that is, the same area. This shows that the hemisphere has the volume V_H = πr³ − πr³/3 = 2πr³/3
(Fig.).
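
The comparison of cross-sections can be checked numerically. The sketch below is illustrative only (the function names are assumptions, and a simple midpoint sum stands in for Cavalieri’s principle): it evaluates both section areas at the same height and sums thin slices of the hemisphere.

```python
import math


def hemisphere_section(r, z):
    """Area of the circular section of the hemisphere at height z above the base."""
    return math.pi * (r * r - z * z)


def bored_cylinder_section(r, z):
    """Area of the ring left in the cylinder once the cone has been removed."""
    return math.pi * r * r - math.pi * z * z


def volume_by_slices(section, r, steps=10_000):
    """Midpoint sum of the section areas between z = 0 and z = r."""
    dz = r / steps
    return sum(section(r, (i + 0.5) * dz) for i in range(steps)) * dz


r = 2.0
print(hemisphere_section(r, 0.7), bored_cylinder_section(r, 0.7))
print(volume_by_slices(hemisphere_section, r), 2 * math.pi * r**3 / 3)
```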


8.6-4 Derivation of the formula for the volume of a sphere


8.6-5 The volumes of these three bodies are in the ratios 3 : 2 : 1


8.6-6 Derivation of the formula for the volume of a spherical segment 


Volume of parts of a sphere. Spherical segment. The formula for the volume of a spherical segment 
(Fig.) is derived by the same principle (comparison of the hemisphere with the residue of the 
cylinder). Here, instead of the cone it is a frustum of a cone: 

V = πr²h₁ − ⅓πh₁[r² + (r − h₁)r + (r − h₁)²] = ⅓π(3rh₁² − h₁³) = ⅓πh₁²(3r − h₁).
Since ρ² = r² − (r − h₁)² = 2rh₁ − h₁², or 6rh₁ − 2h₁² = 3ρ² + h₁², V = πh₁(3ρ² + h₁²)/6, where
r is the radius of the sphere, h₁ the height of the segment and ρ the radius of its circular base.
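
The two forms of the segment formula can be compared directly. A minimal sketch, with hypothetical function names; `rho` is the radius of the cutting circle and satisfies rho² = 2rh − h².

```python
import math


def segment_volume(r, h):
    """Volume of a spherical segment of height h on a sphere of radius r."""
    return math.pi * h**2 * (3 * r - h) / 3


def segment_volume_from_base(rho, h):
    """The same volume in terms of the base-circle radius rho and the height h."""
    return math.pi * h * (3 * rho**2 + h**2) / 6


r, h = 5.0, 2.0
rho = math.sqrt(2 * r * h - h * h)
print(segment_volume(r, h), segment_volume_from_base(rho, h))
```

For h = r the segment is a hemisphere and for h = 2r the whole sphere, which gives a quick plausibility check.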


Spherical layer. If the spherical layer between two circular sections with radii ρ₁ and ρ₂ has height h,
then by Cavalieri’s principle its volume is the difference between that of a cylinder, πr²h, and a
frustum of a cone with base-radii r₁ = r₂ + h and r₂ (Fig.).
The volume is given by

V = πr²h − πh[(r₂ + h)² + (r₂ + h)r₂ + r₂²]/3
  = πh(6r² − 6r₂² − 6r₂h − 2h²)/6.

Because

ρ₁² = r² − (r₂ + h)², ρ₂² = r² − r₂²,

ρ₁² + ρ₂² = 2r² − 2r₂h − 2r₂² − h²,

3ρ₁² + 3ρ₂² + h² = 6r² − 6r₂h − 6r₂² − 2h²,

the volume is V = πh(3ρ₁² + 3ρ₂² + h²)/6.


8.6-7 Derivation of the formula for the volume of a spherical layer
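
A spherical layer is also the difference of two spherical segments, which gives an independent check of the formula. The sketch below assumes the segment formula V = πh²(3r − h)/3; the names are illustrative.

```python
import math


def layer_volume(rho1, rho2, h):
    """Volume of a spherical layer with section radii rho1, rho2 and height h."""
    return math.pi * h * (3 * rho1**2 + 3 * rho2**2 + h**2) / 6


def segment_volume(r, h):
    """Volume of a spherical segment of height h on a sphere of radius r."""
    return math.pi * h**2 * (3 * r - h) / 3


# Sphere of radius 5; cutting planes at distances 1 and 3 from the centre.
r, near, far = 5.0, 1.0, 3.0
rho_near = math.sqrt(r * r - near * near)
rho_far = math.sqrt(r * r - far * far)
direct = layer_volume(rho_far, rho_near, far - near)
by_difference = segment_volume(r, r - near) - segment_volume(r, r - far)
print(direct, by_difference)
```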


Spherical sector. The volume of a spherical sector is the sum of the volumes of a spherical segment
and a cone: V_sector = ⅓πh²(3r − h) + ⅓πρ²(r − h), where h is the height of the segment,
r − h the height of the cone and ρ the radius of the guide-circle. Since ρ² = h(2r − h),
V_sector = 2πr²h/3.

Hollow sphere. A hollow sphere is what is left of a sphere of radius r; when a concentric sphere 
of radius r2(r; > rz) is removed. The volume of the hollow sphere is the difference between the 
volumes of the two spheres: 


V_hollow sphere = (4/3)πr₁³ − (4/3)πr₂³.

Volume of a hollow sphere: V = (4/3)π(r₁³ − r₂³)


Surface area 


Surface area of a sphere. In contrast to the curved surface of a cone or cylinder, the surface of a 
sphere cannot be developed into a plane. To derive the formula for the surface area of a sphere 
limiting arguments are necessary. 

If the surface of the sphere is thought of as divided into n small polygons, then the radii through
the vertices of these polygons divide the spherical space into n pyramidal subspaces with bases Bᵢ
and heights hᵢ. The larger n is, the smaller are the bases Bᵢ, the difference between the heights hᵢ
and the radius r of the sphere, the difference between the sum of the areas of the bases Bᵢ and the
surface area S of the sphere and the difference between the sum of the volumes of the subspaces
Σ Bᵢhᵢ/3 (i = 1, ..., n) and the volume V of the sphere. From lim(n→∞) Σ Bᵢ = S, lim(n→∞) hᵢ = r
and lim(n→∞) Σ Bᵢhᵢ/3 = V one obtains V = rS/3 or S = (3V)/r = 4πr², that is, four times the area
of a great circle of the sphere.

Surface area of a sphere: S = 4πr²


By analogy with the circle, the sphere has the greatest volume of all bodies with the same surface 
area or the smallest surface area of all bodies with the same volume (see the Isoperimetric problem 
in Chapter 38.). This property of the sphere is of great importance; for drops of liquid and stars, 
because of their spherical shape, the rate of evaporation and heat transfer is less than it would 
be for other forms. Also, because of this property and their large capacity, spherical containers 
are often preferred for gases and liquids. 


Surface area of parts of a sphere. To derive the formula for the surface area S of a spherical cap
one proceeds just as for the surface area of a sphere, that is, one obtains V_sector = 2πr²h/3 and
2πr²h/3 = ⅓rS_cap (Fig. 8.6-3). Hence S_cap = 2πrh, where h is the height of the corresponding
segment. The surface area of a spherical zone can be regarded as the difference between the
areas of two spherical caps. If h is the height of the corresponding spherical layer, and h₁, h₂ the
heights of the smaller and larger spherical caps, then S_zone = S_cap2 − S_cap1 = 2πrh₂ − 2πrh₁
= 2πr(h₂ − h₁), and since h₂ − h₁ = h, S_zone = 2πrh. Note the formal equality of the two
formulae for S_cap and S_zone, though h has different meanings in the two cases.

The surface area of a spherical sector is the sum of the surface areas of a spherical cap and the
curved surface of a cone: S_sector = 2πrh + πρr = πr(2h + ρ), where h is the height of the
corresponding segment, ρ is the base-radius of the cone and r is the radius of the sphere and the length
of a generator of the cone.
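
The three surface formulae can be gathered into a short sketch. The function names are hypothetical; the sector formula assumes the base-radius relation ρ² = h(2r − h) between the cap height h and the guide-circle radius ρ.

```python
import math


def cap_area(r, h):
    """Curved area of a spherical cap of height h on a sphere of radius r."""
    return 2 * math.pi * r * h


def zone_area(r, h):
    """Curved area of a spherical zone whose layer has height h."""
    return 2 * math.pi * r * h


def sector_area(r, h):
    """Total surface of a spherical sector: cap plus curved surface of the cone."""
    rho = math.sqrt(h * (2 * r - h))
    return math.pi * r * (2 * h + rho)


r = 3.0
print(cap_area(r, 2 * r), 4 * math.pi * r * r)    # cap of height 2r is the whole sphere
print(zone_area(r, 1.5), cap_area(r, 2.0) - cap_area(r, 0.5))
print(sector_area(r, r), 3 * math.pi * r * r)     # hemisphere: cap plus flat disc
```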




8.7. Further solids 


Solids of rotation. Since the invention of the potter’s wheel, solids of rotation have had many 
applications. Every plane through the axis of rotation cuts a surface of rotation in a meridian or 
profile, and every plane perpendicular to the axis cuts the surface in a parallel circle. Every surface 
of rotation can be covered by an orthogonal net of meridians and parallel circles. The surface 
normals n along a parallel circle consisting of regular points generally form a cone of rotation. 
For circles of maximum and minimum radius the cone degenerates to a plane, and for a circle along 
which a tangent plane touches the surface, the cone degenerates into a cylinder (Fig.). 


8.7-1 Solids of rotation; normal n; the normal cone along a circular section is coloured yellow;
along a circle in a tangent plane it becomes a circular cylinder, along a circle whose radius
is a relative maximum or minimum it degenerates to a plane


8.7-2 Torus with intermediate positions of the rotating circle


Well-known solids of rotation are the sphere, cone of rotation, cylinder of rotation, paraboloid 
of rotation, one-sheet hyperboloid of rotation, two-sheet hyperboloid of rotation, ellipsoid of 
rotation (see Chapter 24.), the torus, pseudosphere and catenoid. 

A torus is obtained by rotating a circle of radius r about an axis in the plane of the circle at a 
distance a > r from its centre (Fig.). The torus is a tubular surface. 

The pseudosphere (see Table 56) is obtained by rotating the tractrix about its asymptote. If the
x-axis of a Cartesian coordinate system is taken as the asymptote, and if a is the distance of the
cusp A along the y-axis, then V = 2πa³/3 is the volume of the pseudosphere. At each regular point,
the surface has constant negative Gaussian curvature. Because of this property the pseudosphere 
serves as a model for non-Euclidean hyperbolic geometry, just as a sphere for non-Euclidean elliptic 
geometry. 

The centres of curvature of the tractrix lie on a catenary, which is therefore the evolute of the trac- 
trix (see Fig. 19.5-11). Rotation of the catenary about its directrix gives the catenoid, which is the 
only real minimal surface of rotation. 


Pappus’ rules. To calculate the volume and surface area of solids of rotation PAPPUS of Alexandria
(end of the 3rd century A.D.) gave rules, which are derived nowadays by means of integral calculus.


Pappus’ rule for surface area. If a plane curve C is rotated about a line / in its plane such that C 
lies on one side of /, then the area S of the resulting surface of rotation is equal to the product of the 
length of the generating curve C and the length of the path of the centre of gravity of C under the 
rotation. 

Pappus’ rule for volume. If part of a plane A is rotated about a line in the plane that has at most
boundary points in common with A, then the volume V of the resulting solid of rotation is equal to the 
product of the area of A and the length of the path of the centre of gravity of A under the rotation. 


Example 1: For a torus (Fig. 8.7-2) the surface area S is given by S = 2πa · 2πr = 4π²ar
and the volume by V = 2πa · πr² = 2π²ar².

Example 2: By rotating a semicircle about its diameter one obtains the well-known values for
the surface area and volume of a sphere, and so the distances ρ_C and ρ_A of the centres of gravity
of the semicircular arc and the semicircular disc from the axis can be calculated. From S = 4πr²
= s · 2πρ_C with s = πr one obtains 4πr² = 2π²rρ_C, and so ρ_C = 2r/π. From V = 4πr³/3 = 2πρ_A · A
with A = πr²/2 one obtains 4πr³/3 = π²r²ρ_A, and so ρ_A = 4r/(3π).
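
Both rules reduce to a single multiplication, so the two examples are easy to recompute. A minimal sketch with hypothetical function names:

```python
import math


def pappus_area(curve_length, centroid_distance):
    """Surface of rotation: curve length times the path of its centre of gravity."""
    return curve_length * 2 * math.pi * centroid_distance


def pappus_volume(region_area, centroid_distance):
    """Solid of rotation: region area times the path of its centre of gravity."""
    return region_area * 2 * math.pi * centroid_distance


# Torus: circle of radius r whose centre is at distance a from the axis.
a, r = 3.0, 1.0
print(pappus_area(2 * math.pi * r, a), 4 * math.pi**2 * a * r)
print(pappus_volume(math.pi * r**2, a), 2 * math.pi**2 * a * r**2)

# Sphere from a semicircle, using the centroid distances 2r/pi and 4r/(3*pi).
print(pappus_area(math.pi * r, 2 * r / math.pi), 4 * math.pi * r**2)
print(pappus_volume(math.pi * r**2 / 2, 4 * r / (3 * math.pi)), 4 * math.pi * r**3 / 3)
```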




Kepler’s rule, Simpson’s rule. Certain approximate formulae for the volume of a solid are very 
useful in practice. In many special cases the formulae give the exact values. 

In a large work on the solid geometry of a barrel, KEPLER (1571-1630) gave an approximate
formula to determine the volume V of a barrel, V = h(B₀ + 4B₁ + B₂)/6, where B₀, B₂ are the areas
of the top and bottom surfaces, B₁ is the area of the section half way between them and h is the
height of the barrel.

This formula gives the exact value for the frustum of a pyramid including a pyramid, sphere,
elliptic paraboloid, hyperboloid of one sheet, ellipsoid and all layers of these bodies obtained
by taking sections perpendicular to the axis.


Example 1: The planes z₀ = c, z₂ = −c, z₁ = 0 cut the one-sheet hyperboloid
(x²/a²) + (y²/b²) − (z²/c²) = 1 in sections of area B₀ = B₂ = 2πab and B₁ = πab. The solid
bounded by B₀, B₂ and the hyperboloid (Fig.) has height 2c and volume V = (8/3)πabc.

Example 2: On the paraboloid of rotation z = x² + y² the planes z₀ = 1 and z₂ = 9 cut a layer
for which B₀ = π, B₂ = 9π, B₁ = 5π and h = 8, so the volume is V = 40π.

Example 3: For a tetrahedron of edge-length a (Fig.), B₀ = B₂ = 0, B₁ = a²/4 and
h² = h_a² − a²/4, where h_a = a√3/2 is the altitude of a face, and so h = a/√2. The volume V is
given by Kepler’s rule: V = a³√2/12.
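
The three examples can be recomputed with Kepler’s rule in its usual form V = h(B₀ + 4B₁ + B₂)/6. In the sketch below the function name `kepler_volume` and the sample values a = 2, b = 3, c = 5 for the hyperboloid are assumptions chosen for illustration.

```python
import math


def kepler_volume(b0, b1, b2, h):
    """Kepler's rule: end sections b0 and b2, mid-section b1, height h."""
    return h * (b0 + 4 * b1 + b2) / 6


# Example 1: one-sheet hyperboloid between z = -c and z = c.
a, b, c = 2.0, 3.0, 5.0
hyperboloid = kepler_volume(2 * math.pi * a * b, math.pi * a * b, 2 * math.pi * a * b, 2 * c)
print(hyperboloid, 8 * math.pi * a * b * c / 3)

# Example 2: layer of the paraboloid z = x^2 + y^2 between z = 1 and z = 9.
print(kepler_volume(math.pi, 5 * math.pi, 9 * math.pi, 8), 40 * math.pi)

# Example 3: regular tetrahedron of edge-length 1 resting on an edge.
print(kepler_volume(0.0, 0.25, 0.0, 1 / math.sqrt(2)), math.sqrt(2) / 12)
```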


8.7-3 Hyperboloid 


8.7-4 Tetrahedron 


Since Kepler’s rule gives exact results for pyramids and tetrahedra, it can be applied without
error to the prismoids. These provide good approximations for barrels, cask-shaped solids and
tree-trunks that are not too long. It breaks down for solids of rotation whose meridian curve has
discontinuities in the direction of the tangent and for solids whose height is large in comparison
with the mean diameter. In critical cases greater accuracy can be obtained at the cost of more
measurement by dividing the height h into n = 2k equal parts and applying Kepler’s rule to the k
arising pairs of layers. If Bᵢ is the area of the ith section, the volume is obtained from a rule named
after SIMPSON (1710-1761): V ≈ h[B₀ + 4(B₁ + B₃ + ... + Bₙ₋₁) + 2(B₂ + B₄ + ... + Bₙ₋₂) + Bₙ]/(3n).
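
Applying Kepler’s rule to each of the k pairs of layers and adding the results gives the composite rule. The following sketch assumes equally spaced sections B₀, ..., Bₙ with n even; the function name `simpson_volume` is hypothetical.

```python
import math


def simpson_volume(sections, h):
    """Composite rule for section areas B_0..B_n (n even) over total height h."""
    n = len(sections) - 1
    if n < 2 or n % 2:
        raise ValueError("an even number n >= 2 of layers is required")
    odd = sum(sections[1:n:2])     # B_1, B_3, ..., B_(n-1)
    even = sum(sections[2:n:2])    # B_2, B_4, ..., B_(n-2)
    return h * (sections[0] + 4 * odd + 2 * even + sections[n]) / (3 * n)


# Unit sphere cut into four layers of equal height.
heights = [-1.0, -0.5, 0.0, 0.5, 1.0]
areas = [math.pi * (1 - z * z) for z in heights]
print(simpson_volume(areas, 2.0), 4 * math.pi / 3)
```

With n = 2 the function reduces to Kepler’s rule itself.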


Conoids. In technical applications conoids are of practical importance. In general, they are
generated in the following way. Given a guide-curve c, a guide-line l and a direction plane not
parallel to l, the conoid is formed from the set of lines that meet c and l and are parallel to the
plane. If the guide-curve is a circle whose plane does not contain the guide-line, one speaks of a
circular conoid. For a right circular conoid the guide-line is perpendicular to the direction plane
and is cut at right angles by the axis of the circle outside the plane of the circle. A plane parallel
to the plane of the circle cuts the conoid in an ellipse (Fig.). Kepler’s rule, applied to a right
circular conoid, gives the exact volume. If r is the radius of the base-circle and h the height,
then B₀ = πr², 4B₁ = 2πr², B₂ = 0 and V = πr²h/2.


8.7-5 Conoid 


Prismoids. A prismoid is a polyhedron with two parallel polygons as top and bottom faces and 
triangles or trapezia as side-faces. A prism, pyramid and frustum of a pyramid are special forms 
of a prismoid. 




A further special form is the wedge, where the top surface has reduced to a line parallel to the
base, called the knife edge. A plane parallel to the base of a right wedge cuts off another wedge,
leaving a pontoon (Fig.). The side-trapezia of this pontoon are congruent in pairs. There is no
similarity between the base and top face. If the height of a pontoon is very much greater than
the sides of the base and top, the solid is called an obelisk.

The application of Kepler’s rule to prismoids gives exact values. For a pontoon with a and b as
sides of the base, c and d as sides of the top and h as height, the volume is
V = h[2ab + 2cd + ad + bc]/6. For d = 0 the solid is a
wedge with edge-length c = c₁ and volume V = hb(2a + c₁)/6. In technology the wedge acts as a
splitting tool and machine element, for example as a stop. Pontoons are well known as floating
elements of transportable bridges and floating docks. Obelisks occur as stone monuments and
religious symbols. Milestones often take these forms. Finally, various forms of roofs can be
comprised under the general title of prismoid.


8.7-6 Prismoid, pontoon, wedge
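
The pontoon formula follows from Kepler’s rule, which can be confirmed numerically. The sketch assumes a rectangular base a × b and a rectangular top c × d with a parallel to c and b parallel to d, so that the mid-section is a rectangle with the averaged sides; the function names are illustrative.

```python
def pontoon_volume(a, b, c, d, h):
    """Closed formula V = h(2ab + 2cd + ad + bc)/6 for a pontoon of height h."""
    return h * (2 * a * b + 2 * c * d + a * d + b * c) / 6


def pontoon_volume_kepler(a, b, c, d, h):
    """The same volume from Kepler's rule with base, mid-section and top areas."""
    base, top = a * b, c * d
    mid = (a + c) / 2 * (b + d) / 2
    return h * (base + 4 * mid + top) / 6


print(pontoon_volume(6, 4, 3, 2, 5), pontoon_volume_kepler(6, 4, 3, 2, 5))
print(pontoon_volume(6, 4, 3, 0, 5))   # wedge: d = 0, knife edge of length c = 3
print(pontoon_volume(6, 4, 0, 0, 5))   # pyramid: the top shrinks to a point
```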


9. Descriptive geometry 


9.1. Mappings in descriptive geometry ........................... 203
     Central projection ......................................... 203
     Parallel projection ........................................ 204
9.2. The two-plane method ....................................... 205
     Representation of lines and planes ......................... 206
     Perspective affinity ....................................... 208
     Side elevations, rotations, and representations of solids .. 210
9.3. Further mappings ........................................... 212
     Projection with heights – the one-plane method ............. 212
     Axonometry ................................................. 214
     Central perspective ........................................ 216
     Stereoscopic image pairs ................................... 220


Descriptive geometry investigates and applies mappings of three-dimensional space onto a plane 
drawing-board. In order to carry over the constructive methods of plane geometry one gives prefer- 
ence to mappings that make lines in space correspond to lines on the drawing-board. In the choice 
of mappings two requirements have priority: perspicuity and preservation of measurements. 

Clear images are given, for example, by central projection, because the image on the drawing-board 
imitates what is seen by the eye. The most usual way of preserving the measurements is to use normal 
projection. Since one dimension is lost in mapping spatial objects onto a plane drawing-board, 
preservation of proportions is possible only under certain restrictions. Norm-preserving axonometric 
images give a clear representation of a spatial object by reconstructing the proportions. Among 
these the frontal axonometric images are generally preferred because of their simplicity. 

If technical drawings and constructions are to be an additional means of communication, side-by- 
side with speech and writing, they must be produced in accordance with certain conventions introduced 
in descriptive geometry. The working out of these conventions suitable for practical needs goes 
back essentially to MONGE (1746-1818), who because of his famous work ‘Géométrie Descriptive’ 
and his teaching and research in the subject became the founder of descriptive geometry. 


9.1. Mappings in descriptive geometry 
Central projection 


In central projection a mapping is carried out by means of a bundle of rays whose carrier, the
vertex of projection V, lies outside the plane of projection Π. For an arbitrary point P ≠ V the central
image or perspective image Pᶜ is the point of intersection Pᶜ = r_P ∩ Π of the ray r_P = VP with the
image plane Π. Under this projection all points of a plane Π_v that passes through V and is
parallel to Π are mapped onto improper points (points at infinity) of Π. This plane Π_v is
called the vanishing plane (Fig.). The central image lᶜ of a line l that does not pass through V
and does not lie in Π_v is a line, since the rays that project its points, for example r₁ = VA
and r₂ = VB, form a plane, which cuts Π in a line (Fig.). The trace point L = (l ∩ Π) lies
on lᶜ. The central image lᶜ of l is uniquely determined by the central images Aᶜ and Bᶜ of two
of its points A and B. The point of intersection L_v of l with Π_v is called the vanishing point
of l; its image is the improper point of lᶜ. The image of the improper point of l is the point of
intersection L_uᶜ of Π with the ray r_u parallel to l through V. This vanishing point L_uᶜ is the
image of the common point at infinity of all lines parallel to l. The vanishing point of all lines
perpendicular to Π is the foot of the perpendicular from V to Π and is called the principal
point or principal vanishing point H. The length d of the segment VH is called the distance. The
vanishing points of all lines that cut Π at an angle of 45° lie on a circle with centre at H and
radius d, the distance circle.


9.1-1 Image plane and vanishing plane for central projection


9.1-2 Mapping of a line by central projection
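
In coordinates the construction is short. The following sketch rests on assumptions made here for illustration: the image plane is z = 0, the vertex V lies at (0, 0, d), so that the principal point H is the origin, and the function names are hypothetical.

```python
import math


def central_image(point, d):
    """Central image of a space point on the plane z = 0, seen from V = (0, 0, d)."""
    x, y, z = point
    if z == d:
        return None                # point of the vanishing plane: improper image
    t = d / (d - z)                # parameter where the ray V + t(P - V) meets z = 0
    return (t * x, t * y)


def vanishing_point(direction, d):
    """Image of the point at infinity shared by all lines with this direction."""
    u, v, w = direction
    if w == 0:
        return None                # lines parallel to the image plane
    return (-d * u / w, -d * v / w)


d = 5.0
print(central_image((1.0, 2.0, 0.0), d))       # points of the image plane stay fixed
print(vanishing_point((0.0, 0.0, 1.0), d))     # perpendicular lines: principal point H
print(math.hypot(*vanishing_point((1.0, 0.0, 1.0), d)))   # 45 degree line: distance d
```

Images of points far out on a line approach the vanishing point of its direction, which is how the vanishing point appears in a drawing.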


Parallel projection 


If the vertex V of the bundle of rays giving the mapping is at infinity, the projection is a parallel
projection of the points P of space onto the points P′ of Π. Since the rays that project the points
of a line p form a plane, the image p′ of a line is a line, in general, and the images of two parallel
lines p and q are parallel. Only when the given line l_p is parallel to the projecting rays is its
image a point L_p = (l_p ∩ Π). For lines that do not lie along a projecting ray the following
theorems hold:


The ratio of division of three points on a line is invariant under this mapping, for example
|AB| : |BC| = |A′B′| : |B′C′| (Fig.).

The ratio of two segments that lie along parallel lines is invariant under parallel projection, for
example, |AB| : |DE| = |A′B′| : |D′E′|.
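
The invariance of the ratio of division can be observed numerically. The sketch assumes the image plane z = 0 and a projecting direction that is not parallel to it; the function name and the sample points are illustrative.

```python
import math


def parallel_image(point, direction):
    """Image on the plane z = 0 of a point projected along the given direction."""
    x, y, z = point
    dx, dy, dz = direction
    if dz == 0:
        raise ValueError("projecting rays must not be parallel to the image plane")
    t = z / dz
    return (x - t * dx, y - t * dy)


# Three collinear points with |AB| : |BC| = 1 : 2.
A, B, C = (0.0, 0.0, 1.0), (1.0, 2.0, 3.0), (3.0, 6.0, 7.0)
rays = (1.0, -1.0, 2.0)
A1, B1, C1 = (parallel_image(P, rays) for P in (A, B, C))
print(math.dist(A, B) / math.dist(B, C), math.dist(A1, B1) / math.dist(B1, C1))
```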


9.1-3 Invariance of ratio 
of division under parallel 
projection 


9.1-4 The image of a 
plane figure lying parallel 
to the image plane is 
congruent to the original 
under parallel projection 


The image of a figure in a plane parallel to Π is congruent to the original figure, for example,
△PQR ≅ △P′Q′R′ (Fig.).


Oblique projection. If the solid to be mapped has three mutually perpendicular edges or axes of
symmetry a, b, c, then an oblique projection of the solid (skew parallel projection) can be obtained
constructively. If the image plane Π is taken to be vertical, then the solid can be brought into the
‘ordinary position’, where two of the three axes are parallel to Π, with one, say b, horizontal and
the other, say c, vertical (Fig.). The third axis a and all lines parallel to it are then perpendicular
to Π and are called depth lines. Their images are parallel and make with the image of b the
distortion angle φ = ∠(ā, b), where ā is the image of a. The ratio λ = ā : a of the image segment
to the original segment on the depth line is called the distortion ratio. From the distortion
angle φ and the distortion ratio λ the measurements of the spatial figure can be reconstructed
from the drawing. For ease of construction φ is chosen to be 30°, 45°, 60° or 120°, which
is convenient to draw with set squares, and the distortion ratio λ is chosen to be a simple rational
number such as 1, 1:2, 2:3, 1:3 or 3:4.


9.1-5 Oblique image of a section of a cube
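
An oblique image is fixed by the distortion angle and the distortion ratio alone. The sketch below assumes that a point is given by its coordinates along the depth axis a, the horizontal axis b and the vertical axis c, and that the images of depth lines rise to the right; the function name and this orientation are choices made here, not taken from the text.

```python
import math


def oblique_image(point, angle_deg=45.0, ratio=0.5):
    """Map (depth, horizontal, vertical) coordinates to drawing coordinates."""
    depth, horizontal, vertical = point
    angle = math.radians(angle_deg)
    # Segments parallel to the image plane keep their length and direction;
    # depth segments are scaled by the distortion ratio and drawn at the
    # distortion angle against the image of the horizontal axis.
    return (horizontal + ratio * depth * math.cos(angle),
            vertical + ratio * depth * math.sin(angle))


# Corners of a unit cube in the ordinary position.
for corner in [(0, 0, 0), (0, 1, 0), (0, 1, 1), (0, 0, 1),
               (1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0, 1)]:
    print(corner, oblique_image(corner))
```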


Normal projection or orthogonal projection. In this mapping the parallel projecting rays are
perpendicular to the image plane Π. The much reduced clarity is compensated in two ways.

1) Chosen points or lines of the object are labelled by means of their heights above the horizontal 
reference plane. This leads to a one-plane projection with heights. It is applied particularly in the 
projection of earthworks and representation of landscape. 

2) The normal image is compared with a second normal image in the same drawing. This is done 
in such a way that the directions of projection and the image planes giving the two normal images 
are perpendicular. The method of two-plane projection sketched here, or the corresponding normal 
images, are applied in machine construction and architecture. 


9.2. The two-plane method 


In the two-plane method the spatial object is mapped onto two perpendicular planes Π₁ and Π₂
by normal projection (Fig.). These planes divide space into four numbered quadrants (see Fig.
9.2-3). A spatial object is mapped onto Π₁ by the first and onto Π₂ by the second projecting parallel
line bundle. In this way there arise two normal projections of one object in two perpendicular planes,
namely the plan in Π₁ and the elevation in Π₂. The convention is that the plan is horizontal and the
elevation vertical. The object to be represented is preferably drawn in the first quadrant. For the
convenience of the draughtsman in working with these two images of one object, the elevation is
placed in the drawing-board and the plan is rotated about the horizontal ground line x₁₂ into the
plane of the drawing-board. After this rotation the elevation lies above the ground line and the
plan below it. The plan and elevation of a point P lie on a line perpendicular to the ground line.
This is called an order line. One says that the plan P′ and elevation P″ of a point P are in the Monge
position. The distance of P″ from the ground line is called the first distance d₁ and that of P′ from
the ground line the second distance d₂ of P. Corresponding to the plan P′ and the elevation P″
of P are the first projection ray r₁ and the second projection ray r₂, both passing through P.

If the object lies entirely in the first quadrant, then the elevation is always above and the plan
always below the ground line. In the course of spatial constructions one may come across points
from the other three quadrants. For example, the points A, B, C, D lie in the quadrants I, II, III, IV
(Fig.). The distances dᵢ satisfy the following inequalities: for A: d₁, d₂ > 0, for B: d₁ > 0, d₂ < 0,
for C: d₁, d₂ < 0, for D: d₁ < 0, d₂ > 0. For points whose plan (elevation) lies on the ground
line, d₂ = 0 (d₁ = 0). All points of the plane of symmetry σ (coincidence plane κ) satisfy the equation
d₁ = d₂ (d₁ = −d₂) (Fig.). In the two-plane projection two image planes cover one another in the
drawing-board. By a suitable combination of constructions of plane geometry, applied to the two
images of an object, problems of spatial construction can be solved. The ground line x₁₂ is therefore
not to be regarded as a line of separation of the plan and elevation, but it illustrates the position of
the line of intersection of the two image planes before the rotation into the drawing-board.


9.2-1 Oblique image as representation of a spatial object in coordinated normal projections


9.2-2 Plan and elevation of four points A, B, C, D. A lies in the first quadrant, B in the second,
C in the third, D in the fourth.


9.2-3 Planes of coincidence and symmetry
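
The sign rules for the two distances translate directly into a small classifier. This is a sketch under the stated conventions (first distance d1 counted positive above the plan plane, second distance d2 positive in front of the elevation plane); the function names are hypothetical.

```python
def quadrant(d1, d2):
    """Quadrant number 1..4 of a point, or None if it lies in an image plane."""
    if d1 == 0 or d2 == 0:
        return None
    if d1 > 0:
        return 1 if d2 > 0 else 2
    return 3 if d2 < 0 else 4


def in_symmetry_plane(d1, d2):
    """Points of the plane of symmetry satisfy d1 = d2."""
    return d1 == d2


def in_coincidence_plane(d1, d2):
    """Points of the coincidence plane satisfy d1 = -d2; plan and elevation coincide."""
    return d1 == -d2


for name, (d1, d2) in {"A": (2, 1), "B": (2, -1), "C": (-2, -1), "D": (-2, 1)}.items():
    print(name, quadrant(d1, d2))
```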


Representation of lines and planes 


Representation of a line. The projections l′ and l″ of a line l are uniquely determined by the
projections of two of its points. For a first principal line h₁, parallel to Π₁, the elevation h₁″ is parallel
to the ground line x₁₂; for a second principal line h₂, parallel to Π₂, the plan h₂′ is parallel to x₁₂.
The position of an arbitrary line l that does not cut the ground line and is not a principal line is
determined by its traces, its points of intersection with Π₁ and Π₂. Here L₁ = (l ∩ Π₁) is called
the first trace and L₂ = (l ∩ Π₂) the second trace (Fig.); L₁″ and L₂′ lie on x₁₂. The point of
intersection K′ = K″ of the projections of the line characterizes the point of intersection K of the line l
with the coincidence plane. For a first projecting line l perpendicular to Π₁ one has l′ = L₁, and for
a second projecting line l perpendicular to Π₂ one has l″ = L₂. Two lines p and q intersect in a point S
of space if and only if the points of intersection 1′ = (p′ ∩ q′) and 2″ = (p″ ∩ q″) are in the Monge
position. Then 1′ = S′ and 2″ = S″. Otherwise the lines p and q are skew (Fig.).
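
The criterion for intersecting lines can be tested in coordinates. The sketch assumes that x runs along the ground line, that the plan of a point (x, y, z) is (x, y) and its elevation is (x, z), and that neither pair of projections is parallel; special positions are left out. All names are illustrative.

```python
def intersect_2d(p1, p2, q1, q2):
    """Intersection of the plane lines p1p2 and q1q2, or None if they are parallel."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = p1, p2, q1, q2
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if den == 0:
        return None
    t = ((x1 - x3) * (y3 - y4) - (y1 - y3) * (x3 - x4)) / den
    return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))


def lines_meet(p, q, tol=1e-9):
    """p and q are pairs of space points; True if the two lines intersect."""
    plan = intersect_2d(*[(x, y) for x, y, z in p], *[(x, y) for x, y, z in q])
    elevation = intersect_2d(*[(x, z) for x, y, z in p], *[(x, z) for x, y, z in q])
    if plan is None or elevation is None:
        return None                # special position, not covered by this sketch
    # Monge position: both intersection points lie on one order line (same x).
    return abs(plan[0] - elevation[0]) <= tol


p = ((0, 0, 0), (2, 2, 2))
print(lines_meet(p, ((0, 2, 1), (2, 0, 1))))   # the lines meet in (1, 1, 1)
print(lines_meet(p, ((0, 2, 3), (2, 0, 3))))   # skew lines
```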


9.2-5 Skew and intersecting pairs of lines 


9.2-4 Plan and elevation of a line l with its trace points L₁ and L₂; since d₁ = −d₂, K′ = K″ lies in the
coincidence plane


Representation of a plane. A plane E is fixed in position by two intersecting lines. For example,
if the lines p and q intersect in P (Fig.) and P₁, P₂ and Q₁, Q₂ are their traces on Π₁ and Π₂, then
the lines e₁ = P₁Q₁ and e₂ = P₂Q₂ lie in the plane E, and e₁ lies in Π₁ and e₂ in Π₂. These lines e₁
and e₂ are the traces of the plane in Π₁ and Π₂ and can be constructed from the traces of p and q.
The traces e₁ and e₂ intersect on the ground line x₁₂ at the
node K of the plane.


9.2-6 Representation of a plane and lattice of a point P; a) in 
the oblique image, b) in coordinated normal projections 


9.2-7 Section of plane and prism, solution by means of lattices 
with first principal lines 




For constructions within a given plane the principal lines or trace parallels are useful as an auxiliary
tool. First principal lines, first trace parallels or height lines are parallel to e₁; second principal lines,
second trace parallels or front lines are parallel to e₂. For example, it is easy to establish whether
a point P determined by P' and P'' lies in a plane E determined by e₁ and e₂ or not. For if the
elevation h₁'' of a first principal line h₁ of E is drawn through P'', then the plan h₁' of this principal
line is uniquely determined. If P' lies on the line h₁' found in this way, then P is a point of E, otherwise
P lies outside E. The model described here is called the lattice of a point P. This lattice is also
possible for an arbitrary line in E. An application of lattices enables, for example, the construction
of the intersection of a prism perpendicular to Π₁ with a plane. The plans of the points of intersection
of the edges a, b, c coincide with a', b', c'. The elevations of the points of intersection are found by
means of first principal lines, using lattices (Fig.). Fall lines intersect the height lines at right angles
and point in the direction of greatest slope of the plane with respect to a horizontal plane. Their
plans are therefore perpendicular to the plans of the height lines.
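
The test by means of a first principal line can be sketched in coordinates. The sketch below rests on assumptions that are not part of the text: the ground line is the x-axis, the plan of (x, y, z) is (x, y) and the elevation is (x, z), the node K has abscissa k, and the traces are given by direction vectors. The plane must not be perpendicular to the ground line in a way that makes e₂ horizontal (dz ≠ 0).

```python
from typing import Tuple


def lies_in_plane(P: Tuple[float, float, float],
                  k: float,
                  e1_dir: Tuple[float, float],
                  e2_dir: Tuple[float, float],
                  tol: float = 1e-9) -> bool:
    """Decide whether P lies in the plane E with traces e1, e2 through the node K.

    k       abscissa of the node K on the ground line (assumed x-axis)
    e1_dir  direction (dx, dy) of the first trace e1 in the plan
    e2_dir  direction (dx, dz) of the second trace e2 in the elevation, dz != 0
    """
    x, y, z = P
    # The elevation h1'' through P'' is parallel to the ground line and
    # meets the second trace e2 at the height z of P.
    x0 = k + e2_dir[0] * z / e2_dir[1]
    # The plan h1' starts on the ground line at (x0, 0) and is parallel to e1.
    # P lies in E exactly when P' lies on h1'.
    cross = (x - x0) * e1_dir[1] - y * e1_dir[0]
    return abs(cross) <= tol
```

As a check, the plane x + y + z = 1 has the node K = (1, 0, 0) and both traces in direction (−1, 1); the point (0.2, 0.3, 0.5) is accepted and the point (0.2, 0.3, 0.6) is rejected.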

Special positions of the plane E can be characterized by the position of the traces e₁ and e₂. For
a first projecting plane perpendicular to Π₁ one has e₂ ⊥ x₁₂, for a second projecting plane
perpendicular to Π₂ one has e₁ ⊥ x₁₂, and for a first and second projecting plane both traces are
perpendicular to x₁₂. On such a plane one obtains a side elevation to supplement the plan and elevation
of a solid. For a desk plane e₁ and e₂ are parallel to the ground line. If they coincide with it, then the
plane contains the ground line and can only be represented by principal lines.

A plane is called direct if the plan and elevation of a triangle ABC in the plane have the same 
sense of rotation. If the senses of rotation of the two images of the triangle are opposite, the plane 
is called alternating. 


Determination of true measurement from the corresponding normal projections. The distance
between two points A and B is equal to the length of the segment AB measured along the line l = AB.
If l is a principal line with respect to the image plane, then normal projection of the segment gives its
true value, for example, for h₁ in Π₁ and for h₂ in Π₂. If A and B lie on a line l in general position, then
l can be rotated about a first or second projecting line as axis so as to lie in one of the above positions
relative to the image plane. For example, l can be rotated about the first projecting line a through
A into the position of a second principal line (Fig.). If C is the foot of the perpendicular from B
to a, then the right-angled triangle ABC can be rotated about the side AC, where |AC| = |A''C''|. The
other side is |CB| = |A'B'|. The point B moves in space on an arc of a circle parallel to Π₁,
with centre at C and radius |A'B'|. In Π₂ the point B'' moves on the line through B'' parallel to the
ground line x₁₂ to the point B*'', which is determined by |B*''C''| = |A'B'| (Fig.). Apart from
the true distance |AB|, this construction gives the first angle of inclination α₁ of the line l = AB to
the ground plane Π₁; α₁ = ∠AL₁L₂' = ∠ABC = ∠A''B*''C''. Similarly the second angle of
inclination α₂ that l makes with Π₂ is obtained by rotation about a second projecting line
perpendicular to Π₂.
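
In coordinates the parallel rotation amounts to the theorem of Pythagoras in the right-angled triangle ABC. The following sketch assumes points given as (x, y, z) with z the height above the ground plane; the function name is illustrative.

```python
import math
from typing import Tuple

Point3 = Tuple[float, float, float]


def true_length_and_inclination(A: Point3, B: Point3) -> Tuple[float, float]:
    """Return (|AB|, alpha_1) with alpha_1 in radians.

    |CB| = |A'B'| is the length of the plan, |AC| = |A''C''| the height
    difference; the hypotenuse of the triangle ABC is the true length.
    """
    plan_length = math.hypot(B[0] - A[0], B[1] - A[1])   # |A'B'|
    height_diff = abs(B[2] - A[2])                       # |AC|
    length = math.hypot(plan_length, height_diff)        # |AB|
    alpha_1 = math.atan2(height_diff, plan_length)       # angle at B
    return length, alpha_1
```

For A = (0, 0, 0) and B = (3, 4, 12) the plan has length 5, the true length is 13 and tan α₁ = 12/5.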


9.2-8 Oblique image for the determination of the distance of two points in space

9.2-9 Determination of distance by means of parallel rotation




The true shape of a plane figure can be 
determined by rotation parallel to a plane 
of projection. It is natural to use a first or 
second principal line of the plane of the 
figure as axis of rotation. 


Example: The true shape of a given triangle ABC can be determined from the plan and elevation
by rotating it about the first principal line h₁ through A into a position parallel to the ground plane.
Under the rotation A remains fixed, while B and C describe arcs of circles whose planes have the
axis of rotation h₁ as common normal (Fig.). The radius r_C of the arc described by C is the
hypotenuse of a right-angled triangle with |C'M_C| and the height difference of C relative to h₁ as
sides, where M_C is the foot of the perpendicular from C' to h₁' and the height difference is taken
from the elevation. If this is drawn from M_C along h₁' to its end-point S_C, then |S_C C'| = r_C. By
laying off this segment r_C from M_C along the order line through C' perpendicular to h₁' the point
C₁ is obtained, which is the result of rotating C about h₁. The point B is treated similarly.

9.2-10 True form of a plane figure by application of the double compass method


Since in this method, in addition to the order line perpendicular to the principal line through
the point to be rotated, two distances have to be laid off with compasses, it is called the double
compass method. As a check on the drawing, one should verify that the lines B'C' and B₁C₁ meet
in a point on the axis of rotation h₁'. The true shape of a plane figure also yields the true angle of
intersection of two intersecting lines in space.


Perspective affinity 

If an arbitrary plane figure in space is rotated about one of its traces into the image plane
corresponding to that trace, then the resulting figure and the normal projection of the original figure onto
the same image plane are related by an orthogonal perspective affinity; for example, the triangle
ABC (Fig.) gives the normal projection A'B'C' and the triangle A₁B₁C₁ under rotation about e₁
into Π₁. For this mapping the trace e₁ is the axis of affinity. The rays of affinity from the original
to the image point, A'A₁ for example, are parallel and the direction of affinity is perpendicular to
the axis of affinity e₁. The correspondence between the original and image is one-to-one and linear,
since lines l' go into lines l₁. These lines l' and l₁ meet at a point L of the axis of affinity. Each point
of the axis is mapped onto itself. Parallel lines go into parallel lines, and the ratio of division of three
points on one line is equal to that of the image points, for example, |A'D'| : |D'B'| = |A₁D₁| : |D₁B₁|.
The ratio in which the line joining an arbitrary pair of corresponding points is divided by the axis of
affinity is called the characteristic of the perspective affinity.




9.2-11 Perspective affinity of the plan of a plane figure and the result of rotating it

9.2-12 Shear or equivalent perspective affinity, with a as axis of affinity




Under a skew or general perspective affinity the direction of affinity can make any angle with 
the axis of affinity. 
Under a shear or equivalent-perspective affinity the rays of affinity are parallel to the axis of 
affinity (Fig.). 
A perspective affine mapping in the plane is uniquely determined by the axis of affinity and a pair 
of points that correspond under the mapping. 


On the drawing-board there is a perspective affine relationship between the plan and elevation
of a plane figure. The order lines of all points, D'D'' for example, give the direction of affinity,
and the images l' and l'' of any line l of the plane figure intersect at a point of a line s in the
drawing-board. This is the axis of affinity of the perspective affine mapping. The plan and elevation
of the line of intersection k = (κ ∩ E) of the coincidence plane κ and the plane E of the figure coincide
with this, that is, k' = k'' = s (Fig.).


Between the plan and elevation of a plane figure there is a perspective affine point relation. The
rays of affinity coincide with the order lines and the axis of affinity s coincides with the identical
images of the coincidence line k in the drawing-board.

9.2-13 Perspective affinity of the plan and elevation of a plane figure


This geometric property can be used constructively, for example, to obtain the lattice of a point D 
lying in the plane. 

The ellipse as perspective affine image of the circle. The perspective affine mapping of a circle 
with centre at C is determined by the axis of affinity s and the image C’ of the centre C. The direction 
of affinity is CC’. Since any line cuts its image on the axis of affinity, the image of any point P can 
be constructed as the point of intersection of two lines, for example, the image P’ of P is obtained 
from the two conditions CC' ∥ PP' and (CP ∩ s) = (C'P' ∩ s).

The images of perpendicular diameters of the circle are called conjugate diameters of the ellipse
(see Chapters 7. and 25.), for example, P'R' and Q'S', the images of PR ⊥ QS. Since parallels go
into parallels and the ratio of division remains unchanged, there are important relations between
parallel chords and tangents.

A diameter of an ellipse bisects all chords parallel to its conjugate diameter, and the tangents at
the end-points of a diameter of an ellipse are parallel to the conjugate diameter; for example, P'R'
bisects the chords parallel to Q'S', and the tangents at P' and R' are parallel to Q'S'.


Among the pairs of conjugate diameters there is exactly one that forms an orthogonal pair of
diameters. These are the major and minor axes of the ellipse. If K₁ and K₂ (Fig.) are their points
of intersection with the axis of affinity s, then both C and C' lie on the circle with K₁K₂ as diameter.
Its centre N is the point of intersection of the perpendicular bisector of the segment CC' with s;
its radius is |NC| = |NC'| = |NK₁| = |NK₂|.




The axis of affinity s and the point pair C, C' are sufficient to construct the affine transformation
of the circle κ into the ellipse κ'. The axis s cuts CC' in C₀ and PP' in P₀. For the construction of
a further arbitrary pair of points X, X' the properties of the affine relation are used (Fig.):
CC' ∥ PP' ∥ XX' and |PP₀| : |P'P₀| = |CC₀| : |C'C₀| = |XX₀| : |X'X₀| = k, where k is
the ratio of affinity.

To construct points of the ellipse with ruler and compass an orthogonal affine transformation is
applied, with the major axis of the ellipse as axis of affinity and the minor axis as direction
of affinity. The vertices A and B on the major axis remain fixed under this transformation, while the
vertices D' and E' on the minor axis are the images of D and E (Fig.). The ratio of affinity is given
by k = |CD'| : |CD|. The ellipse is therefore the affine image of the circle κ₁ with centre at C and
radius |CA|. From the affine relation between κ₁ and the ellipse the following construction for the
ellipse can easily be derived: One draws a second circle κ₂ with centre at C and radius |CD'|. One
chooses an arbitrary point P₁ on κ₁. The line CP₁ cuts κ₂ at P₂. One draws the perpendicular from P₁
to s and the line parallel to s through P₂. Then the perpendicular and parallel intersect in a point P
of the ellipse.

Proof. By construction |CD'| : |CD| = |CP₂| : |CP₁| = |LP| : |LP₁| = k. Because the direction of
affinity is perpendicular to s, P is the image of P₁. This construction for the ellipse is known as
the two circle construction. There are other constructions for the ellipse (see Chapter 7.) and a 
parametric representation can be derived for it (see Chapter 13. — Equations of the ellipse). 
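
The two circle construction translates directly into coordinates. The sketch below assumes that C is the origin, that the axis of affinity s (the major axis) is the x-axis, and that a = |CA| and b = |CD'| are the semi-axes; the parameter t is the angle of the ray CP₁ and is introduced here only for illustration.

```python
import math
from typing import Tuple


def two_circle_point(a: float, b: float, t: float) -> Tuple[float, float]:
    """Point P of the ellipse obtained from the ray through C with angle t.

    P1 lies on the circle of radius a, P2 on the circle of radius b, both
    on the same ray through C. P is the intersection of the perpendicular
    from P1 to s with the parallel to s through P2.
    """
    p1 = (a * math.cos(t), a * math.sin(t))   # on the first circle
    p2 = (b * math.cos(t), b * math.sin(t))   # on the second circle
    return (p1[0], p2[1])                     # abscissa of P1, ordinate of P2


def on_ellipse(p: Tuple[float, float], a: float, b: float, tol: float = 1e-12) -> bool:
    """Check the equation x^2/a^2 + y^2/b^2 = 1 for the constructed point."""
    return abs((p[0] / a) ** 2 + (p[1] / b) ** 2 - 1.0) <= tol
```

The ordinate of P is b/a times the ordinate of P₁, which is the ratio of affinity k used in the proof.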


9.2-16 Two circle construction of 
the ellipse 


Side elevations, rotations, and representations of solids 


The preceding fundamental constructions, in terms of coordinated normal projections, are often 
applied to the representation of spatial objects, as the following examples show. 


Example 1: From the plan and elevation of an octahedron in special position with respect to
the image planes a representation in the general position can be obtained by means of two side
elevations. For example, the point 1''' has the same distance from x₂₃ as 1' from x₁₂ and 1ᴵⱽ has
the same distance from x₃₄ as 1'' from x₂₃ (Fig.). By suppressing the projecting lines and surfaces
one obtains a natural representation of the solid. However, it is no longer possible to obtain
the measurements of the solid directly from this projection. Apart from its usefulness in giving
a sketch of the appearance of an object, the side elevation is applied as a principle of construction.

9.2-17 Octahedron in plan, elevation and side elevation

9.2-18 Intersection of line and sphere: application of rotation of a solid as a principle of construction

The transfer of points to a new side elevation is carried out according to the rule: the distances 
of points of the suppressed projection from the suppressed axis are transferred from the new 
ground line along the corresponding order lines to the new projection. 

Example 2: The intersection of a line with a sphere can be constructed by means of rotation.
The first projecting diameter a of the sphere is chosen as axis of rotation (Fig.). The line l is
rotated about a so that its final position l̄ is parallel to Π₂. Two suitably chosen points 1 and 2
of l go into 1̄ and 2̄, and the sphere goes into itself. The first projecting plane Γ through l̄ cuts the
sphere in a circle c, whose elevation coincides with its true form. Hence the points of intersection
Ā'' and B̄'' of c'' with l̄'' are the elevations of the points of intersection of l̄ with the sphere. The
lines through Ā'' and B̄'' parallel to the ground line cut l'' in A'' and B''. This has the effect of
reversing the rotation in the elevation. The plans of the points of intersection A and B of l with
the sphere are found by means of the order lines through A'' and B''.

Example 3: The tangent plane to a sphere at a given point P of the sphere can be constructed
by means of the first and second principal lines. Since the tangent plane T is perpendicular to
the radius r_P through P, the two principal lines h₁ and h₂ of T that intersect at P make right
angles with r_P (Fig.). Hence ∠(h₁', r_P') = ∠(h₂'', r_P'') = 90°. Since h₁'' and h₂' are parallel to the
ground line, the two principal lines, and with them the tangent plane, can be constructed.

Example 4: To find the intersections of a line l with a cone whose vertex Z and trace curve s
in Π are known. The principle of solution can be seen from an oblique image (Fig.). The plane
Γ through Z and l has the trace line c in Π. If c cuts the trace curve s in points P₁, P₂, P₃, P₄,
then the lines ZPᵢ are generators of the cone lying in Γ. Their points of intersection 1, 2, 3, 4
with l are therefore the points of intersection of l with the cone.

Example 5: To construct a cone of rotation whose base circle lies in a plane E determined
by its traces e₁ and e₂, where the height h of the cone, its base radius r and the plan C' of the
centre C of the base circle are given (Fig.). The construction can be divided into steps. 1. By means
of the first and second principal lines the elevation C'' of C has to be constructed from C'. 2. The
diameter of the base circle is mapped in the plan on h₁' and in the elevation on h₂'' by its true length.
|A'B'| and |P''Q''| are the major axes of the two image ellipses. 3. By means of order lines, one finds
the points A'', B'' on h₁'' and P', Q' on h₂'. 4. By the paper strip construction one finds the minor
axes of the image ellipses and hence these ellipses themselves as images of the base circle of the
cone. 5. On the line l through C perpendicular to E, with projections l' ⊥ h₁' and l'' ⊥ h₂'', one
takes an arbitrary point N ≠ C and rotates the segment CN, keeping C fixed, so that it is parallel
to Π₂. 6. If CN₁ is the position of the perpendicular after rotation, a segment of length h is marked
off along C''N₁'' starting from C'' to the point Z₁''. The horizontal line through Z₁'' cuts C''N'' in Z''.
The order line through this elevation of the vertex of the cone gives the plan Z'. 7. Tangents from
Z' and Z'' to the corresponding images of the base circle can be constructed using a perspective
affinity, for example. This determines the required projections of the cone of rotation.

9.2-19 Tangent plane to a sphere

9.2-20 Intersection of line and cone in the oblique image

9.2-21 Construction of a cone of rotation on a given base plane


The six principal projections. The side elevation whose plane Π₃ is perpendicular to Π₁ and
Π₂ is called the cross elevation (Fig.). The lines of intersection x₁₂, x₂₃ and x₁₃ of the three image
planes are mutually perpendicular in space. The figure shows that not every spatial object can be
uniquely reconstructed from its plan and elevation.

Quite generally, an object can be projected at right angles onto 6 planes that form the surface
of a cube (Fig.). One obtains the six principal projections of a spatial structure in the European
representation. In the American representation the appearance of the object is treated in
the opposite sense.


9.2-22 Oblique image from the plan, elevation and cross elevation of a spatial object

9.2-23 Six principal projections in the European and in the American arrangement (panel labels:
bottom view, right side, front view, left side, top view)


9.3. Further mappings 


Projection with heights — the one-plane method 


In the projection with heights a point P of space is mapped onto its image point P' by a projecting
ray normal to the image plane Π, and its distance k = |P'P| is given in terms of a fixed unit of length e
as the height. The plane Π is usually taken to be horizontal and the positive half-space with k > 0
above it. The image l' of a line l is fixed by the images P' and Q' of two of its points P and Q. By
marking off their heights (taking account of sign) on two parallel lines through P' and Q' one obtains
the trace point L of l (Fig.). Conversely, a line in space is uniquely fixed by the images of any two
graduation points. If its interval i is understood to be the distance between the projections of two
graduation points whose heights differ by one unit, then its angle of inclination α = ∠(l, l') to the
image plane is determined by the equation i = e cot α (Fig.).
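
The relation i = e cot α and the trace point can be illustrated with a small computation. The sketch assumes that a graduation point is stored as (x', y', k), where (x', y') are drawing-board coordinates of the image and k is the height in units of e; these conventions and the function names are assumptions made for the example.

```python
import math
from typing import Tuple

GradPoint = Tuple[float, float, float]   # (x', y', height k in units of e)


def interval(P: GradPoint, Q: GradPoint) -> float:
    """Distance in the image between points whose heights differ by one unit."""
    plan_dist = math.hypot(Q[0] - P[0], Q[1] - P[1])
    return plan_dist / abs(Q[2] - P[2])


def inclination(i: float, e: float = 1.0) -> float:
    """Angle alpha (radians) with i = e * cot(alpha)."""
    return math.atan2(e, i)


def trace_point(P: GradPoint, Q: GradPoint) -> Tuple[float, float]:
    """Image of the point of the line PQ with height 0 (the trace point L)."""
    t = (0.0 - P[2]) / (Q[2] - P[2])
    return (P[0] + t * (Q[0] - P[0]), P[1] + t * (Q[1] - P[1]))
```

For P(1) at (0, 0) and Q(6) at (3, 4) with e = 1 the interval is 1, the inclination is 45°, and the trace point lies at (−0.6, −0.8).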

By means of projection with heights one can determine whether two non-parallel lines a and b,
each given by two graduation points A, B and P, Q, intersect or not (Fig.). Points with the same
height lie on a plane parallel to Π, for example, P(2) and B(2) on the plane k = 2. The plane through
P(2), B(2) and A(3), which contains the line a, cuts the plane k = 3 in a line p parallel to P(2)B(2).
The line b cuts a only if it lies in this plane, that is, if Q(3) lies on p.

9.3-1 Representation of point and line in the projection with heights

9.3-2 Interval and angle of inclination of a line; oblique image, projection with heights


9.3-3 Projection with heights of two skew lines a and b; a) oblique image, b) projection with heights


According as the lines joining any two points of the non-parallel lines a and b with the same
height intersect or are parallel, so the pair of lines is skew or intersecting. For a pair of parallel
lines a and b the lines joining pairs of points with the same height and their one-plane projections
are parallel.
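
This criterion lends itself to a direct numerical test. The sketch below assumes that each line is given by two graduation points (x', y', k), that neither line is horizontal, and that the two lines are not parallel; the heights k1 and k2 used for the joining lines are free parameters of the illustration.

```python
from typing import Tuple

GradPoint = Tuple[float, float, float]   # (x', y', height k)


def point_at_height(P: GradPoint, Q: GradPoint, k: float) -> Tuple[float, float]:
    """Image of the point of the line PQ that has height k (PQ not horizontal)."""
    t = (k - P[2]) / (Q[2] - P[2])
    return (P[0] + t * (Q[0] - P[0]), P[1] + t * (Q[1] - P[1]))


def classify(A: GradPoint, B: GradPoint, P: GradPoint, Q: GradPoint,
             k1: float = 0.0, k2: float = 2.0, tol: float = 1e-9) -> str:
    """Classify the non-parallel lines a = AB and b = PQ.

    The joins of the points of equal height k1 and k2 are compared:
    parallel joins mean intersecting lines, otherwise the lines are skew.
    A join of length zero means the lines meet at that height.
    """
    a1, a2 = point_at_height(A, B, k1), point_at_height(A, B, k2)
    b1, b2 = point_at_height(P, Q, k1), point_at_height(P, Q, k2)
    u = (b1[0] - a1[0], b1[1] - a1[1])   # join at height k1
    v = (b2[0] - a2[0], b2[1] - a2[1])   # join at height k2
    cross = u[0] * v[1] - u[1] * v[0]
    return "intersecting" if abs(cross) <= tol else "skew"
```

With a through (0, 0) of height 0 and (2, 2) of height 2, the line b through (0, 2) of height 2 and (2, 0) of height 0 meets a at height 1; raising the first point of b to height 3 gives a skew pair.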


A plane inclined to Π can be represented by equidistant parallel lines marked with heights, the
height-lines, or by a fall-line f cutting the height-lines at right angles. The position of such a plane
can be uniquely described by means of a graduated fall-line. The height-line with height 0 is the
trace of the plane (Fig.).

The intersection of two planes given by graduated fall-lines can be obtained by finding the inter- 
sections of height-lines with the same height (Fig.). This principle is applied in problems concerned 
with slopes and roofs. 
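
The principle can be sketched as follows. A plane is described here, by assumption, through its graduated fall-line: the image F0 of the fall-line point of height 0, the unit direction d of increasing height in the image, and the interval i. Two points of the line of intersection are obtained from the height-lines of two heights.

```python
from typing import List, Tuple

Vec2 = Tuple[float, float]
Plane = Tuple[Vec2, Vec2, float]   # (F0, unit direction d, interval i)


def height_line(plane: Plane, k: float) -> Tuple[Vec2, Vec2]:
    """Point and direction of the height-line of height k in the image."""
    f0, d, i = plane
    point = (f0[0] + k * i * d[0], f0[1] + k * i * d[1])
    direction = (-d[1], d[0])          # perpendicular to the fall-line
    return point, direction


def meet(p: Vec2, r: Vec2, q: Vec2, s: Vec2) -> Vec2:
    """Intersection of the lines p + t r and q + u s."""
    den = r[0] * s[1] - r[1] * s[0]
    if abs(den) < 1e-12:
        raise ValueError("height-lines are parallel")
    t = ((q[0] - p[0]) * s[1] - (q[1] - p[1]) * s[0]) / den
    return (p[0] + t * r[0], p[1] + t * r[1])


def intersection_line(e1: Plane, e2: Plane,
                      k1: float = 0.0, k2: float = 1.0) -> List[Tuple[float, float, float]]:
    """Two graduated points (x', y', k) of the line of intersection."""
    points = []
    for k in (k1, k2):
        p, r = height_line(e1, k)
        q, s = height_line(e2, k)
        x, y = meet(p, r, q, s)
        points.append((x, y, k))
    return points
```

For the planes z = x and z = y (fall-lines through the origin in the directions (1, 0) and (0, 1), interval 1) the sketch returns the graduated points (0, 0) of height 0 and (1, 1) of height 1.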


9.3-5 Intersection of two planes in the one-plane method 


9.3-6 Intersection of a 
line and plane in the 
one-plane method 




If a line l and a plane E are given by their gradient scales, then a family of parallel lines through
the graduation points of l forms the height-lines of a plane E₁ that contains l. The line of intersection s
of E and E₁ cuts l in its point of intersection D with E (Fig.).


Contour plan. If a general surface is cut by a family of planes parallel to Π with integral heights,
a family of contour lines is obtained. The normal projection of these curves on Π gives the contour
plan of the surface. For example, the contour plan of a cone of rotation with axis perpendicular
to Π and vertex Z is a family of concentric circles with Z' as common centre. The radii of the circles
can be taken from a generator of the cone marked with heights. For a one-sheet hyperboloid of
rotation the contour plan is also a family of concentric circles, whose radii are uniquely determined
by the normal projection of a generator g of the surface, marked with heights (Fig.).


9.3-7 Contour plan of a one-sheet hyperboloid of rotation

9.3-8 Intersection of line and cone of rotation by the one-plane method


The intersection of a cone of rotation with a line l (Fig.) can be constructed by means of a plane Γ
containing l and the vertex Z of the cone. Its trace e in Π contains the trace point L of l and is parallel
to the line that joins Z to the point of l that has the same height as Z. This trace cuts the contour
line 0 of the cone in its point of intersection with the generator on which l meets the surface of the
cone.

Surfaces that can be comprehended only by measuring a larger number of points in a contour
plan are called topographical. The representation of such surfaces by means of a contour plan is of
practical importance for the construction of slopes on highways. In addition, this method of
representation has diverse applications in the manufacture of propellers for ships and aircraft, and for
aircraft wings and body work of vehicles.


Axonometry 


In order to obtain as many measurements of a solid as possible from a visual image of the solid
obtained by parallel projection, the solid is referred to an orthonormal trihedron O(X, Y, Z) and
this, together with the solid, is projected onto the drawing-board, where its image is a trihedron
Oˢ(Xˢ, Yˢ, Zˢ). According to the direction of incidence of the projecting rays one distinguishes
between general, or skew, axonometry and orthogonal, or normal, axonometry. Instead of the single
unit segment |OX| = |OY| = |OZ| = e in the original figure, there are three, |OˢXˢ| = e_x, |OˢYˢ| = e_y
and |OˢZˢ| = e_z in the image, which can have different lengths and are obtained from the image
Oˢ(Xˢ, Yˢ, Zˢ) (Fig.).

Pohlke’s theorem contains a condition under which a plane tri- 
hedron can be regarded as the parallel projection of a spatial 
trihedron. 

Pohlke's theorem. A plane trihedron Oˢ(Xˢ, Yˢ, Zˢ) can be regarded as the parallel projection
of an orthonormal spatial trihedron O(X, Y, Z) if the four points Oˢ, Xˢ, Yˢ, Zˢ are not all
collinear.

An orthonormal spatial trihedron is also called a vertex of a cube and its parallel projection
a Pohlke trihedron.

9.3-9 Pohlke's trihedron

Special methods. A special case of an axonometric mapping is the oblique projection, for which
e_y = e_z = 1, yˢ ⊥ zˢ, while the scale e_x and the direction of the x-axis can be prescribed arbitrarily.




This is a question of a dimetric skew axonometry, which is also known as frontal axonometry. The
cavalier perspective is a special case of the oblique projection method; in this case, e_x = e_y = e_z = 1,
yˢ ⊥ zˢ and ∠(xˢ, yˢ) = 135° (Fig.). The military perspective or bird's-eye view is characterized by
e_x = e_y = e_z = 1, xˢ ⊥ yˢ and zˢ vertical (Fig.). Like the cavalier perspective, it represents an
isometric axonometry and is used to give a more lifelike impression of a building than the plan does.
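
An axonometric mapping is fixed by the three scales and the directions of the image axes, so it can be written as a linear map. The sketch below is an illustration under assumptions: the directions of the image axes are given as angles in the drawing-board measured from the horizontal, and the particular angles chosen for the two perspectives (225°, 0°, 90° and 300°, 30°, 90°) are only one possible choice satisfying the stated conditions.

```python
import math
from typing import Sequence, Tuple


def axonometric_image(point: Sequence[float],
                      scales: Sequence[float],
                      angles_deg: Sequence[float]) -> Tuple[float, float]:
    """Image (u, v) of P = (x, y, z) for given scales and image axis angles."""
    u = sum(c * e * math.cos(math.radians(a))
            for c, e, a in zip(point, scales, angles_deg))
    v = sum(c * e * math.sin(math.radians(a))
            for c, e, a in zip(point, scales, angles_deg))
    return (u, v)


# Assumed axis directions: y horizontal, z vertical, x at 135 degrees to y.
CAVALIER = ((1.0, 1.0, 1.0), (225.0, 0.0, 90.0))
# Assumed axis directions: x and y perpendicular, z vertical.
MILITARY = ((1.0, 1.0, 1.0), (300.0, 30.0, 90.0))


def angle_between(p: Tuple[float, float], q: Tuple[float, float]) -> float:
    """Angle in degrees between two image vectors."""
    dot = p[0] * q[0] + p[1] * q[1]
    norm = math.hypot(*p) * math.hypot(*q)
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))
```

A point such as the cube vertex (1, 1, 1) is then mapped by `axonometric_image((1, 1, 1), *CAVALIER)`; all three unit segments keep the length 1 in both perspectives.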

In practice one frequently uses isometric, dimetric, or trimetric representation, observing the 
rule of standardization demonstrated by the example of a section of a cube (Fig.). 


9.3-10 Model of a house in cavalier 
perspective 


9.3-11 Model of a house in military 
perspective (bird’s eye view) 


9.3-12 Images of a section of a cube: a) isometric, b) dimetric and c) trimetric 


If a representation of a spatial object is given by means of 
coordinated normal projections, then an axonometric image 
can be obtained by the indenting method due to L. ECKHART.
The two images are separated and placed arbitrarily on the 
drawing-board. For each projection an indenting direction is 
prescribed arbitrarily. The image points of the axonometric 
image lie at the points of intersection of the corresponding 
indenting rays (Fig.). The method requires some practice in the 
arrangement of projections and in the choice of indenting 
directions, to obtain an axonometric image that is not too 
much distorted. 


9.3-13 Axonometric image by the indentation method of L. ECKHART 




Normal axonometry. To obtain clear images it is assumed that none of the axes of the
orthonormal spatial trihedron O(X, Y, Z) is parallel to the image plane Π. Then its traces A,
B, C are proper points. The trace triangle ABC with sides a, b, c is, from spatial considerations,
acute angled and it uniquely determines the Pohlke trihedron Oⁿ(Xⁿ, Yⁿ, Zⁿ) by normal
axonometry. The normal projection Oⁿ is the point of intersection of the altitudes h_a, h_b, h_c
of the trace triangle, and the points Xⁿ, Yⁿ, Zⁿ are found by reflecting the right-angled
triangles BCO and CAO about the sides a and b into the drawing-board (Fig.).

In the Pohlke trihedron the unit segment e appears shortened; the reduction factors are

λ = e_x : e = |OⁿA| : |OA|,
μ = e_y : e = |OⁿB| : |OB|,
ν = e_z : e = |OⁿC| : |OC|.

These satisfy a relation which is obtained in the course of deriving Gauss's theorem.

9.3-14 Normal axonometric Pohlke trihedron and its application to a section of a cube


Gauss's theorem. If the normal projections O' = 0, X' = p, Y' = q, Z' = r of the zero point O
and the unit points X, Y, Z of a spatial Cartesian coordinate system on the image plane Π are
regarded as points of a complex number plane, then p² + q² + r² = 0.

9.3-15 Proof of the relation p² + q² + r² = 0


To prove this, one represents the orthonormal trihedron O(X, Y, Z) by means of coordinated
normal projections (plan and side elevation), where the side elevation is taken parallel to the OZ-axis
(Fig.). It follows that |O'Z'| = |r| = e cos ϑ. Furthermore, O'X' and O'Y' are conjugate
semi-diameters of an ellipse, which is derived from a circle with centre O' and radius e by reducing each
chord of the circle parallel to the ground line x₁₃ in the ratio sin ϑ : 1. If a Gaussian ξ, η-number
plane is set up in the horizontal plane of projection with O' as origin and the imaginary η-axis
coinciding with O'Z', then the points X', Y', Z' correspond to complex numbers p, q, r. For e = 1
these are given by

p = cos(π + ψ) + i sin ϑ sin(π + ψ) = −cos ψ − i sin ϑ sin ψ,
q = cos(3π/2 + ψ) + i sin ϑ sin(3π/2 + ψ) = sin ψ − i sin ϑ cos ψ,
r = i cos ϑ.

By squaring and adding one obtains p² + q² + r² = 0. If one takes the moduli of the complex
numbers and puts λ = |p|, μ = |q|, ν = |r|, then it follows that λ² + μ² + ν² = 2.
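
Both relations can be confirmed numerically for an arbitrary position of the trihedron. The sketch below assumes e = 1 and takes the image plane to be the plane z = 0 of an auxiliary coordinate system; the orthonormal trihedron is generated from three Euler angles, a parametrization chosen here only for convenience.

```python
import math
from typing import List


def matmul(A: List[List[float]], B: List[List[float]]) -> List[List[float]]:
    """Product of two 3x3 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_z(t: float) -> List[List[float]]:
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(t: float) -> List[List[float]]:
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def gauss_numbers(a: float, b: float, c: float) -> List[complex]:
    """Complex numbers p, q, r of the projected unit points X, Y, Z.

    The columns of the rotation matrix are the unit vectors OX, OY, OZ;
    the normal projection onto the plane z = 0 keeps the first two
    coordinates, which are read as real and imaginary part.
    """
    R = matmul(rot_z(a), matmul(rot_x(b), rot_z(c)))
    return [complex(R[0][j], R[1][j]) for j in range(3)]


p, q, r = gauss_numbers(0.3, 1.1, -0.7)
sum_of_squares = p * p + q * q + r * r                      # close to 0
sum_of_moduli = abs(p) ** 2 + abs(q) ** 2 + abs(r) ** 2      # close to 2
```

The result does not depend on the chosen angles, because the first two rows of a rotation matrix are orthonormal.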


Central perspective 


Central perspective is a central projection and is used to construct clear images of spatial objects, 
usually given in terms of their plan and elevation. The converse problem of reconstructing the plan 
and elevation from central perspective images, usually photographs, is the domain of photogram- 
metry. This problem is soluble only if for an oriented camera its position relative to the object is 
known, or if, as in a cartographic survey by means of aerial photographs, the photographs contain 
certain guide points whose position is known. 




To determine a central perspective mapping it is sufficient to know the centre of perspective or
point of sight O, a horizontal position plane Γ not passing through O and a vertical image plane Π
not passing through O. The plane Ω parallel to Γ through O cuts the image plane Π in the horizon h.
The image plane and the position plane intersect in the position line l. The foot of the perpendicular
from O to Π is the principal point H. It lies on the horizon h. The segment d = |OH| = |O'H'| is
the eye distance.


Intersection and vanishing point methods. If the plan and elevation of a model of a house are
given, then the central perspective image of the model relative to a vertical image plane Π and
point of sight O can be constructed point by point by the rules of the two-plane method. This is
shown for the point P (Fig.). P'O' cuts the position line l in Pᶜ', and P''O'' cuts the order line through
Pᶜ' in Pᶜ''. In this way the image Pᶜ of P can be found. It can be transferred to another
drawing-board, which corresponds to a reflection of Π about l and contains the horizon, the principal point
and the vanishing points. By the rules of central projection all parallel lines have the same vanishing
point. For lines parallel to the position plane Γ these vanishing points lie on the horizon h. For the
family of parallel lines determined by AB or DC it is F₁. Its plan is the intersection of l with the line
through O' parallel to DC; similarly, one finds the plan of the vanishing point F₂ for all parallel lines
characterized by DA or CB. The vanishing point of all lines perpendicular to the image plane is
the principal point H. Furthermore, the points of intersection of the base edges with the position
line are helpful for the construction. The points 1, 2, 3, 4 on l and F₁, F₂ on h, obtained in the present
example, can be brought into the central projection by orientation on the principal point H.
In this way the construction of the perspective plan is reduced to the joining of points and the
intersection of lines. Now the heights of the ridge line and gutter line of the model of a house are
taken from the elevation and laid off on the perpendiculars to l through 1, 2, 3 and 4 in Π. By
using the vanishing points F₁ and F₂ the perspective image of the model can be completed.
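
The point-by-point construction and the role of the vanishing points can be imitated in coordinates. The following sketch assumes a coordinate system in which the position plane is z = 0, the image plane is y = 0 and the point of sight is O = (o_x, o_y, o_z) with o_y ≠ 0; image points are returned as (x, z) in the image plane. These conventions and the function names are assumptions of the example.

```python
from typing import Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def central_image(O: Point3, P: Point3) -> Point2:
    """Image of P: intersection of the ray OP with the image plane y = 0."""
    if O[1] == P[1]:
        raise ValueError("ray is parallel to the image plane")
    t = O[1] / (O[1] - P[1])
    return (O[0] + t * (P[0] - O[0]), O[2] + t * (P[2] - O[2]))


def vanishing_point(O: Point3, v: Point3) -> Point2:
    """Vanishing point of all lines with direction v (v not parallel to the image plane)."""
    if v[1] == 0:
        raise ValueError("direction is parallel to the image plane")
    t = -O[1] / v[1]
    return (O[0] + t * v[0], O[2] + t * v[2])


def principal_point(O: Point3) -> Point2:
    """Foot of the perpendicular from O to the image plane."""
    return (O[0], O[2])
```

For O = (0, −4, 1.5) the horizontal direction (1, 2, 0) has the vanishing point (2, 1.5) on the horizon z = 1.5, and the direction perpendicular to the image plane has the principal point H = (0, 1.5) as vanishing point.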


9.3-16 Central perspective image of a model of a house, constructed by the intersection method

9.3-17 Central perspective image of a model of a house in the architect's arrangement

The intersection method was described by BRUNELLESCHI (1377-1446). It has the disadvantage
of being very space-consuming and requiring transfer of measurements, and is avoided in the
architect's arrangement. The plan in Π₁ is so arranged (Fig.) that perpendiculars to Π₁ give the
perpendicular edges of the house in the central perspective image. The heights of all the front edges
of the house with respect to the point of sight are carried over from the elevation, and the apparently
reduced heights in the image are constructed not in the elevation, but by means of vanishing lines.




Measurement problems of perspective. In the treatment of measurement problems of central
perspective several tools are needed, such as the determination of the true length of a segment AB
whose central projection AᶜBᶜ is given. For segments perpendicular to the position plane Γ
the solution is simple. For example, if the true length of a segment m perpendicular to Γ, given by
mᶜ = |AᶜBᶜ|, is to be found (Fig.), a point F_s is taken arbitrarily on h and joined to Aᶜ and Bᶜ.
The line sᶜ = AᶜF_s is the image of a line s lying in Γ and cuts l in S, a point of the image plane Π.
Let tᶜ denote the line BᶜF_s; it intersects the perpendicular to l at S in T. Now T also lies in Π,
hence |ST| is the true length of the perpendicular m given by a central projection.

9.3-18 True length of a perpendicular to the position plane Γ


9.3-19 The measuring point method 


For lines that lie in the position plane the laying off of given segments in the central perspective
image is important. The image s^c of a line s in Γ cuts l in S and h in its vanishing point F_s (Fig.).
On s there is a point A with central projection A^c. The line s and the vanishing ray OF_s are parallel.
To derive the construction, s is rotated in Γ about S onto l, and OF_s is rotated about F_s in the
horizon plane onto h. Under the rotations, A goes into A_s and O into M_s. The lines AA_s and OM_s are
parallel. They therefore span a plane which contains the lines OA and M_sA_s. Hence these two lines
intersect in space. Since their common point must lie on the one hand in Π and on the other hand on
OA, it can only be the image point A^c of A. Since the segment |SA_s| gives the true length of the line
segment |SA| given by a central projection, the true length of any image segment laid off on s^c can be
reconstructed with the help of the point M_s by means of a construction in Π. The measuring point M_s
belonging to s can also be found in Π from F_s and the point O^0 obtained by rotating the eye O.
With centre at F_s and radius |F_sO^0| an arc of a circle is drawn, cutting h in M_s. The line joining
M_s and A^c cuts l in A_s. |SA_s| gives the true length of the segment |SA| given by a central
projection. If the image B^c of a further point B is prescribed on s^c, then with the help of M_s the
true length of |SB| can be constructed in a similar way. Furthermore, the segment |A_sB_s| is equal to
the true length of the segment |AB| given in the image.
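The measuring point construction can be checked numerically. The following is only a sketch under an assumed coordinate setup that is not part of the text: the image plane is taken as y = 0, the position plane as z = 0, and the eye as O = (0, -d, eye_height). The function name, the default values and the choice of rotating towards positive x are illustrative.

```python
import math

def true_length_via_measuring_point(a, theta, d=5.0, eye_height=1.6, s_x=0.5):
    """Recover the true length |SA| from the image point A^c using M_s.

    Assumed setup: image plane y = 0, position plane z = 0, eye O = (0, -d, eye_height).
    The line s lies in the position plane, starts at S = (s_x, 0, 0) on the position
    line l and makes the angle theta (0 < theta < pi) with l. Points of the image
    plane are written as (x, z) pairs.
    """
    c, s = math.cos(theta), math.sin(theta)
    # Point A on s at true distance a from S.
    ax, ay = s_x + a * c, a * s
    # Central projection A^c: the visual ray OA meets the image plane y = 0.
    t = d / (d + ay)
    image_a = (t * ax, eye_height * (1.0 - t))
    # Vanishing point F_s on the horizon h, and measuring point M_s with |F_s M_s| = |O F_s|.
    f_x = d * c / s
    m_x = f_x - d / s
    # The line M_s A^c meets the position line l (z = 0) in A_s.
    u = eye_height / (eye_height - image_a[1])
    a_s_x = m_x + u * (image_a[0] - m_x)
    return abs(a_s_x - s_x)

print(true_length_via_measuring_point(2.0, math.radians(35)))
```

Up to rounding, the returned value equals the length a that was laid off on s, which is the statement |SA_s| = |SA| of the text.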

The method explained here for determining the true length of a horizontal segment from the central
perspective image is known as the measuring point method, and the point M_s, which belongs to all lines
parallel to s, is known as the measuring point of s.

Furthermore, any given segment a can be repeatedly marked off on l and the points of division
joined to M_s. The joining lines, through their intersections with s^c, generate a projective scale on
these lines.

By these methods the central projection of a cube standing on Γ can be constructed, given the
visible projection |A^cB^c| of one edge a (Fig.). In the construction it is assumed that the central
projection is given by the position line l, the horizon h, the principal point H and the eye distance d.


Perspective collineation. Suppose that a point A in Γ and its central projection A^c are given
(Fig.). The position plane Γ and the horizon plane are rotated in the same sense about l and h,
respectively, into Π. Then A goes into A^0 and O into O^0. The chords AA^0 and OO^0 are parallel and
span a plane, which cuts Π in O^0A^0. Since this plane contains the visual ray OA, A^c also lies on
O^0A^0. Furthermore, the parallels OH and AA_l, where A_l is the foot of the perpendicular from A to l,
span a plane, which also contains the visual ray OA and cuts the image plane Π in the line HA_l. Hence
the point of intersection of O^0A^0 and HA_l is the image point A^c of A. The construction, confined
to its essentials, shows that there is a perspective collinear relation between A^c and A^0 with O^0
as centre of collineation, l as axis of collineation, and h as counteraxis.

9.3-20 Construction of the central projection of a cube given an edge A^cB^c


The central perspective image of a figure in Γ and its rotation about l into the image plane Π are
related by a perspective collineation (Fig.).

9.3-21 Perspective collinear point relation between the central projection of a point A in Γ and
the result of rotating it

9.3-22 Perspective collineation between a plane figure in Γ and the result of rotating it into the
image plane Π




Stereoscopic image pairs 


Spatial vision depends on the fact that each eye sees a different central perspective image of a
spatial object. The two different images are called stereoscopic image pairs or stereo-images. They
can be photographed with a stereo-camera of about 65 mm (≈ 2.56 in) objective distance and separated
by means of a stereoscope, and then seen one by each eye. Since a spatial object can be reconstructed
from its stereo-images, stereoscopy is applied in surveying, criminology and investigations of
accidents. Stereoscopic images can also be constructed, for example, by the intersection method of
central perspective. The distance of the two eye-points O and O is then taken as 65 mm and the
distance d as 200 mm.

In the anaglyph method the two stereoscopic images are returned to the drawing-board in such a way
that they emit physically different light, for example, differently polarized light or light of the
complementary colours green and red (Fig.). The images are viewed through filtered glass. Each glass
absorbs the light from the image corresponding to the other eye. The separation of the images
corresponding to each eye by coloration or polarization, in conjunction with an optical apparatus,
gives the viewer an impression of spatial depth and plasticity of the object. In viewing the figure a
red filter should be used for the left eye and a green filter for the right eye.

9.3-23 Two central perspective images of a model of a house


10. Trigonometry 

10.1. Trigonometric functions
      Introduction of the trigonometric functions
      Definition of the trigonometric functions for arbitrary angles
      Properties of the trigonometric functions
      Working with trigonometric tables
      The addition theorems
      Consequences of the addition theorems
10.2. Trigonometric equations
      Pure trigonometric equations
      Mixed trigonometric equations


10.1. Trigonometric functions 


Trigonometry is the study of angle measurement. This, however, does not mean the elementary 
angle measurement of plane geometry, in which the magnitude of the angle is read off on a pro- 
tractor, but calculation with special functions that depend on angles and are called trigonometric 
functions because of their use in trigonometry (the study of triangle measurement and calculation). 


Introduction of the trigonometric functions 


Sine. If a road rises uniformly through 3 m for every 100 m of its length, then the ratio of the
increase in height h to the length of road traversed s, namely 3/100, is a measure of the steepness
of the road (Fig.), that is, of the angle α between the road and the horizontal plane. The ratio h/s
is a function of the angle α, which is called the sine of the angle α, and is accordingly defined in
the first instance only for acute angles (Fig.).


10.1-1 Ascent on an inclined plane (drawing exaggerated) 




10.1-2 Sine and cosine of the angle α; |AB₂| = s₂, |AC₂| = e₂; h₁/s₁ = h₂/s₂; e₁/s₁ = e₂/s₂;
△AB₁C₁ is similar to △AB₂C₂; h/s = sin α, h = s sin α, e/s = cos α, e = s cos α

10.1-3 Graphical determination of the sine and cosine of an acute angle α from the ratios h/s and
e/s; for example, sin 40° ≈ 0.643, cos 40° ≈ 0.770


From a sufficiently large drawing on squared paper the value of h/s = sin α can be read off.
The graphical determination of the sine is particularly simple if the divisor s is a power of 10, for
example 10 cm (Fig.). For α = 40°, for example, one obtains h/s = 6.43 cm/10.00 cm = 0.643.
The accuracy of this method is not very high, but it can be increased by enlargement of the drawing.
For every acute angle α the sine function has a fixed value, which is always less than 1 and for
greater angles is greater than for smaller angles.


Cosine. On maps the projections e on the horizontal plane of the inclined segments s appear as map
distances (Fig.). The ratio e/s is also a function of the angle α, called the cosine: cos α = e/s,
e = s cos α. The values of the cosine function for an acute angle α decrease as the angle α
increases. One obtains graphically cos 40° ≈ 7.70 cm/10 cm = 0.770 (Fig. 10.1-3).
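The lengths read off the drawing can be compared with computed values. The sketch below assumes only the definitions h = s sin α and e = s cos α given above; the function name is illustrative. The computed horizontal leg for 40° comes out at about 7.66 cm, slightly below the 7.70 cm read off graphically, which fits the remark that the accuracy of the graphical method is not very high.

```python
import math

def legs_for_segment(alpha_deg, s=10.0):
    """Return (h, e): height gain and horizontal projection of a segment of length s."""
    alpha = math.radians(alpha_deg)
    return s * math.sin(alpha), s * math.cos(alpha)

h, e = legs_for_segment(40.0)          # s = 10 cm, as in the drawing
print(round(h, 2), round(e, 2))        # lengths in cm

# The road example: a rise of 3 m per 100 m of road gives sin(alpha) = 3/100.
road_angle_deg = math.degrees(math.asin(3 / 100))
print(round(road_angle_deg, 3))        # angle of the road in degrees
```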


10.1-4 Projection of an inclined line seg- 
ment s onto the plane of the map (plan) 


Tangent. The gradient of a road is characterized by the ratio h/e = tan α of the increase in height h
to the horizontal distance e, and as a function of the angle α it is called the tangent of α. A
gradient of 8% therefore means a difference in height of 8 m over a map distance of 100 m (see linear
functions).
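As a small illustration of the gradient, the following sketch converts a percentage gradient into the angle of inclination and into the length of road needed for a given height gain. Both function names are illustrative; only tan α = h/e is taken from the text.

```python
import math

def gradient_to_angle_deg(percent):
    """Angle of inclination for a gradient given in percent (tan(alpha) = percent/100)."""
    return math.degrees(math.atan(percent / 100.0))

def road_length(height_gain, percent):
    """Length s of road for a height gain h at the given gradient, via e = h / tan(alpha)."""
    e = height_gain / (percent / 100.0)
    return math.hypot(e, height_gain)

print(round(gradient_to_angle_deg(8), 3))   # angle in degrees for an 8% gradient
print(round(road_length(8.0, 8), 2))        # metres of road for 8 m of height gain
```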


Cotangent, secant and cosecant. Because there are six ratios, in general, between three distances,
three more relationships between the lengths s, h, e and the angle α can be defined. Of these the
cotangent is the reciprocal of the tangent and the remaining two, secant and cosecant, are used less
frequently, for example, in astronomy or in navigation.

sine:    sin α = h/s,     cosine:    cos α = e/s,
tangent: tan α = h/e,     cotangent: cot α = e/h,
secant:  sec α = s/e,     cosecant:  cosec α = s/h.

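In the right-angled triangle ABC with hypotenuse c (Fig.), the side a opposite the angle α and the
side b adjacent to the angle α are known as the opposite side and adjacent side, respectively.

10.1-5 Angle α in the right-angled triangle

The definitions of the six angle functions are then:

$$\sin\alpha = \frac{a}{c} = \frac{\text{opposite side}}{\text{hypotenuse}}, \qquad
\tan\alpha = \frac{a}{b} = \frac{\text{opposite side}}{\text{adjacent side}}, \qquad
\sec\alpha = \frac{c}{b} = \frac{\text{hypotenuse}}{\text{adjacent side}},$$

$$\cos\alpha = \frac{b}{c} = \frac{\text{adjacent side}}{\text{hypotenuse}}, \qquad
\cot\alpha = \frac{b}{a} = \frac{\text{adjacent side}}{\text{opposite side}}, \qquad
\operatorname{cosec}\alpha = \frac{c}{a} = \frac{\text{hypotenuse}}{\text{opposite side}}.$$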


Between these trigonometric functions several relationships hold, which can easily be verified in
the right-angled triangle (Fig.), but hold quite generally for an arbitrary angle α:
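The relations themselves are not legible at this point. Assuming they are the standard ones (sin² α + cos² α = 1, tan α · cot α = 1, 1 + tan² α = 1/cos² α, 1 + cot² α = 1/sin² α), a numerical check for angles in any quadrant might look as follows; the function name and tolerance are illustrative.

```python
import math

def basic_relations_hold(alpha, rel_tol=1e-9):
    """Check the assumed standard relations for an angle that is not a multiple of pi/2."""
    s, c = math.sin(alpha), math.cos(alpha)
    tan, cot = s / c, c / s
    checks = [
        (s * s + c * c, 1.0),               # sin^2 + cos^2 = 1
        (tan * cot, 1.0),                   # tan * cot = 1
        (1.0 + tan * tan, 1.0 / (c * c)),   # 1 + tan^2 = 1/cos^2
        (1.0 + cot * cot, 1.0 / (s * s)),   # 1 + cot^2 = 1/sin^2
    ]
    return all(math.isclose(lhs, rhs, rel_tol=rel_tol) for lhs, rhs in checks)

print(all(basic_relations_hold(a) for a in (0.3, 1.0, 2.5, 4.0, -0.7)))
```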


The angle 45° occurs in a square of side 1 with the diagonal d = √2, and the angles 30° and 60° in an
equilateral triangle of side 1 with the altitude h = ½√3 (Fig.). For the four trigonometric functions
commonly used one obtains the values given in the table. These values are also shown calculated to
four decimal places.

10.1-6 Square of side 1
10.1-7 Equilateral triangle of side 1

Function    φ = 30°            φ = 45°            φ = 60°
sin φ       1/2  = 0.5000      ½√2 ≈ 0.7071       ½√3 ≈ 0.8660
cos φ       ½√3 ≈ 0.8660       ½√2 ≈ 0.7071       1/2  = 0.5000
tan φ       ⅓√3 ≈ 0.5774       1    = 1.0000      √3  ≈ 1.7321
cot φ       √3  ≈ 1.7321       1    = 1.0000      ⅓√3 ≈ 0.5774

A few of these values are rational and the remaining ones are irrational, but algebraic. Using
properties derived from the addition theorems, one can obtain from these by algebraic operations the
values of the trigonometric functions for φ/2, φ/4, ... and for 2φ, 3φ, 4φ, 5φ, ... In general, the
values of the trigonometric functions are transcendental numbers, which can be calculated from
infinite series to any desired degree of accuracy.
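The text does not state which infinite series is meant. As an assumption, the sketch below uses the usual power series sin x = x − x³/3! + x⁵/5! − ... (x in radians) and compares the result with the exact values for 30° and 60°; the function name and the tolerance are illustrative.

```python
import math

def sin_series(x, tol=1e-15):
    """Sum x - x^3/3! + x^5/5! - ... until the next term is smaller than tol."""
    x = math.fmod(x, 2.0 * math.pi)   # use the periodicity to keep x small
    term, total, n = x, 0.0, 0
    while abs(term) > tol:
        total += term
        n += 1
        # Each term follows from the previous one: multiply by -x^2 / ((2n)(2n+1)).
        term *= -x * x / ((2 * n) * (2 * n + 1))
    return total

print(sin_series(math.radians(30)), 0.5)
print(sin_series(math.radians(60)), math.sqrt(3) / 2)
```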


Definition of the trigonometric functions for arbitrary angles 


The definition of the trigonometric functions sine, cosine, tangent and cotangent for angles of
arbitrary magnitude, and not only for acute angles, is based on the consideration of a Cartesian
coordinate system (Fig.), usually a left-handed system, in which an anticlockwise rotation is regarded
as a rotation in the positive sense.

In mining geometry and in geophysics right-handed systems are often used, whose x-axis points
to the North and y-axis to the East and in which the positive sense of rotation is clockwise. Some
possible coordinate systems with rectangular axes are shown in the figure.

10.1-8 Coordinate systems with rectangular axes


Definition on the unit circle. In a plane Cartesian coordinate system an angle φ can run through all
four quadrants. Its magnitude can be measured in degrees, in new degrees or grades, or in radians
(see Chapter 7.). Its moving arm cuts the circle with radius r = 1 and centre at the origin O, the
so-called unit circle, in a point B_i (Fig.). For the intersection B₀ of the x-axis with the unit
circle, φ has the value zero. During one complete revolution of the moving arm of φ about the origin,
φ runs through all values from 0° to 360°, or 2π. The relations to be derived also hold for angles
greater than 2π, since for them the point B_i assumes the same positions as for angles between 0 and
2π. The position of the points B_i, for example, B₁, B₂ or B₃, is determined by its coordinates. The
abscissa is the orthogonal projection of the particular radius r = 1 on the x-axis, and the ordinate
is the orthogonal projection of this radius on the y-axis. For the position B₃, for example, their
signed numerical values are both negative, that is, OC₃ is in the direction opposite to the x-axis
and C₃B₃ is opposite to the y-axis (Fig.). In the first quadrant the definitions already given of
sine, cosine, tangent and cotangent, say from the triangle OC₁B₁, are valid. It is agreed that the
same definitions shall remain valid for all quadrants, that is, for all positions of the point B_i:

10.1-9 Definition of the trigonometric functions on a circle of radius r = 1

$$\sin\varphi = \frac{\text{ordinate}}{\text{radius}}, \qquad
\cos\varphi = \frac{\text{abscissa}}{\text{radius}}, \qquad
\tan\varphi = \frac{\text{ordinate}}{\text{abscissa}}, \qquad
\cot\varphi = \frac{\text{abscissa}}{\text{ordinate}}.$$

Table of signs

Function    I    II   III   IV
sin φ       +    +    −     −
cos φ       +    −    −     +
tan φ       +    −    +     −
cot φ       +    −    +     −

In these definitions the abscissae and ordinates have different signs in different quadrants, but the
radius is always positive. Thus, in the figure sin φ₂, tan φ₃, cot φ₃ and cos φ₄ are positive, but
cos φ₂, tan φ₂, cot φ₂, sin φ₃, cos φ₃, sin φ₄, tan φ₄ and cot φ₄ are negative. The table shows the
signs of the four trigonometric functions in all the quadrants.

The procedure described for extending the domain of validity of a definition to a new region
(the quadrants I, II, III, IV), in such a way that the relations holding in the original domain of
definition (quadrant I) remain valid, is often used in mathematics (the principle of permanence).
In particular, all the relations between the trigonometric functions that were found when they
were first introduced now hold for all values of the angle φ.
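The signs in the four quadrants follow directly from the coordinate definitions. The following sketch picks one representative point per quadrant (an arbitrary choice made for illustration) and applies the definitions ordinate/radius, abscissa/radius, ordinate/abscissa and abscissa/ordinate; all names are illustrative.

```python
import math

# One representative point (abscissa, ordinate) on a moving arm in each quadrant.
QUADRANT_POINTS = {"I": (1.0, 1.0), "II": (-1.0, 1.0), "III": (-1.0, -1.0), "IV": (1.0, -1.0)}

def sign(value):
    return "+" if value > 0 else "-"

def sign_table():
    """Signs of sin, cos, tan and cot in each quadrant, from the coordinate definitions."""
    table = {}
    for quadrant, (x, y) in QUADRANT_POINTS.items():
        r = math.hypot(x, y)   # the radius is always positive
        table[quadrant] = {
            "sin": sign(y / r),
            "cos": sign(x / r),
            "tan": sign(y / x),
            "cot": sign(x / y),
        }
    return table

for quadrant, signs in sign_table().items():
    print(quadrant, signs)
```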

The trigonometric functions can also be determined for the angles 0°, 90° (π/2), 180° (π), 270°
(3π/2) and 360° (2π), since the abscissae and ordinates for these angles have one of the values 0,
+1 or −1. Discontinuities occur for the tangent and cotangent functions if the denominator of
the fraction tends to zero. For example, if the angle φ approaches the value 90° from below, then

$$\lim_{\varphi_1 \to 90^\circ} \tan\varphi_1 = \lim_{|OC_1| \to 0} \frac{|C_1B_1|}{|OC_1|} = +\infty;$$

on the other hand, if φ approaches 90° from above, then

$$\lim_{\varphi_2 \to 90^\circ} \tan\varphi_2 = \lim_{|OC_2| \to 0} \frac{|C_2B_2|}{-|OC_2|} = -\infty.$$

Angle     0°     90°    180°   270°   360°
          0      π/2    π      3π/2   2π
sin φ     0      +1     0      −1     0
cos φ     +1     0      −1     0      +1
tan φ     0      ±∞     0      ±∞     0
cot φ     ±∞     0      ±∞     0      ±∞




As the increasing angle passes through the value φ = 90°, the value of the tangent function jumps
from +∞ to −∞. For φ = 90° itself the function is not defined. The notation for this situation
is abbreviated to tan 90° = ±∞. Similar jump discontinuities occur for the tangent function at
φ = 3π/2 and for the cotangent function at φ = 0 and at φ = π. Since the radius of the unit
circle has length r = +1, the sine and cosine are given by the ordinate and abscissa, respectively
(with the appropriate sign). The tangent and cotangent functions can also be expressed as the ratio
of two line segments whose denominator has the value 1. The value of the tangent can be read off
as the signed numerical value of the directed segment intercepted between the arms of the angle φ
on the tangent (x = 1) to the unit circle at the point B₀. For, by the intercept theorems (Fig.):


$$\tan\varphi_3 = \frac{\sin\varphi_3}{\cos\varphi_3} = \frac{C_3B_3}{OC_3} = \frac{B_0D_3}{OB_0} = m(B_0D_3), \qquad
\tan\varphi_4 = \frac{\sin\varphi_4}{\cos\varphi_4} = \frac{C_4B_4}{OC_4} = \frac{B_0D_4}{OB_0} = m(B_0D_4).$$

Similarly the value of the cotangent can be read off as the signed numerical value m of the directed
segment intercepted on the tangent to the unit circle at the point F (y = 1) by the positive y-axis
and the moving arm of the angle φ. Again, by the intercept theorems:

$$\cot\varphi_3 = \frac{\cos\varphi_3}{\sin\varphi_3} = \frac{OC_3}{C_3B_3} = \frac{FE_3}{OF} = m(FE_3), \qquad
\cot\varphi_4 = \frac{\cos\varphi_4}{\sin\varphi_4} = \frac{OC_4}{C_4B_4} = \frac{FE_4}{OF} = m(FE_4).$$

10.1-10 The trigonometric functions in the four quadrants


The terms tangent and cotangent can accordingly be visualized as the signed numerical values of
segments on the tangent (with point of contact at x = 1) and the cotangent (with point of contact
at y = 1). The values of the functions for the angles 0, π/2, π, 3π/2 and 2π can also be read off
from the unit circle.
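The reading of tangent and cotangent as intercepts, and the jump of the tangent at 90°, can be illustrated numerically. In this sketch the point B on the unit circle is obtained from the library sine and cosine, and the moving arm (extended through the origin if necessary) is intersected with the lines x = 1 and y = 1; the function names are illustrative.

```python
import math

def intercept_on_x_equals_1(phi):
    """Signed ordinate where the line through O and B(phi) meets the tangent x = 1."""
    bx, by = math.cos(phi), math.sin(phi)
    return by / bx   # scale (bx, by) so that the abscissa becomes 1

def intercept_on_y_equals_1(phi):
    """Signed abscissa where the line through O and B(phi) meets the tangent y = 1."""
    bx, by = math.cos(phi), math.sin(phi)
    return bx / by   # scale (bx, by) so that the ordinate becomes 1

# Third quadrant: the extended arm meets x = 1 above the x-axis, so the tangent is positive.
print(intercept_on_x_equals_1(math.radians(225)))

# Approaching 90 degrees from below and from above shows the jump from +infinity to -infinity.
just_below = intercept_on_x_equals_1(math.radians(89.999))
just_above = intercept_on_x_equals_1(math.radians(90.001))
print(just_below, just_above)
```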




Graph of the angle functions in the four quadrants. A clear picture of the shape of the graphs of the
trigonometric functions can be obtained by introducing a Cartesian coordinate system in which the
argument φ in radians is taken as abscissa and the values of the respective trigonometric functions
are taken as ordinates. The figure shows the pointwise construction of the graphs of the functions
sine and tangent for angles at intervals of 15° (π/12 or 16⅔ gon) in the quadrants I and II. In the
following figure the scale is reduced by half in order to show the graphs of the curves for all
quadrants.


Properties of the trigonometric functions


From the two figures showing the graphical representation of the trigonometric functions a number of
properties of these functions can be read off, whose validity can usually be proved from the unit
circle. The angles can have arbitrary positive values and also, as will be seen, arbitrary negative
values.


10.1-11 Construction of the graphs of the trigonometric functions


Periodicity and range of values of the trigonometric functions. The trigonometric functions are
periodic. The sine and the cosine functions have the period 2π (360°); the tangent and the cotangent
functions have the period π (180°).

In the unit circle the free arms of the angles (φ + 2nπ) all have the same position, and thus their
trigonometric functions have the same values.

The unit circle representation shows further that the free arms of all the angles (φ + nπ) cut each
of the tangents to the unit circle at the points where x = 1, y = 1, respectively, at a single point,
and that the tangent and cotangent functions of these angles therefore all have the same value.

10.1-12 Graphical representation of the trigonometric functions in the four quadrants (arguments in
radian measure)




The functions sine and cosine take all their function values in a subinterval, for example, for
0 ≤ φ ≤ 2π, and the functions tangent and cotangent take all their values in a smaller interval,
for example, for 0 ≤ φ ≤ π. In such an interval the functions sin φ and cos φ oscillate between the
values −1 and +1; on the other hand, the functions tan φ and cot φ take all values between −∞
and +∞.
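Periodicity and range can be spot-checked on a set of sample angles. This is only a numerical illustration, not a proof; the helper name, the sample grid and the tolerances are arbitrary choices.

```python
import math

def has_period(f, period, samples):
    """True if f(x + period) agrees with f(x) at every sample point."""
    return all(
        math.isclose(f(x), f(x + period), rel_tol=1e-9, abs_tol=1e-9)
        for x in samples
    )

samples = [0.1 + 0.37 * k for k in range(20)]   # avoids the poles of the tangent

print(has_period(math.sin, 2 * math.pi, samples))   # sine repeats after 2*pi
print(has_period(math.tan, math.pi, samples))       # tangent already repeats after pi
print(has_period(math.sin, math.pi, samples))       # sine does not repeat after pi
print(max(abs(math.sin(x)) for x in samples) <= 1.0)
```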


Slope of the tangent. According to the rules of the differential calculus the derivative of a function
at each point of its graph gives the slope of the tangent to the curve at this point.

$$\frac{d\sin\varphi}{d\varphi} = \cos\varphi, \qquad
\frac{d\cos\varphi}{d\varphi} = -\sin\varphi, \qquad
\frac{d\tan\varphi}{d\varphi} = \frac{1}{\cos^2\varphi}, \qquad
\frac{d\cot\varphi}{d\varphi} = -\frac{1}{\sin^2\varphi}.$$

Thus, the sine and tangent curves cut the φ-axis at the point φ = 0 at an angle of 45°, since

$$\left[\frac{d\sin\varphi}{d\varphi}\right]_{\varphi=0} = \left[\frac{d\tan\varphi}{d\varphi}\right]_{\varphi=0} = +1.$$

But as the angle increases, the sine curve moves away below, and the tangent curve above, this common
tangent, because cos φ is decreasing for this angle. At the point φ = π the two curves are
perpendicular to one another. For φ = π/2 the cosine and cotangent curves have a common tangent which
makes an angle −45° with the positive φ-axis;

$$\left[\frac{d\cos\varphi}{d\varphi}\right]_{\varphi=\pi/2} = \left[\frac{d\cot\varphi}{d\varphi}\right]_{\varphi=\pi/2} = -1.$$

As the angle increases, the cosine curve moves away above, and the cotangent curve below, this common
tangent. For φ = 3π/2 these curves intersect at a right angle. The sine curve has a tangent parallel
to the φ-axis at φ = π/2 and at φ = 3π/2, and the cosine curve at φ = 0 and φ = π.

The progress of the tangents to the sine and tangent curves in quadrant I establishes the estimate
sin φ < arc φ < tan φ, where arc φ is the radian measure of the angle φ. In the graph arc φ is
represented by a straight line at an angle of 45° to the positive φ-axis.


Even and odd functions. The function cos φ is even, because for positive and negative angles φ
it has the same values; f(−φ) = f(φ) (Fig.). The functions sine, tangent, and cotangent, however,
are odd functions. Their curves are symmetrical about the origin (Fig.), because f(−φ) = −f(φ),
so that the function values for positive and negative angles have the same absolute value but opposite
signs. The validity of these much used relationships can be seen from the unit circle.

10.1-13 Graphical representation of the even function y = cos φ = cos (−φ)

10.1-14 Graphical representation of the odd function y = sin φ = −sin (−φ)
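The slopes quoted for the curves, the estimate sin φ < arc φ < tan φ in quadrant I, and the even and odd symmetry can all be spot-checked numerically. The sketch approximates derivatives by a central difference, which is an illustrative choice and not the method of the text; all names and step sizes are assumptions.

```python
import math

def slope(f, x, h=1e-6):
    """Central-difference approximation of the slope of the tangent to f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def cot(x):
    return math.cos(x) / math.sin(x)

# Common tangent of sine and tangent at 0, and of cosine and cotangent at pi/2.
print(slope(math.sin, 0.0), slope(math.tan, 0.0))
print(slope(math.cos, math.pi / 2), slope(cot, math.pi / 2))

# Perpendicular curves at pi: the product of the two slopes is -1.
print(slope(math.sin, math.pi) * slope(math.tan, math.pi))

# Estimate in quadrant I and the even/odd symmetry at sample angles.
first_quadrant = [0.1 * k for k in range(1, 15)]
estimate_holds = all(math.sin(p) < p < math.tan(p) for p in first_quadrant)
cosine_even = all(math.isclose(math.cos(-p), math.cos(p)) for p in first_quadrant)
sine_odd = all(math.isclose(math.sin(-p), -math.sin(p)) for p in first_quadrant)
print(estimate_holds, cosine_even, sine_odd)
```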


Because of these properties it is sufficient to know the function values in a subinterval of half the
period length in order to give the values for the whole interval. For example, for angles from 0 to π
the cosine runs through the same values as for the angles from 2π to π; in symbols,
cos φ = cos (2π − φ), φ ≤ π. Thus, it assumes all its values between 0 and π. Similarly for the three
odd trigonometric functions the values in a subinterval are sufficient: from 0 to π for sin φ, and
from 0 to π/2 for tan φ and cot φ.




From the relations between a function and its cofunction (Fig.) and those between the quadrants it
suffices to know the values of sin φ for 0 ≤ φ ≤ π/2 in order to calculate the values of all the other
trigonometric functions. To simplify the calculations it is of practical convenience to give the
values of the tangent together with those of the sine for 0 ≤ φ ≤ π/2.

10.1-15 Sine, cosine, tangent and cotangent of a negative angle


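Relations between the trigonometric functions for the same angle. From the relations found in the
introduction, each trigonometric function can be expressed in terms of every other one for the
same argument. For example, if one wishes to express sin φ or cot φ in terms of cos φ, one obtains:

1. sin φ = ±√(1 − cos² φ); 2. cot φ = cos φ/sin φ = cos φ/[±√(1 − cos² φ)].

The following table contains all the relations:

           in terms of sin φ        in terms of cos φ        in terms of tan φ        in terms of cot φ
sin φ =    sin φ                    ±√(1 − cos² φ)           ±tan φ/√(1 + tan² φ)     ±1/√(1 + cot² φ)
cos φ =    ±√(1 − sin² φ)           cos φ                    ±1/√(1 + tan² φ)         ±cot φ/√(1 + cot² φ)
tan φ =    ±sin φ/√(1 − sin² φ)     ±√(1 − cos² φ)/cos φ     tan φ                    1/cot φ
cot φ =    ±√(1 − sin² φ)/sin φ     ±cos φ/√(1 − cos² φ)     1/tan φ                  cot φ

For angles in the first quadrant the positive signs of the roots are valid. In the remaining quadrants
the signs of the roots are determined from the table of signs or from the unit circle.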
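How the sign of the root is fixed by the quadrant can be sketched as follows. The function takes the value of cos φ and the quadrant number and returns sin φ, tan φ and cot φ; its name and the convention of numbering the quadrants 1 to 4 are illustrative.

```python
import math

def from_cosine(cos_phi, quadrant):
    """Return (sin, tan, cot) of an angle from its cosine and its quadrant (1 to 4).

    The root sqrt(1 - cos^2) is taken positive in quadrants 1 and 2, where the
    sine is positive, and negative in quadrants 3 and 4.
    """
    root = math.sqrt(1.0 - cos_phi ** 2)
    sin_phi = root if quadrant in (1, 2) else -root
    return sin_phi, sin_phi / cos_phi, cos_phi / sin_phi

phi = math.radians(200)   # an angle in the third quadrant
print(from_cosine(math.cos(phi), 3))
print(math.sin(phi), math.tan(phi), 1.0 / math.tan(phi))
```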


Example: In the third quadrant cos φ and sin φ are negative, but tan φ and cot φ are positive.
Hence for π < φ < 3π/2 the second line of the table is

cos φ = −√(1 − sin² φ) = −1/√(1 + tan² φ) = −cot φ/√(1 + cot² φ).


Function and cofunction. The word cosine means complementary sine, that is, the sine of the
complementary angle. Similarly cotangent and cosecant, respectively, mean tangent and secant of
the complementary angle. The complement β of a given acute angle α is such that α + β is a right
angle. The right angle can be measured in degrees, gons, or radians. Thus, if α and β are in radians,
then α + β = π/2. The expressions cosine, cotangent and cosecant therefore imply the mathematical
statements

cos α = sin (π/2 − α) = sin β,
cot α = tan (π/2 − α) = tan β,
cosec α = sec (π/2 − α) = sec β.

In a right-angled triangle in which a and b are the sides opposite the angles α and β, respectively,
it is immediately obvious that

sin α = a/c = cos β, cos α = b/c = sin β, tan α = a/b = cot β, cot α = b/a = tan β.


In addition one sees that sin α = cos (π/2 − α), tan α = cot (π/2 − α), sec α = cosec (π/2 − α),
so that the sine function is the cofunction of the cosine, and the tangent and secant, respectively,
are the cofunctions of the cotangent and cosecant.


Each trigonometric function assumes for the argument increasing from 0 to π/2 the same values
as its cofunction for the argument decreasing from π/2 to 0.




Quadrant relations. Between trigonometric functions whose arguments differ by a right angle or a
multiple of a right angle, certain relationships hold, the so-called quadrant relations.

The quarter turn theorem. Passage from one quadrant into the next follows from the rotation of the
figure through a quarter of a complete rotation, or from the addition of a right angle to the
argument φ (Fig.). In this rotation the abscissa and cosine value, respectively, become the ordinate
and sine value of the same absolute value, and conversely: |sin (π/2 + φ)| = |cos φ|,
|cos (π/2 + φ)| = |sin φ|. A positive cos φ lies along the positive x-axis and thus, as
sin (π/2 + φ), lies along the positive y-axis after the rotation, hence is positive. Similarly a
negative cos φ on the negative x-axis, after a rotation through a right angle, goes into a negative
value sin (π/2 + φ) on the negative y-axis. Hence cos φ = sin (π/2 + φ). On the other hand, a
positive value sin φ goes from the positive y-axis to the negative x-axis under the rotation, and a
negative sin φ from the negative y-axis to the positive x-axis. Consequently,
sin φ = −cos (π/2 + φ).

10.1-16 The quarter turn theorem: sin (π/2 + φ) = cos φ, cos (π/2 + φ) = −sin φ


In the coordinate system the radius of the unit circle has the same position for a positive angle φ
as for a negative angle −ψ if φ + ψ = 4π/2 = 2π. Hence the trigonometric functions also have the
same value if φ is replaced by −ψ: sin (π/2 − ψ) = cos (−ψ) = cos ψ, cos (π/2 − ψ) = −sin (−ψ)
= sin ψ. Because tan φ = sin φ/cos φ and cot φ = cos φ/sin φ, it also follows that tan (π/2 + φ)
= −cot φ, cot (π/2 + φ) = −tan φ, tan (π/2 − ψ) = cot ψ, cot (π/2 − ψ) = tan ψ.

In this way the relationships between function and cofunction have been generalized for an
arbitrary angle ψ. Geometrically the equation sin (π/2 + β) = cos β means that the curve of the sine
function may be regarded as that of the cosine curve translated by 90° = π/2.

Further quadrant relations. Passage to the next quadrant but one is accomplished by the addition
of 2π/2, and to the next quadrant but two by the addition of 3π/2. The relations that hold in these
cases are obtained by substituting π/2 + ψ or 2π/2 + ψ, respectively, for φ in the quarter turn
theorem.


Examples: 1. tan (2π/2 + ψ) = tan (π/2 + π/2 + ψ) = −cot (π/2 + ψ) = +tan ψ.
2. cos (3π/2 + ψ) = cos (π/2 + 2π/2 + ψ) = −sin (2π/2 + ψ)
= −sin (π/2 + π/2 + ψ) = −cos (π/2 + ψ) = +sin ψ.
One can also apply the formulae of the quarter turn theorem to a multiple of a right angle minus
an arbitrary angle φ.
Example: sin (2π/2 − φ) = sin (π/2 + π/2 − φ) = cos (π/2 − φ) = sin φ.


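Summary of the quadrant relations. By the quadrant relations a trigonometric function of an angle
(nπ/2 ± θ) for n = 1, 2, 3, 4 can be expressed as a function of the angle θ, where θ is an arbitrary
angle. Putting nπ/2 ± θ = Φ, the quadrant relations can be collected as in the following table:

Φ =        π/2 − θ   π/2 + θ   2π/2 − θ   2π/2 + θ   3π/2 − θ   3π/2 + θ   4π/2 − θ   4π/2 + θ
sin Φ      +cos θ    +cos θ    +sin θ     −sin θ     −cos θ     −cos θ     −sin θ     +sin θ
cos Φ      +sin θ    −sin θ    −cos θ     −cos θ     −sin θ     +sin θ     +cos θ     +cos θ
tan Φ      +cot θ    −cot θ    −tan θ     +tan θ     +cot θ     −cot θ     −tan θ     +tan θ
cot Φ      +tan θ    −tan θ    −cot θ     +cot θ     +tan θ     −tan θ     −cot θ     +cot θ
Quadrant   I         II        II         III        III        IV         IV         I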


The table shows the following rules:
1. when the angle θ is added to, or subtracted from, an odd multiple of π/2, that is, for
Φ = π/2 ± θ or Φ = 3π/2 ± θ, the cofunction of θ occurs in the expression for the required function
of Φ;
2. when the angle θ is added to, or subtracted from, an even multiple of π/2, that is, for
Φ = π ± θ or Φ = 2π ± θ, the same function of θ occurs in the expression for the required function
of Φ;




3. if θ is taken to be an acute angle, then the free arm of the angle Φ is in the quadrant given in
the last line of the table. The sign of the function of Φ is determined for this quadrant from the
unit circle.


By means of this table the trigonometric functions of an arbitrary angle Φ can be related to
functions of an acute angle θ. Since in practical problems, especially if the angles are given in
degrees, minutes and seconds, only positive increments are used, the columns Φ = π/2 + θ,
Φ = 2π/2 + θ and Φ = 3π/2 + θ are specially marked. From the table, for example,
sin (2π/2 + θ) = −sin θ, tan (π/2 + θ) = −cot θ, cot (3π/2 + θ) = −tan θ.
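The reduction of the sine of an arbitrary angle to a function of an acute angle, using positive increments only, can be sketched in a few lines. Angles are taken in degrees; the function names and the returned tuple format are illustrative choices.

```python
import math

def reduce_sine(phi_deg):
    """Write sin(Phi) as sign * f(theta), with Phi = n * 90 degrees + theta, 0 <= theta < 90.

    Returns (sign, name, theta), where name is "sin" for even n and "cos" (the
    cofunction) for odd n; the sign is that of the sine in the quadrant of Phi.
    """
    n, theta = divmod(phi_deg % 360, 90)
    n = int(n)
    name = "cos" if n % 2 else "sin"
    sign = 1 if n in (0, 1) else -1
    return sign, name, theta

def sine_via_acute_angle(phi_deg):
    """Evaluate sin(Phi) using only a function value of the acute angle theta."""
    sign, name, theta = reduce_sine(phi_deg)
    func = math.sin if name == "sin" else math.cos
    return sign * func(math.radians(theta))

print(reduce_sine(200))            # 200 = 2 * 90 + 20, so sin 200 = -sin 20
print(sine_via_acute_angle(310), math.sin(math.radians(310)))
```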


Inverse functions. If a directed segment OF₁ (or OF₃) of length less than unity is marked off from
the origin on the y-axis (Fig.), then a line parallel to the x-axis through the point F₁ (or F₃)
cuts the unit circle in two points B₁ and B₂ (or B₃ and B₄). The figure shows that OB₁ and OB₂,
respectively, are the free arms of two angles φ₁ and φ₂ whose sine functions are given by the signed
numerical value of the segment OF₁, sin φ₁ = sin φ₂ = m(OF₁).

Similarly OB₃ and OB₄, respectively, are the free arms of angles φ₃ and φ₄, where
sin φ₃ = sin φ₄ = m(OF₃). Thus, every number y with |y| ≤ 1 can be expressed as the value of a sine
function, and if |y| < 1, there are always two angles that satisfy the equation y = sin ψ. Of course,
it is clear from the figure that complete rotations of the free arm cannot be distinguished; thus,
there are actually infinitely many solutions

ψ₁ = φ₁ ± 2nπ and ψ₂ = φ₂ ± 2nπ for y > 0,
ψ₁ = φ₃ ± 2nπ and ψ₂ = φ₄ ± 2nπ for y < 0, n = 0, 1, 2, ...


From the symmetry about the y-axis, these angles satisfy the equations
φ₁ + φ₂ = π, φ₃ + φ₄ = 3π.


Similarly, by constructing parallels to the y-axis through the end points C₁,₄ and C₂,₃ of directed
segments on the x-axis of length x with |x| < 1, one obtains two angles φ₁ and φ₄, φ₂ and φ₃,
respectively, as solutions of the equation cos ψ = x (Fig.).

ψ₁ = φ₁ ± 2nπ, ψ₂ = φ₄ ± 2nπ, n = 0, 1, 2, ...
or ψ₁ = φ₂ ± 2nπ, ψ₂ = φ₃ ± 2nπ, n = 0, 1, 2, ...


These two solutions satisfy the condition φ₁ + φ₄ = 2π, or φ₂ + φ₃ = 2π.
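The two solutions in one revolution, and the sums stated for them, can be reproduced numerically. This sketch relies on the library functions `math.asin` and `math.acos`, which return one particular angle each; the function names `solve_sin` and `solve_cos` are illustrative.

```python
import math

TWO_PI = 2.0 * math.pi

def solve_sin(y):
    """The two angles in [0, 2*pi) with sin(psi) = y, for |y| < 1."""
    p = math.asin(y)   # one solution, between -pi/2 and pi/2
    return sorted((p % TWO_PI, (math.pi - p) % TWO_PI))

def solve_cos(x):
    """The two angles in [0, 2*pi) with cos(psi) = x, for |x| < 1."""
    p = math.acos(x)   # one solution, between 0 and pi
    return sorted((p, (TWO_PI - p) % TWO_PI))

print(sum(solve_sin(0.5)) / math.pi)    # the two angles add up to pi for y > 0
print(sum(solve_sin(-0.5)) / math.pi)   # and to 3*pi for y < 0
print(sum(solve_cos(0.3)) / math.pi)    # for the cosine the sum is 2*pi
```

All further solutions follow by adding or subtracting multiples of 2π to these two angles.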


From the value of the cosine or of the sine one obtains two values of the angle ψ in the interval
0 ≤ ψ < 2π; from y = sin ψ, for example, the values φ₁ and φ₂. From the unit circle one sees that
the angle ψ is uniquely determined by one of these functions and the sign of the other. From the
relationship tan ψ/2 = sin ψ/(1 + cos ψ), which will be derived in connection with the addition
theorem, it follows that the value of the tangent of the half angle is sufficient to determine
uniquely the angle ψ, 0 < ψ < 2π.

The problem of finding angles for which the tangent function or cotangent function assume the given
values y or x, respectively, can likewise be solved geometrically on the unit circle (Fig.). The
directed segment B₀D₁,₃ corresponding to the number y is marked off on the tangent at B₀ (x = 1) to
the unit circle, and the line joining the origin to its end-point D₁,₃ meets the unit circle in the
points B₁ and B₃: one sees that ψ₁ = φ₁ ± nπ, n = 0, 1, 2, ..., are solutions of the equation
tan ψ = y. Similarly the directed segment FE₂,₄ corresponding to the number x is marked off on the
tangent at F (y = 1) to the unit circle, and the line joining the origin to its end-point E₂,₄
meets the unit circle in the points B₂ and B₄; ψ₂ = φ₂ ± nπ, n = 0, 1, 2, ..., are solutions of the
equation cot ψ = x.

10.1-17 Construction of the angle for two given values of the sine

10.1-18 Construction of the angle for two given values of the cosine

10.1-19 Construction of the angle for one given value of the tangent and of the cotangent

A function that determines the angle, measured in radians, for which a trigonometric function
assumes a given value is called a circular or inverse trigonometric function (see inverse functions
in Chapter 5.).

After the usual interchange of x and y in the derivation of the inverse function, y represents
the angle (in radians) whose sine has the value x. The Latin phrase arcus cuius sinus x est (the arc
whose sine is x) has led to the symbol arcsin x. The notation for these functions is collected together
in the table. By taking the mirror image of the graph of a trigonometric function in the angle bisector
of the first quadrant one obtains the graph of the inverse function (Fig.). There are different ranges
of values of the inverse function corresponding to the intervals in which the function is monotonic.
The principal values are denoted by Arcsin x, Arccos x, Arctan x and Arccot x, where

−π/2 ≤ Arcsin x ≤ +π/2, 0 ≤ Arccos x ≤ +π, −π/2 < Arctan x < +π/2, 0 < Arccot x < +π.

10.1-20 Graphical representation of the functions y = arctan x and y = arccot x

10.1-21 Graphical representation of the functions y = arcsin x and y = arccos x


In using the notation y = sin⁻¹ x care must be taken not to confuse it with the reciprocals of the
trigonometrical functions, for example (cos x)⁻¹ = 1/cos x = sec x.
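The two-angle construction on the unit circle can be imitated numerically. The following is only a minimal sketch: the function names are illustrative and not taken from the text, and Python's `math.asin` and `math.acos` are assumed to play the role of the principal values Arcsin and Arccos.

```python
import math

TWO_PI = 2 * math.pi

def angles_with_sine(y):
    """Angles phi in [0, 2*pi) with sin(phi) = y, assuming |y| <= 1."""
    principal = math.asin(y)  # principal value in [-pi/2, +pi/2]
    # The second angle is the mirror image in the y-axis: pi - principal.
    return sorted({principal % TWO_PI, (math.pi - principal) % TWO_PI})

def angles_with_cosine(x):
    """Angles phi in [0, 2*pi) with cos(phi) = x, assuming |x| <= 1."""
    principal = math.acos(x)  # principal value in [0, pi]
    # The second angle is the mirror image in the x-axis: -principal.
    return sorted({principal % TWO_PI, (-principal) % TWO_PI})

# The two angles whose sine is -0.7777, converted to degrees
print([round(math.degrees(a), 2) for a in angles_with_sine(-0.7777)])
```

All further solutions follow by adding integral multiples of 2π to the returned values.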


Working with trigonometric tables 


Before computers became widely used, numerical tables were essential for trigonometric
calculations, and even today it may be useful to know how to use them. The principle of
the arrangement and use of such tables is the same for all methods of subdividing the angle. Since
decimal subdivision of degrees is now widely used, the explanation is based on it.


Looking up the angle functions. In the table reproduced here the values of the function represented
are found at the intersection of a horizontal row with a vertical column (Fig.). In the figure, for
example, where the row for 5° meets the column for 0.4° one finds that sin 5.4° = 0.0941,
as the heading indicates. Since the values of the sine are all less than 1 (except for the single value
1 = sin 90°), frequently only the places after the decimal point are given. All tables of
trigonometric functions have a double entry; they can be read from the left and above or from the
right and below. This means that the rows can be counted from top to bottom and the columns from left
to right, or conversely, the rows from bottom to top and the columns from right to left. In the figure
the value sin 5.4° = 0.0941 appears in row 5 and column 0.4°, counted from the left and above. From
the right and below the same value appears in row 84 and column 0.6°. But because
5.4° + 84.6° = 90°, this gives the value of the cofunction, that is, cos 84.6° = 0.0941. In this way
the values of a function and its cofunction are contained in one and the same table, the sine and
tangent values from the left and above, and the cosine and cotangent values from the right and below.
By the quadrant relations only values in the first quadrant need be given; the sign for the required
function is determined from the table of signs or from the unit circle.

10.1-22 Looking up sin α when α is given


Because the use of calculating machines is growing, the importance of tables of the natural values
of the angle functions is increasing. Formerly logarithmic trigonometric tables were preferred for
accurate calculations. In some of these the characteristic is increased by 10. For example,
lg sin 5.4° = 0.9736 − 2 = 8.9736 − 10, and such a table gives 8.9736. Since there are no logarithms
of negative numbers, these tables contain only the logarithms of the absolute values of the
trigonometric functions. On the other hand, their signs are decisive for the magnitude of the angles
to be calculated; the signs can then be indicated by a p (positive) or n (negative) placed after
the logarithm (see examples of calculations with logarithms in Chapter 2.).


Example 1: For the angle φ₁ = 56.6° one obtains:
sin 56.6° = 0.8348; cos 56.6° = 0.5505;
tan 56.6° = 1.517; cot 56.6° = 0.6594;
lg sin 56.6° = 9.9216; lg cos 56.6° = 9.7407;
lg tan 56.6° = 0.1809; lg cot 56.6° = 9.8191.

Example 2: For the angle φ₂ = 113.4° (Fig.) one obtains:
sin 113.4° = sin (90° + 23.4°) = +cos 23.4° = +0.9178;
cos 113.4° = cos (90° + 23.4°) = −sin 23.4° = −0.3971;
tan 113.4° = −cot 23.4° = −2.311;
cot 113.4° = −tan 23.4° = −0.4327;
lg sin 113.4° = 9.9627p; lg |cos 113.4°| = 9.5990n;
lg |tan 113.4°| = 0.3638n; lg |cot 113.4°| = 9.6362n.

10.1-23 Values of the trigonometric functions for the angle φ₂ = 113.4°

Example 3: For the angle φ₃ = 244.8° (Fig.) one obtains:
sin 244.8° = sin (180° + 64.8°) = −sin 64.8° = −0.9048;
cos 244.8° = cos (180° + 64.8°) = −cos 64.8° = −0.4258;
tan 244.8° = tan 64.8° = +2.125;
cot 244.8° = cot 64.8° = +0.4706;
lg |sin 244.8°| = 9.9566n; lg |cos 244.8°| = 9.6292n;
lg tan 244.8° = 0.3274p; lg cot 244.8° = 9.6726p.

10.1-24 Values of the trigonometric functions for the angle φ₃ = 244.8°

Example 4: For the angle φ₄ = 320.3° (Fig.) one obtains:
sin 320.3° = sin (270° + 50.3°) = −cos 50.3° = −0.6388;
cos 320.3° = cos (270° + 50.3°) = +sin 50.3° = +0.7694;
tan 320.3° = −cot 50.3° = −0.8302;
cot 320.3° = −tan 50.3° = −1.205;
lg |sin 320.3°| = 9.8053n; lg cos 320.3° = 9.8862p;
lg |tan 320.3°| = 9.9192n; lg |cot 320.3°| = 0.0808n.

10.1-25 Values of the trigonometric functions for the angle φ₄ = 320.3°


If a further decimal digit z in the value of the angle is known, then the required function value
lies between two values t₁ and t₂ given in the table, and by linear interpolation on the table difference
d = t₂ − t₁ one obtains the correction c = d · z/10 (see Chapter 2.).

The value of sin 5.47° lies between 0.0941 and 0.0958. The table difference d is 17 · 10⁻⁴ = 0.0017
and the next digit z is 7. For the correction one obtains c = (17 · 7/10) · 10⁻⁴ = 11.9 · 10⁻⁴ ≈ 12 · 10⁻⁴
= 0.0012; that is, sin 5.47° = 0.0953. The value of cos 56.64° lies between 0.5505 and 0.5490;
here d = −15 · 10⁻⁴, z = 4, c = (−15 · 4/10) · 10⁻⁴ = −6.0 · 10⁻⁴, that is, cos 56.64° = 0.5499.


Examples: 1. tan 113.43° = −cot 23.43° = −(2.311 − (11 · 3/10) · 10⁻³) = −2.308.
2. lg |cos 244.86°| = lg cos 64.86° n = (9.6292 − (16 · 6/10) · 10⁻⁴) n = 9.6282n.
3. lg |sin 320.39°| = lg cos 50.39° n = (9.8053 − (9 · 9/10) · 10⁻⁴) n = 9.8045n.
4. cot 81.36° = 0.1519, because cot 81.3° = 0.1530 and cot 81.4° = 0.1512.
5. tan 62° 37′ = 1.931, because tan 62° 30′ = 1.921 and tan 62° 40′ = 1.935.
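The interpolation rule c = d · z/10 can be written out as a short calculation. This is a minimal sketch only; the function name `interpolate` is illustrative, and the tabulated values are the ones quoted in the text.

```python
def interpolate(t1, t2, z):
    """Linear interpolation between two neighbouring table values.

    t1, t2 -- tabulated values for consecutive table arguments
    z      -- the next decimal digit of the angle (0 to 9)
    """
    d = t2 - t1      # table difference
    c = d * z / 10   # correction
    return t1 + c

# sin 5.47 deg from sin 5.4 deg = 0.0941 and sin 5.5 deg = 0.0958
print(round(interpolate(0.0941, 0.0958, 7), 4))
# cos 56.64 deg from cos 56.6 deg = 0.5505 and cos 56.7 deg = 0.5490
print(round(interpolate(0.5505, 0.5490, 4), 4))
```

For a decreasing function such as the cosine the table difference d is negative, so the correction is subtracted automatically.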


Looking up the angle. If the given function value appears directly in the table, then the
determination of the angle is just a matter of reading off the numbers of the row and column that
intersect at this value. If the function value does not agree with any value in the table, then one
obtains the next decimal digit z of the angle from the table difference d and the correction difference c
between the function value and the nearest value in the table in the sense of increasing argument:
c/(d/10) = z or z = c · 10/d.


Example 1: The cosine value 0.3950 lies between cos 66.7° = 0.3955 and cos 66.8° = 0.3939;
one finds that d = −16 · 10⁻⁴, c = −5 · 10⁻⁴, so that z = (−5 · 10)/(−16) ≈ 3, or arccos 0.3950
= 66.73°. Because the inverse function is many-valued, φ = −66.73° = 293.27°; 66.73° + n · 360°
and 293.27° + n · 360°, n = 1, 2, ... are also valid solutions.

Example 2: What is the value of arcsin (−0.7777)? From the table of signs or from the
representation in the unit circle there exist, when the period is ignored, two solutions φ₁ and φ₂
in the third and fourth quadrants satisfying the equation φ₁ + φ₂ = 3π = 540°. From the quadrant
relations δ = φ₁ − π lies between 51.0° and 51.1°, since sin 51.0° = 0.7771 and sin 51.1° = 0.7782;
from d = 11 · 10⁻⁴, c = 6 · 10⁻⁴ it follows that z = (6 · 10)/11 ≈ 5, that is, δ = 51.05°,
φ₁ = 231.05° + n · 360°, φ₂ = 308.95° + n · 360°, n = 0, 1, 2, ...

Example 3: What is the value of arctan (−2.000)? The angle φ for which tan φ = −2.000
lies in the second quadrant and for δ = φ − π/2, cot δ = 2.000. As cot 26.5° = 2.006 and
cot 26.6° = 1.997 it follows that d = −9 · 10⁻³, c = −6 · 10⁻³, z = (−6 · 10)/(−9) ≈ 7, that is,
δ = 26.57°. Hence arctan (−2.000) = 116.57° + n · 180°, n = 0, 1, 2, ...

Example 4: Which angles φ satisfy the equation lg |cos φ| = 9.74435n? Because the value of
the cosine is negative, the free arms of the angles φ₂ and φ₃ lie in the 2nd and 3rd quadrants,
symmetrically with respect to the x-axis. By the quadrant relations δ = φ₃ − π is determined by
lg cos δ = 9.74435p. From a 5-figure table one finds that lg cos 56.28° = 9.74440 and
lg cos 56.29° = 9.74428; from d = −12 · 10⁻⁵, c = −5 · 10⁻⁵, z = (−5 · 10)/(−12) ≈ 4 one obtains
δ = 56.284°. Hence φ₂ = 123.716° + n · 360°, φ₃ = 236.284° + n · 360°.
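The inverse look-up z = c · 10/d used in these examples can be sketched as follows. The function name `next_digit` is an assumption made for illustration; the numerical values are those of Examples 1 to 3.

```python
def next_digit(value, t1, t2):
    """Next decimal digit z of the angle by inverse linear interpolation.

    value  -- the given function value
    t1, t2 -- the neighbouring table values in the sense of increasing
              argument, with value lying between them
    """
    d = t2 - t1       # table difference
    c = value - t1    # correction difference
    return round(10 * c / d)

print(next_digit(0.3950, 0.3955, 0.3939))  # Example 1, cosine table
print(next_digit(0.7777, 0.7771, 0.7782))  # Example 2, sine table
print(next_digit(2.000, 2.006, 1.997))     # Example 3, cotangent table
```

Because d and c always have the same sign when the value lies between the two table entries, the quotient is positive for increasing and decreasing functions alike.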


The addition theorems 


The addition theorems show how the trigonometric functions of a sum or difference of two angles
α and β can be expressed in terms of the trigonometric functions of the individual angles.


The addition theorems for sine and cosine. In the unit circle the values of the cosine and sine,
respectively, of an angle φ are represented as the signed numerical values of the abscissa and ordinate
of the radius in the direction of the free arm of the angle φ. The two segments are the orthogonal
projections of this radius on the x- and y-axes. By a theorem of vector algebra, the projection of
this radius is equal to the sum of the projections of any two vectors of which it is the sum. In
Fig. 10.1-26, for example, in each case the vectors satisfy OQ = OT + TQ, where OT is the orthogonal
projection of the free arm OQ of the angle β on the free arm OP of the angle α, and TQ is the
orthogonal projection of the free arm OQ of the angle β on a direction ȳ which is the y-axis of a
second Cartesian coordinate system (x̄, ȳ) determined by the x̄-axis in the direction of OP and
∠(x̄, ȳ) = π/2. Between the systems (x, y) and (x̄, ȳ) the following angular relationships hold:

∠(x, x̄) = α, ∠(x, ȳ) = α + π/2, ∠(y, x̄) = −π/2 + α, ∠(y, ȳ) = −π/2 + α + π/2 = α.

If m_x̄ denotes the signed numerical value of the orthogonal projection on the x̄-axis and m_ȳ that of
the orthogonal projection on the ȳ-axis, then OT = m_x̄(OQ) = cos β and TQ = m_ȳ(OQ) = sin β.
The following relations are valid in the (x, y)-system:

m_x(OQ) = m_x(OT) + m_x(TQ) = cos (α + β) and
m_y(OQ) = m_y(OT) + m_y(TQ) = sin (α + β).

Because of
m_x(OT) = cos β cos ∠(x, x̄) = cos α cos β,
m_x(TQ) = sin β cos ∠(x, ȳ) = −sin α sin β,
m_y(OT) = cos β cos ∠(y, x̄) = sin α cos β,
m_y(TQ) = sin β cos ∠(y, ȳ) = cos α sin β
it follows that cos (α + β) = cos α cos β − sin α sin β and sin (α + β) = sin α cos β + cos α sin β.

10.1-26 Examples of the addition theorem for the sine and the cosine function


These arguments are valid for arbitrary angles α and β; the three figures are examples for three
selected cases.

If at the same time one uses the fact, obvious from the periodicity, that every angle β₁ can be replaced
by an angle −β₂, where β₁ + β₂ = 2π (or 400ᵍ or 360°), it follows that differences of angles can
also occur in the addition theorem:
sin (α − β) = sin α cos β − cos α sin β,
cos (α − β) = cos α cos β + sin α sin β.


The addition theorems for tangent and cotangent. These are obtained at once in a universally valid
form by division and suitable rearrangement:

tan (α + β) = sin (α + β)/cos (α + β) = (sin α cos β + cos α sin β)/(cos α cos β − sin α sin β).

Both the numerator and the denominator are divided by cos α cos β:

tan (α + β) = (tan α + tan β)/(1 − tan α tan β), tan (α − β) = (tan α − tan β)/(1 + tan α tan β).

Similarly one obtains

cot (α + β) = (cot α cot β − 1)/(cot α + cot β), cot (α − β) = (cot α cot β + 1)/(cot β − cot α).

sin (α + β) = sin α cos β + cos α sin β            sin (α − β) = sin α cos β − cos α sin β
cos (α + β) = cos α cos β − sin α sin β            cos (α − β) = cos α cos β + sin α sin β
tan (α + β) = (tan α + tan β)/(1 − tan α tan β)    tan (α − β) = (tan α − tan β)/(1 + tan α tan β)
cot (α + β) = (cot α cot β − 1)/(cot β + cot α)    cot (α − β) = (cot α cot β + 1)/(cot β − cot α)
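A numerical spot check of the four theorems for a sum of angles can be useful when the formulas are copied into a program. The sketch below is illustrative only; the function name and the sample angles are assumptions, and the angles must be chosen so that all tangents and cotangents involved exist.

```python
import math

def addition_theorem_errors(alpha, beta):
    """Absolute differences between the two sides of the theorems for alpha + beta."""
    sa, ca = math.sin(alpha), math.cos(alpha)
    sb, cb = math.sin(beta), math.cos(beta)
    ta, tb = sa / ca, sb / cb          # tan alpha, tan beta
    cta, ctb = ca / sa, cb / sb        # cot alpha, cot beta
    total = alpha + beta
    return (
        abs(math.sin(total) - (sa * cb + ca * sb)),
        abs(math.cos(total) - (ca * cb - sa * sb)),
        abs(math.tan(total) - (ta + tb) / (1 - ta * tb)),
        abs(1 / math.tan(total) - (cta * ctb - 1) / (cta + ctb)),
    )

print(addition_theorem_errors(0.7, 0.4))
```

The differences should be of the order of the floating-point rounding error; the theorems for α − β follow by passing −β as the second argument.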


Functions of double and of half angles 


sin 2φ = 2 sin φ cos φ
sin φ = 2 sin (φ/2) cos (φ/2)

cos 2φ = cos² φ − sin² φ = 1 − 2 sin² φ = 2 cos² φ − 1
cos φ = cos² (φ/2) − sin² (φ/2) = 1 − 2 sin² (φ/2) = 2 cos² (φ/2) − 1

tan 2φ = 2 tan φ/(1 − tan² φ) = 2/(cot φ − tan φ)
tan φ = 2 tan (φ/2)/(1 − tan² (φ/2)) = 2/(cot (φ/2) − tan (φ/2))

cot 2φ = (cot² φ − 1)/(2 cot φ) = (cot φ − tan φ)/2
cot φ = (cot² (φ/2) − 1)/(2 cot (φ/2)) = (cot (φ/2) − tan (φ/2))/2

sin φ = ±√((1 − cos 2φ)/2)
sin (φ/2) = ±√((1 − cos φ)/2)

cos φ = ±√((1 + cos 2φ)/2)
cos (φ/2) = ±√((1 + cos φ)/2)

tan φ = ±√((1 − cos 2φ)/(1 + cos 2φ)) = sin 2φ/(1 + cos 2φ) = (1 − cos 2φ)/sin 2φ
tan (φ/2) = ±√((1 − cos φ)/(1 + cos φ)) = sin φ/(1 + cos φ) = (1 − cos φ)/sin φ

cot φ = ±√((1 + cos 2φ)/(1 − cos 2φ)) = sin 2φ/(1 − cos 2φ) = (1 + cos 2φ)/sin 2φ
cot (φ/2) = ±√((1 + cos φ)/(1 − cos φ)) = sin φ/(1 − cos φ) = (1 + cos φ)/sin φ

sin 2φ = 2 tan φ/(1 + tan² φ), cos 2φ = (1 − tan² φ)/(1 + tan² φ)
sin φ = 2 tan (φ/2)/(1 + tan² (φ/2)), cos φ = (1 − tan² (φ/2))/(1 + tan² (φ/2))

Functions of multiple angles 


sin 3φ = 3 sin φ − 4 sin³ φ                        cos 3φ = 4 cos³ φ − 3 cos φ
sin 4φ = 4 sin φ cos φ − 8 sin³ φ cos φ            cos 4φ = 8 cos⁴ φ − 8 cos² φ + 1
sin 5φ = 5 sin φ − 20 sin³ φ + 16 sin⁵ φ           cos 5φ = 16 cos⁵ φ − 20 cos³ φ + 5 cos φ

tan 3φ = (3 tan φ − tan³ φ)/(1 − 3 tan² φ)         cot 3φ = (cot³ φ − 3 cot φ)/(3 cot² φ − 1)
tan 4φ = (4 tan φ − 4 tan³ φ)/(1 − 6 tan² φ + tan⁴ φ)
cot 4φ = (cot⁴ φ − 6 cot² φ + 1)/(4 cot³ φ − 4 cot φ)


Consequences of the addition theorems 


From the addition theorems many relationships between the trigonometric functions can be 
derived, and they are collected together in the following tables. A few examples illustrate the way 
in which they are derived. 


sin (α + β) sin (α − β)
= sin² α cos² β − cos² α sin² β
= sin² α cos² β − cos² α (1 − cos² β)
= cos² β (sin² α + cos² α) − cos² α
= cos² β − cos² α.

sin 3φ = sin (2φ + φ)
= sin 2φ cos φ + cos 2φ sin φ
= 2 sin φ cos² φ + (1 − 2 sin² φ) sin φ
= 2 sin φ (1 − sin² φ) + sin φ − 2 sin³ φ
= 3 sin φ − 4 sin³ φ.

In the equation sin (φ + ψ) + sin (φ − ψ) = 2 sin φ cos ψ one puts α = φ + ψ, β = φ − ψ, so
that φ = ½(α + β), ψ = ½(α − β), and one obtains sin α + sin β = 2 sin ½(α + β) cos ½(α − β).

tan α + tan β = sin α/cos α + sin β/cos β = (sin α cos β + cos α sin β)/(cos α cos β) = sin (α + β)/(cos α cos β).


Sums, differences and products of trigonometric functions 


sin α + sin β = 2 sin ((α + β)/2) cos ((α − β)/2)      cos α + cos β = 2 cos ((α + β)/2) cos ((α − β)/2)
sin α − sin β = 2 cos ((α + β)/2) sin ((α − β)/2)      cos α − cos β = −2 sin ((α + β)/2) sin ((α − β)/2)
tan α + tan β = sin (α + β)/(cos α cos β)              cot α + cot β = sin (α + β)/(sin α sin β)
tan α − tan β = sin (α − β)/(cos α cos β)              cot α − cot β = −sin (α − β)/(sin α sin β)

cos α + sin α = √2 sin (45° + α) = √2 cos (45° − α)
cos α − sin α = √2 cos (45° + α) = √2 sin (45° − α)

sin (α + β) sin (α − β) = cos² β − cos² α              cos (α + β) cos (α − β) = cos² β − sin² α
sin α sin β = ½[cos (α − β) − cos (α + β)]             cos α cos β = ½[cos (α − β) + cos (α + β)]
sin α cos β = ½[sin (α − β) + sin (α + β)]             cos α sin β = ½[sin (α + β) − sin (α − β)]

tan α tan β = (tan α + tan β)/(cot α + cot β) = −(tan α − tan β)/(cot α − cot β)
cot α cot β = (cot α + cot β)/(tan α + tan β) = −(cot α − cot β)/(tan α − tan β)
tan α cot β = (tan α + cot β)/(cot α + tan β) = −(tan α − cot β)/(cot α − tan β)

sin α sin β sin γ = ¼[sin (α + β − γ) + sin (β + γ − α) + sin (γ + α − β) − sin (α + β + γ)]
cos α cos β cos γ = ¼[cos (α + β − γ) + cos (β + γ − α) + cos (γ + α − β) + cos (α + β + γ)]
sin α sin β cos γ = ¼[−cos (α + β − γ) + cos (β + γ − α) + cos (γ + α − β) − cos (α + β + γ)]
sin α cos β cos γ = ¼[sin (α + β − γ) − sin (β + γ − α) + sin (γ + α − β) + sin (α + β + γ)]


Powers of trigonometric functions 


sin² φ = ½(1 − cos 2φ)                               cos² φ = ½(1 + cos 2φ)
sin³ φ = ¼(3 sin φ − sin 3φ)                         cos³ φ = ¼(3 cos φ + cos 3φ)
sin⁴ φ = ⅛(cos 4φ − 4 cos 2φ + 3)                    cos⁴ φ = ⅛(cos 4φ + 4 cos 2φ + 3)
sin⁵ φ = (1/16)(10 sin φ − 5 sin 3φ + sin 5φ)        cos⁵ φ = (1/16)(10 cos φ + 5 cos 3φ + cos 5φ)


General formulae for the sine and cosine of a multiple angle. De Moivre's theorem in the theory
of complex numbers states that (cos φ + i sin φ)ⁿ = cos nφ + i sin nφ. Bearing in mind that i² = −1
this can be proved for n = 1, 2, 3, ... by the method of induction by means of the addition theorems.
If the left-hand side is expanded by the binomial theorem, equating the real and imaginary parts
gives

cos nφ = cosⁿ φ − C(n, 2) cosⁿ⁻² φ sin² φ + C(n, 4) cosⁿ⁻⁴ φ sin⁴ φ − + ...

sin nφ = C(n, 1) cosⁿ⁻¹ φ sin φ − C(n, 3) cosⁿ⁻³ φ sin³ φ + C(n, 5) cosⁿ⁻⁵ φ sin⁵ φ − + ...

where C(n, k) denotes the binomial coefficient.
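The binomial expansion can be evaluated term by term, sorting the terms into real and imaginary parts according to the power of i. The following is a minimal sketch; the function name `multiple_angle` is illustrative, and `math.comb` (available in Python 3.8 and later) supplies the binomial coefficients.

```python
import math

def multiple_angle(n, phi):
    """Return (cos(n*phi), sin(n*phi)) from the binomial expansion of
    (cos phi + i sin phi)**n for a non-negative integer n."""
    c, s = math.cos(phi), math.sin(phi)
    cos_n = 0.0
    sin_n = 0.0
    for k in range(n + 1):
        term = math.comb(n, k) * c ** (n - k) * s ** k
        # i**k cycles through 1, i, -1, -i
        if k % 4 == 0:
            cos_n += term
        elif k % 4 == 1:
            sin_n += term
        elif k % 4 == 2:
            cos_n -= term
        else:
            sin_n -= term
    return cos_n, sin_n

print(multiple_angle(5, 0.3))
print(math.cos(1.5), math.sin(1.5))
```

The two printed pairs should agree up to rounding error, which also checks the tabulated formulas for the angle 5φ.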




The general sine curve. In nature and technology the mathematical description of oscillations, for
example, in high frequency technology, optics, acoustics or mechanics, is based on sine and cosine
functions. In these oscillations the greatest displacement, the amplitude a, of a sine oscillation can
be different from 1, its wave length λ different from 2π, and the ordinate at the zero point different
from 0. The function y = a sin x, for example, has the amplitude a (Fig.) and the function
y = sin (2πx/λ) the wave length λ, because for 0 ≤ x ≤ λ the argument 2πx/λ runs through the values
from 0 to 2π (Fig.). The function y = sin (nx), where n is an integer, has exactly n complete
oscillations in the interval from 0 to 2π, since λ = 2π/n. Finally, the function y = sin (nπx/l)
with λ = 2l/n describes an oscillation of which n waves have length 2l.

10.1-27 Graphs of the functions y = sin x, y = 4 sin x

10.1-28 Graphs of the functions y = sin πx and y = sin (πx/10)


Superposition. If several physical quantities that can be represented by oscillations act at a point,
then the ordinates for this point are added. For example, y₁ = 2 sin x and y₂ = −cos 2x gives
y = y₁ + y₂ = 2 sin x − cos 2x (Fig.).

10.1-29 Graph of the function y = y₁ + y₂ = 2 sin x − cos 2x


Damped oscillations. If an oscillating system loses energy, then the amplitude decreases. For
example, the function a = 3e^(−2x/π) has the value 3/e for x = π/2, only 3/e² for x = 2π/2, and so on.
The figure shows the graph for y = 3e^(−2x/π) sin 4x.


Angular frequency ω and phase difference φ. If the time t is regarded as the independent variable,
the equation of the general sine curve has the form y = a sin (ωt + φ). From the fact that for
ωt = 2π a whole oscillation is completed it follows that the time for a complete oscillation (through
wave peak and wave trough) is t = 2π/ω. This time is called the periodic time of the oscillation
and is denoted by T. If T is measured in seconds, then 1/T is the number of oscillations in one second,
that is, the frequency f of the oscillation: f = 1/T. The angular frequency ω = 2π/T = 2π(1/T)
= 2πf gives the number of oscillations in 2π seconds. Finally, the phase difference φ is the angle
by which the given curve leads the sine curve (Fig.). For t = 0 the function y already has the value
y = a sin φ. For a negative phase difference φ one speaks of lagging. For a = 1 and φ = +π/2 the
function y = a sin (ωt + φ) becomes the cosine function cos ωt, that is, the cosine curve leads the
sine curve by π/2. If a general sinusoidal oscillation with angular frequency ω is given, then a and
φ can be characterized in a phasor diagram (vector diagram) in which a is the radius of the circle from
which the sine curve can be constructed and φ is the angle between the phasor (vector) and the
positive abscissa axis at the time t = 0 (Fig.).

10.1-30 Graph of the function y = 3e^(−2x/π) sin 4x

10.1-31 General sine curve y = a sin (ωt + φ). Left, phasor diagram or vector diagram; right,
curve representation or line diagram
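The relations between ω, T and f, and the statement that a phase lead of π/2 turns the sine curve into the cosine curve, can be checked with a few lines. This is a sketch under illustrative assumptions: the function names and the sample value of 50 oscillations per second are not taken from the text.

```python
import math

def period_and_frequency(omega):
    """Periodic time T = 2*pi/omega and frequency f = 1/T."""
    period = 2 * math.pi / omega
    return period, 1 / period

def general_sine(a, omega, phi, t):
    """Ordinate of the general sine curve y = a sin(omega*t + phi)."""
    return a * math.sin(omega * t + phi)

omega = 2 * math.pi * 50            # assumed example: 50 oscillations per second
T, f = period_and_frequency(omega)
print(T, f)

# With a = 1 and phi = +pi/2 the curve coincides with cos(omega*t)
t = 0.003
print(general_sine(1, omega, math.pi / 2, t), math.cos(omega * t))
```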


10.2. Trigonometric equations 


The expressions considered so far have been algebraic in T (see Chapter 4.). The notion of an
expression will now be generalized so as to include sin T, cos T, tan T and cot T. By equating
expressions and at the same time taking into account the range of values of the variables, new
equations are formed. In trigonometric equations with one variable, the variable x occurs in at least
one such generalized expression. In pure trigonometric equations x occurs only in such expressions,
for example, in sin (2x + π) − √2 cos x = 0; in mixed trigonometric equations x also occurs
in algebraic expressions, for example, in tan x − 3x = 0.

Trigonometric equations are transcendental equations (see Chapter 4.). There is no general
algorithm for their solution, but they can be solved graphically, or by numerical approximation
methods, with arbitrary precision. For certain special types of pure trigonometric equations
solution algorithms do exist. Because of the periodicity of the trigonometric functions the domain
of the variable of a trigonometric equation is often confined to an interval whose length is a
primitive period, say 0 ≤ x < 2π.


Pure trigonometric equations 


Basic type. A pure trigonometric equation is said to be of basic type if the variable occurs only
in expressions involving one trigonometric function, for example, in sin T, and the equation is
algebraic in this expression.


Example 1: The equation cos³ (2x) = b, in which x is variable and b is a real parameter, is of
basic type. Moreover, it is algebraic in cos 2x, and the substitution t = cos 2x leads to t³ = b
with the solution t = ∛b. From cos 2x = ∛b one can look up the solutions for x within the
accuracy of the table.

Example 2: The equation tan² x + p tan x + q = 0 with the variable x and parameters
p and q is likewise of basic type. It is algebraic in tan x and by the substitution u = tan x it is
transformed into the quadratic equation u² + pu + q = 0 with the solutions (tan x)₁,₂ =
½(−p ± √(p² − 4q)). With the help of a table one can then find the solutions for x.


Reduction to basic type. If the trigonometric equation contains several of the terms sin T, cos T,
tan T, cot T, but with the same T, then by using formulae obtained in the previous section one can
arrange the equation so that its terms contain only one trigonometric function. The most
advantageous substitution is

sin T = 2 tan (T/2)/(1 + tan² (T/2)), cos T = (1 − tan² (T/2))/(1 + tan² (T/2)).

Example 3: 5 sin x − 3 cos x = 3, 0 ≤ x < 2π.
5 · 2 tan (x/2)/(1 + tan² (x/2)) − 3 · (1 − tan² (x/2))/(1 + tan² (x/2)) = 3;
10 tan (x/2) − 3 + 3 tan² (x/2) = 3 + 3 tan² (x/2);
tan (x/2) = 3/5,
x/2 = 30.96°;
x = 61.92°.

The transformation of the given equation is not valid for x = π, since tan (x/2) then does not
exist. A test of the original equation by substituting the value x = π shows that x = π is a second
solution in the given range of values of the variable. The solutions are obtained graphically as the
abscissae of the points of intersection of the graphs of the two functions y₁ = 5 sin x and
y₂ = 3 cos x + 3 (Fig.).

10.2-1 Intersections of the graphs of the functions y₁ = 5 sin x and y₂ = 3 cos x + 3


Example 4: The equation a cos x + b sin x = c with c² ≤ a² + b² can also be solved with the
help of the addition theorem for the cosine function. One divides both sides by r = +√(a² + b²)
and puts a/[+√(a² + b²)] = cos λ, b/[+√(a² + b²)] = sin λ, tan λ = b/a. The equation then
becomes cos λ cos x + sin λ sin x = c/[+√(a² + b²)] or cos (x − λ) = c/[+√(a² + b²)]; x − λ =
arccos {c/[+√(a² + b²)]}. The auxiliary angle λ is uniquely determined by tan λ = b/a together with
the signs of sin λ and cos λ. Hence x is also known (there are two solutions between 0 and 2π). For
the numerical values a = −3, b = 5, c = 3 one obtains: −3 cos x + 5 sin x = 3,
tan λ = 5/(−3) = sin λ/cos λ. Because sin λ > 0 and cos λ < 0, λ lies in quadrant II; λ = 120.96°.
From cos (x − λ) = 3/(+√34) = 0.5145 it follows that (x − λ)₁ = 59.04° or (x − λ)₂ = −59.04°.
Thus x₁ = 180°, x₂ = 61.92°.
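The auxiliary-angle method of Example 4 translates directly into a short routine. The sketch below is illustrative: the function name is an assumption, and `math.atan2(b, a)` is used because it selects the quadrant of the auxiliary angle from the signs of b and a, which is exactly the information needed besides tan λ = b/a.

```python
import math

def solve_a_cos_b_sin(a, b, c):
    """Solutions x in [0, 2*pi) of a*cos(x) + b*sin(x) = c.

    Returns an empty list when c**2 > a**2 + b**2 or a = b = 0.
    """
    r = math.hypot(a, b)              # r = +sqrt(a^2 + b^2)
    if r == 0 or abs(c) > r:
        return []
    lam = math.atan2(b, a)            # auxiliary angle: cos lam = a/r, sin lam = b/r
    delta = math.acos(c / r)          # x - lam = +delta or -delta
    two_pi = 2 * math.pi
    return sorted({(lam + delta) % two_pi, (lam - delta) % two_pi})

# Numerical values of Example 4: a = -3, b = 5, c = 3
print([round(math.degrees(x), 2) for x in solve_a_cos_b_sin(-3, 5, 3)])
```

The same routine applies to Example 3, since 5 sin x − 3 cos x = 3 is the same equation.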


If the trigonometric equation consists of expressions in only one trigonometric function, say
cot T₁, cot T₂, ..., with different T₁, T₂, ..., then in certain circumstances it can be reduced to
basic type. For example, if all the Tᵢ are integral multiples of a single term T, this can be done with
the help of the addition theorems.


Example 5: 2 cot 2x/(1 − 3 cot x) = 1/2, or 4 cot 2x = 1 − 3 cot x. Because
cot 2x = (cot² x − 1)/(2 cot x), the equation is equivalent to 2(cot² x − 1)/cot x = 1 − 3 cot x,
or 5 cot² x − cot x − 2 = 0 (Fig.).

Putting cot x = u one obtains u² − u/5 − 2/5 = 0; u₁,₂ = 1/10 ± √41/10,
that is, u₁ = (√41 + 1)/10, u₂ = −(√41 − 1)/10.

Solutions for 0 ≤ x < 2π:

(cot x)₁ = 0.7403             (cot x)₂ = −0.5403
x₁ = 0.9335 (53.5°)           x₃ = 2.0662 (118.4°)
x₂ = 4.0751 (233.5°)          x₄ = 5.2078 (298.4°)

Test: all 4 values satisfy the equation.

The formula for cot 2x is not valid for the values 0 and π. However, one sees at once from the
given equation that these values are not solutions.

10.2-2 Intersections of the graphs of the functions y₁ = 4 cot 2x and y₂ = 1 − 3 cot x


Further examples show that a reduction to basic type is possible in other cases.
Example 6: sin (2x + π) − √2 cos x = 0. Using the quadrant relations or an addition theorem one
obtains −sin 2x − √2 cos x = 0 or 2 sin x cos x + √2 cos x = 0, (2 sin x + √2) cos x = 0.
Hence either cos x = 0 or sin x = −√2/2; in the interval 0 ≤ x < 2π this gives x = π/2, x = 3π/2,
x = 5π/4 and x = 7π/4. By testing one can verify that the solutions are correct (Fig.).

10.2-3 Intersections of the graphs of the functions y₁ = sin (2x + π) and y₂ = √2 cos x

Example 7: The equation cos (3x/7) + sin x = 0 can be simplified by writing sin x = cos (π/2 − x)
and using the formula cos α + cos β = 2 cos [(α + β)/2] cos [(α − β)/2]:
cos (3x/7) + cos (π/2 − x) = 0,
2 cos (π/4 − 2x/7) cos (5x/7 − π/4) = 0.

One of the two factors must vanish, so that π/4 − 2x/7 = π/2 + kπ or 5x/7 − π/4 = π/2 + kπ.
Since k can take the values 0, ±1, ±2, ... one can replace −k by +k in the formula for x₁:

x₁ = −7π/8 + 7kπ/2; x₂ = 21π/20 + 7kπ/5.

Tests show that all the values satisfy the equation. It should be noted that the solutions for
consecutive integers k differ not by 2π, but by 7π/2 or 7π/5, respectively (Fig.).

10.2-4 Intersections of the graphs of the functions y₁ = cos (3x/7) and y₂ = −sin x; the points
marked red belong to x₁, those marked black to x₂


Mixed trigonometric equations 


Mixed trigonometric equations can be solved only by graphical or iterative methods (see Chapter ...).


Example 1: The solutions of the equation cos x − x/2 + 1.7 = 0 are the abscissae of the points
of intersection of the curves with equations y₁ = cos x, y₂ = x/2 − 1.7 (Fig.). They have only
one point of intersection with the abscissa x₀ ≈ 2.21. If the graphs in the neighbourhood of this
intersection are drawn on a larger scale, the accuracy of the reading can be improved. Here one
obtains x₀ ≈ 2.209.

10.2-5 Graphical solution of the equation cos x = x/2 − 1.7

Test: cos 2.209 − 2.209/2 + 1.7 = cos 140.63ᵍ + 0.5955 = −0.5958 + 0.5955 = −0.0003.

A closer approximation x₁ to the correct value is given by Newton's method for approximate
solutions:
x₁ = x₀ − f(x₀)/f′(x₀), f(x₀) = cos x₀ − x₀/2 + 1.7 = −0.0003,
f′(x₀) = −sin x₀ − 1/2 = −1.3032, x₁ = 2.2088.

The approximation can be further improved by the repeated application of Newton's method.
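The repeated application of Newton's method can be sketched as a small loop. The function name `newton`, the tolerance and the iteration limit are illustrative choices, not part of the text; the function f and its derivative are those of Example 1.

```python
import math

def newton(f, f_prime, x0, tol=1e-10, max_iter=50):
    """Newton's method: repeat x <- x - f(x)/f'(x) until the step is below tol."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / f_prime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")

def f(x):
    return math.cos(x) - x / 2 + 1.7

def f_prime(x):
    return -math.sin(x) - 0.5

root = newton(f, f_prime, 2.21)   # start from the graphical reading
print(round(root, 4))
```

The starting value matters: the iteration would fail where f′ vanishes, that is, where sin x = −1/2, so a graphical estimate near the intersection is a sensible choice.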

Example 2: The graphical solution of the equation 3 tan x − 2x = 0 by means of the functions
y₁ = tan x, y₂ = 2x/3, yields the solutions x₁ = 0, x₂ = ±4.38, x₃ = ±7.65, ... For increasing
values of x the solutions approach more and more closely the odd multiples of π/2. To every solution
x₀ there corresponds the equal and opposite solution −x₀; for tan x₀ = 2x₀/3 also implies that
tan (−x₀) = ⅔(−x₀) (Fig.).




11. Plane trigonometry 


11.1. Solution of right-angled triangles
      General methods
      Applications
11.2. The trigonometric functions in the general triangle
      The formulae of plane trigonometry
      The four main cases in the solution of triangles
11.3. Further formulae and applications
      Geometry
      Physics
      Technology
      Navigation
      Trigonometric determination of heights
      Surveying


The trigonometric functions already defined make it possible to use angles to calculate unknown 
quantities in plane rectilinear figures. Angles can often be measured with less effort and greater 
accuracy than lengths. As the name indicates, trigonometry is concerned with the measurement or 
calculation of triangles into which every figure bounded by straight lines can be subdivided by 
diagonals. In this one always has in mind the use of known angles. 


11.1. Solution of right-angled triangles 


General methods 


The definition of the trigonometric functions was first given in the right-angled triangle and then 
extended to arbitrary angles with the help of the unit circle. These definitions contain all the relations 
between lengths and angles in the right-angled triangle and thus suffice to calculate all the rest 
when any two of the six quantities are given. 

When the right angle is denoted by γ and the hypotenuse by c, two additional relationships in
the right-angled triangle ABC (Fig.) are available from geometry:

I. the theorem of Pythagoras: c² = a² + b²,
II. the fact that each of the angles with its vertex on the hypotenuse is the complement of the other:
α + β = 90°.

From these relationships or by re-lettering the triangle all possible cases in which two of the
quantities a, b, c, α and β are given can be reduced to four cases, namely c, α; c, a; a, α and a, b,
for which the solutions will now be stated.

I. Given the hypotenuse c and one adjacent angle, say α:
1. β = 90° − α; 2. sin α = a/c, a = c sin α;
3. cos α = b/c, b = c cos α.

II. Given the hypotenuse c and one other side, say a:
1. sin α = a/c; 2. β = 90° − α;
3a. b = √(c² − a²) or with the help of the calculated angle α:
3b. cot α = b/a, b = a cot α; or 3c. cos α = b/c, b = c cos α.

III. Given an angle and the side opposite to it, say α and a:
1. β = 90° − α; 2. cot α = b/a, b = a cot α;
3. sin α = a/c, c = a/sin α or with the help of the calculated angle β:
2a. tan β = b/a, b = a tan β; 3a. cos β = a/c, c = a/cos β.

IV. Given the two sides a and b containing the right angle:
1. tan α = a/b; 2. β = 90° − α; 3a. c = √(a² + b²) or with the help of the calculated angle α:
3b. c = a/sin α; or 3c. c = b/cos α.

11.1-1 Right-angled triangle
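Case II can serve as an example of how the steps above are carried out numerically. The sketch is illustrative only: the function name and the 3-4-5 sample triangle are assumptions, and the auxiliary solution b = c cos α is used as a check on b = √(c² − a²).

```python
import math

def solve_from_hypotenuse_and_leg(c, a):
    """Case II: hypotenuse c and side a given (0 < a < c).

    Returns (alpha, beta, b) with the angles in degrees.
    """
    alpha = math.degrees(math.asin(a / c))   # step 1: sin alpha = a/c
    beta = 90.0 - alpha                      # step 2: beta = 90 deg - alpha
    b = math.sqrt(c * c - a * a)             # step 3a: theorem of Pythagoras
    return alpha, beta, b

alpha, beta, b = solve_from_hypotenuse_and_leg(5.0, 3.0)
print(round(alpha, 2), round(beta, 2), b)

# Check with the auxiliary solution 3c: b = c cos alpha
print(5.0 * math.cos(math.radians(alpha)))
```

The other three cases follow the same pattern with the formulas listed for them.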


Checks and accuracy. One usually tries to find the solution using only the given quantities. 
Auxiliary solutions with the help of quantities already calculated can be used as checks, because 
the same quantity calculated in different ways must theoretically have the same value. Another 
check is based on the theorem that the sum of the angles of a triangle is 180°. In surveying checks 
are provided for almost every trigonometric calculation. In this the permissible deviation of the 
value for the same quantity depends essentially on the tables used. In evaluating a possible deviation
one must bear in mind that for a given small interval $\Delta\varphi$ of an angle $\varphi$ the errors $\Delta y$ in looking up
values of different trigonometric functions are of different magnitude. In the figure, for example,
$\Delta_3 y$ for $y = \tan\varphi$ is greater than $\Delta_1 y$ for $y = \cos\varphi$. Of course, conversely, for a given small interval
$\Delta y$ the value of the angle can be determined more accurately from the tangent function or from
the cotangent function than from the other two functions. For the function $y = \sin\varphi$, in particular,
the figure shows once more the dependence of the magnitude $\Delta y$ of the interval of the function values
upon the magnitude of the interval of the angle values. For small values of the angle in the
neighbourhood of $\varphi = 0^\circ$, $\Delta y$ is large; on the other hand, for large values in the neighbourhood
of $\varphi = 90^\circ$, $\Delta y$ is small. The angle $\varphi$ can be determined from the value found for the sine
with greater precision in the first case than in the second.

11.1-2 Accuracy in working with trigonometric functions

The accuracy of the check must, of course, be in agreement with the measured value. To calculate
the angle made with the horizontal by a ladder of length $l = 1.50$ m leaning against a vertical wall
at a height $h = 1.20$ m, one obtains $\sin\varphi = 1.2/1.5 = 0.8$ (Fig.). The distance $x$ of the foot of the
ladder from the wall is given by $x = \sqrt{(1.5)^2 - (1.2)^2} = 0.90$ m. As a check $x_1 = 1.5\cos\varphi$ and
$x_2 = 1.2\cot\varphi$ are calculated. The round value $\varphi_1 = 53^\circ$ taken from a 4-figure table without
interpolation gives the values $x_{11} = 0.903$ and $x_{12} = 0.904$, which correspond to the accuracy of $l$
and $h$. From a 7-figure table one obtains the less meaningful values $\varphi_2 = 53^\circ 7' 48.4''$,
$x_{21} = 0.9000000$ and $x_{22} = 0.8999996$. The distance of the ladder will hardly be measured to within
4 millimetres and certainly not to within 4 ten-thousandths of a millimetre. The result cannot be
more accurate than the given values.

11.1-3 Inclination of a ladder leaning against a wall, $h = 1.2$, $l = 1.5$
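
The ladder check can be reproduced numerically. The following is a small sketch, with the helper name `ladder_checks` chosen here purely for illustration; it compares the two check values for the rounded angle of 53 degrees and for the angle computed to full floating-point precision.

```python
import math


def ladder_checks(length, height, phi_deg):
    """Return the two check values x1 = l*cos(phi) and x2 = h*cot(phi)."""
    phi = math.radians(phi_deg)
    return length * math.cos(phi), height / math.tan(phi)


length, height = 1.50, 1.20
x_direct = math.sqrt(length**2 - height**2)            # theorem of Pythagoras
phi_exact = math.degrees(math.asin(height / length))   # about 53.13 degrees

for phi in (53.0, phi_exact):
    x1, x2 = ladder_checks(length, height, phi)
    print(f"phi = {phi:8.4f} deg  x1 = {x1:.4f} m  x2 = {x2:.4f} m  direct = {x_direct:.4f} m")
```

With the rounded angle the two checks differ from 0.90 m only in the third decimal place, which is already finer than the measurements of the ladder justify.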

To increase the accuracy in surveying, additional measurements are made and the most probable 
value is calculated by the methods of errors and least squares. 


Applications 


Length of a chord of a circle. The angle subtended at the centre of a circle of radius $r$ by the chord
of length $s$ is twice the angle $\gamma$ subtended at the circumference by the same chord (Fig.). The
perpendicular from the centre $M$ of the circle to the chord $s$ bisects both the angle at the centre and
the chord and forms two congruent right-angled triangles. It then follows that $\sin\gamma = s/(2r)$ or $s = 2r\sin\gamma$.


11.1-4 Chord of a circle
11.1-5 Determination of a right angle from a hidden point


Determination of a right angle from a hidden point. From a water pipe running in a straight line
between the villages D and E (Fig.) a perpendicular branch pipe to a village N is to be constructed
and a water tower is to be built on the intervening ridge. N cannot be seen from the required point F at
which the branch pipe leaves the main pipe, though it can be seen from D and E. The distance
$a = |DE|$ and the angles $\delta$ and $\varepsilon$ are measured. The position of F on DE is determined by the distance
$x = |DF|$. From the right-angled triangles DFN and EFN one obtains: $|FN| = x\tan\delta$, $|FN| = (a - x)\tan\varepsilon$,
so that $x\tan\delta = (a - x)\tan\varepsilon$ and hence $x(\tan\delta + \tan\varepsilon) = a\tan\varepsilon$,

$$x = a\,\frac{\tan\varepsilon}{\tan\delta + \tan\varepsilon}.$$

For the calculation of $x$ using logarithms this expression is transformed using the addition theorem:

$$x = \frac{a\,\sin\varepsilon/\cos\varepsilon}{\sin\delta/\cos\delta + \sin\varepsilon/\cos\varepsilon} = \frac{a\,\sin\varepsilon\cos\delta\cos\varepsilon}{\cos\varepsilon\,(\sin\delta\cos\varepsilon + \cos\delta\sin\varepsilon)} = a\,\frac{\cos\delta\,\sin\varepsilon}{\sin(\delta + \varepsilon)}.$$
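
As a quick numerical check, both forms of the expression for the distance of F from D can be evaluated side by side. The sketch below assumes the two measured angles are given in degrees; the function name `foot_of_perpendicular` and the sample values are hypothetical.

```python
import math


def foot_of_perpendicular(a, delta_deg, eps_deg):
    """Distance x = |DF| computed from the tangent form and from the product form."""
    delta = math.radians(delta_deg)
    eps = math.radians(eps_deg)
    x_tangent = a * math.tan(eps) / (math.tan(delta) + math.tan(eps))
    x_product = a * math.cos(delta) * math.sin(eps) / math.sin(delta + eps)
    return x_tangent, x_product


# Hypothetical measurements: |DE| = 1000 m, angle at D = 40 deg, angle at E = 55 deg.
print(foot_of_perpendicular(1000.0, 40.0, 55.0))
```

The product form contains no sums of trigonometric values, which is why it was preferred for work with logarithm tables.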
Determination of heights. The height of a tree can be determined (Fig.) by measuring the angle
of elevation $\gamma$ of the top of the tree from a point A, the distance $s$ between the foot of the tree F and
the base S of the point of observation, and the height $h_2$ of the measuring instrument (that is, the
vertical distance $|AS|$). Then $h_1 = s\tan\gamma$, and the actual height $H$ of the tree is given by

$$H = h_1 + h_2 = s\tan\gamma + h_2.$$

Approximate methods of determining heights. 1. Instead of measuring the angle of elevation $\gamma$,
the top of the tree can be sighted along the hypotenuse of an isosceles right-angled triangle ABC
in which the side CB is held in a vertical line by a plumb line. The angle $\gamma$ is then $45^\circ$ and $h_1 = s$,
$H = s + h_2$.

11.1-7 Method of measuring heights in forestry

This method can be used only when there is enough room to choose the point of observation 
suitably. Otherwise one can employ the following method, which is usual in forestry. 

2. A rectangle ABCD (made of wood or cardboard) is held in such a position that the top G of
the tree is sighted along the edge AB (Fig.). A plumb line suspended from the point B then cuts the
side CD of the rectangle in the point L. The two angles marked $\varepsilon$ are equal, since the arms of one
are perpendicular to the corresponding arms of the other, and the right-angled triangles BCL and
BEG are similar. Then $|GE|/|BE| = \tan\varepsilon = |CL|/|BC|$. If one chooses $|BC| = 10$ inches and subdivides
the side CD into inches, then $|CL|/|BC| = |CL|/10$ is always a decimal fraction whose value is $\tan\varepsilon$.
The rectangle ABCD 'calibrated' in this way is a disguised table of tangents, which is particularly
simple to handle. From $h_1 = |GE| = s\tan\varepsilon$ it follows that the height of the tree is
$H = s\tan\varepsilon + h_2 = s(|CL|/10) + h_2$.


Determination of the altitude of the sun. From the length $b$ of the shadow cast by a vertical rod
of length $s$ on a horizontal plane (Fig.) the angle $\varphi$ between the rays of the sun and the horizontal
can be determined. It is called the altitude of the sun. One obtains $\tan\varphi = s/b$ or $\cot\varphi = b/s$. If
the rod is of length 1 yard, then the length of the shadow in yards gives the value of $\cot\varphi$ immediately.

11.1-8 Altitude of the sun
11.1-9 Sand tip

The angle of a tip. If sand is transported on a conveyor belt, then a conical heap or a sand tip is
formed as it falls off (Fig.). Its content can be calculated from the diameter $d = 2r$ of its circular base
and the tip angle $\alpha$ between a line in the curved surface of the cone and the horizontal:
$V = \pi r^2 h/3$, where $h = r\tan\alpha$, so that $V = (\pi r^3/3)\tan\alpha$. If the vertical angle $\gamma$ of the
cone is used instead of the tip angle $\alpha$, then $h = r\cot(\gamma/2)$ and $V = (\pi r^3/3)\cot(\gamma/2)$.
For sand the tip angle is approximately $33^\circ$ and for vulcanite about $36^\circ$.


The angle between the plane faces of a regular tetrahedron and a regular octahedron. The regular
tetrahedron is bounded by four congruent equilateral triangles and six edges of equal length $k$.
The angle $\nu$ between two adjacent triangular faces can be seen in a plane section of the tetrahedron
containing the edge BD, bisecting the edge AC skew to BD, and perpendicular to AC (Fig.). The
section BDM is an isosceles triangle. Its equal sides are altitudes of faces of the tetrahedron and have
length $h = \tfrac{1}{2}k\sqrt{3}$. The height DF of the tetrahedron is perpendicular to one of these equal sides
and divides it in the ratio $|MF| : |FB| = 1 : 2$, because the altitudes of the equilateral triangle
ABC are also medians. In the right-angled triangle MFD, $h$ is the hypotenuse and $|MF| = h/3$
the side adjacent to the angle $\nu$. Hence $\cos\nu = (h/3)/h = 1/3$, $\nu = 70^\circ 31' 44''$.

The regular octahedron is bounded by eight congruent equilateral triangles and twelve edges of equal
length $k$. The angle $2\mu$ between two adjacent triangular faces can be seen in a plane section through
two opposite vertices E, F and through the midpoints $M_1$, $M_2$ of two parallel edges (AD || BC) that
are skew to the line EF joining these vertices (Fig.). The section is a rhombus of side $h = \tfrac{1}{2}k\sqrt{3}$
whose diagonals, $|EF| = k\sqrt{2}$ and $|M_1M_2| = k$, bisect the angles of the rhombus and are at right angles
to one another. Hence from the right-angled triangle $M_1GE$ it follows that the half-angle $\mu$ is given by:

$$\cos\mu = \frac{k/2}{(k/2)\sqrt{3}} = \frac{1}{\sqrt{3}} = \tfrac{1}{3}\sqrt{3}; \quad \mu = 54^\circ 44' 08'' \ \text{or} \ 2\mu = 109^\circ 28' 16''.$$

11.1-10 Tetrahedron
11.1-11 Octahedron
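
Both face angles follow from a single inverse cosine, so they are easy to verify. The sketch below evaluates them and converts to degrees, minutes and seconds; the helper `to_dms` is an illustrative assumption and rounds to the nearest second.

```python
import math


def to_dms(angle_deg):
    """Convert decimal degrees to (degrees, minutes, seconds), rounded to whole seconds."""
    total_seconds = round(angle_deg * 3600)
    degrees, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return degrees, minutes, seconds


tetrahedron_angle = math.degrees(math.acos(1.0 / 3.0))                    # cos(nu) = 1/3
octahedron_angle = 2.0 * math.degrees(math.acos(1.0 / math.sqrt(3.0)))    # cos(mu) = 1/sqrt(3)

print(to_dms(tetrahedron_angle), to_dms(octahedron_angle))
```

The two angles are supplementary, since $\cos 2\mu = 2\cos^2\mu - 1 = -1/3 = -\cos\nu$.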


11.2. The trigonometric functions in the general triangle 


In many cases the lengths and angles accessible for measurement do not lie in right-angled tri- 
angles. Relationships between the sides and angles of the general triangle were therefore derived. 
The most important are the sine rule and the cosine rule. They are sufficient for every calculation.
The cosine rule is less advantageous for calculations, especially when tables are used, because the 
formula contains a sum of squares and a product term. It can be replaced by the tangent or by the 
half-angle formula. 


The formulae of plane trigonometry 


The sine rule. Every triangle ABC (Fig.) has a circumcircle whose centre M is at the intersection
of the perpendicular bisectors of the sides of the triangle. The sides of the triangle are chords of
the circle and the opposite angles are angles at its circumference. If the radius of the circumcircle
is denoted by $R$, then the sides can be calculated as chords of the circle: $a = 2R\sin\alpha$, $b = 2R\sin\beta$,
$c = 2R\sin\gamma$. From these one obtains for the diameter $2R = a/\sin\alpha = b/\sin\beta = c/\sin\gamma$.


In any triangle the ratio of each side to the sine of the opposite
angle is a constant (equal to the diameter of the circumcircle).

The sine rule. In a plane triangle the ratio of any two sides is
equal to the ratio of the sines of the opposite angles.

11.2-1 The sine rule

The sine rule connects opposite data. If two opposite data are given, then from any third datum
one can calculate the opposite one. Given $a$, $\alpha$ and $b$, for example, $\beta$ can be determined from
$\sin\beta/\sin\alpha = b/a$, $\sin\beta = (b/a)\sin\alpha$; or given $b$, $\beta$ and $\gamma$ the side $c$ can be determined
from $c/b = \sin\gamma/\sin\beta$, $c = b\sin\gamma/\sin\beta$.




In calculating an angle by means of the sine rule one should, of course, observe that two angles
$\varphi_1$ and $\varphi_2$ are given by $\sin\varphi$, as can be seen from the unit circle. One of these angles is acute and
the other is the difference between the acute angle and $180^\circ$; $\varphi_1 + \varphi_2 = 180^\circ$. One must distinguish
in each particular case which of these angles corresponds to the given geometrical situation.


The cosine rule. In the triangle ABC let D be the foot of the altitude $h_c$, and $|AD| = q$ the projection
of the side $b$ on the side $c$ (Fig.). This projection $q = b\cos\alpha$ is positive for an acute angle $\alpha$ and
negative for an obtuse angle. The segment DB is thus of length $c - q = |DB|$ for arbitrary values
of $\alpha$. The altitude $h_c$ always has the length $h_c = b\sin\alpha$. Applying the theorem of Pythagoras to
the right-angled triangle DBC one obtains $a^2 = h_c^2 + (c - q)^2 = b^2\sin^2\alpha + c^2 + b^2\cos^2\alpha - 2cb\cos\alpha$,
or $a^2 = b^2 + c^2 - 2bc\cos\alpha$. Corresponding relationships can be found using the
altitudes $h_a$ and $h_b$. These can be obtained formally by a cyclic permutation in which $a$ is replaced
by $b$, $b$ by $c$ and $c$ by $a$; the same holds for the angles $\alpha \to \beta \to \gamma \to \alpha$.




11.2-2 The cosine rule: a) for an acute-angled triangle, b) for an obtuse-angled triangle, c) cyclic permutation 


The cosine rule. In a plane triangle the square of one side is equal to the sum of the squares of the
other two sides minus twice the product of these two sides and the cosine of the angle between them.
When two sides and the included angle are known, the third side can be calculated using the cosine
rule, and when three sides are known any angle can be found:

$$\cos\alpha = \frac{b^2 + c^2 - a^2}{2bc}, \qquad \cos\beta = \frac{c^2 + a^2 - b^2}{2ca}, \qquad \cos\gamma = \frac{a^2 + b^2 - c^2}{2ab}.$$


The tangent formula. Using the rule for the ratios of corresponding sums and differences and
applying the addition theorems one can deduce:

$$\frac{a}{b} = \frac{\sin\alpha}{\sin\beta}, \qquad \frac{a - b}{a + b} = \frac{\sin\alpha - \sin\beta}{\sin\alpha + \sin\beta} = \frac{2\cos[(\alpha + \beta)/2]\,\sin[(\alpha - \beta)/2]}{2\sin[(\alpha + \beta)/2]\,\cos[(\alpha - \beta)/2]}.$$

Dividing both numerator and denominator by $\cos[(\alpha + \beta)/2]\cos[(\alpha - \beta)/2]$ one obtains the
tangent formula for the sides $a$ and $b$. The corresponding formulae for the remaining pairs of sides
are obtained by a cyclic permutation:

$$\frac{a - b}{a + b} = \frac{\tan[(\alpha - \beta)/2]}{\tan[(\alpha + \beta)/2]}, \qquad \frac{b - c}{b + c} = \frac{\tan[(\beta - \gamma)/2]}{\tan[(\beta + \gamma)/2]}, \qquad \frac{c - a}{c + a} = \frac{\tan[(\gamma - \alpha)/2]}{\tan[(\gamma + \alpha)/2]}.$$

From two sides (for example, $a$ and $b$) and the included angle ($\gamma$) the other two angles ($\alpha$ and $\beta$)
can be calculated by means of these formulae. Their half-sum $(\alpha + \beta)/2 = 90^\circ - \gamma/2$ is given by
the included angle and their half-difference $(\alpha - \beta)/2$ is given by the tangent formula: from
$(\alpha + \beta)/2 = \xi$ and $(\alpha - \beta)/2 = \eta$ one obtains $\alpha = \xi + \eta$ and $\beta = \xi - \eta$.

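The half-angle formulae. To obtain a formula that is suitable for logarithmic calculations in the
case of three given sides, one substitutes the expression $\cos\alpha = \dfrac{b^2 + c^2 - a^2}{2bc}$ given by the cosine
rule into the formula $\cos\dfrac{\alpha}{2} = \sqrt{\dfrac{1 + \cos\alpha}{2}}$ (see Chapter 10):

$$\cos\frac{\alpha}{2} = \sqrt{\frac{2bc + b^2 + c^2 - a^2}{4bc}} = \sqrt{\frac{(b + c)^2 - a^2}{4bc}} = \sqrt{\frac{b + c - a}{2}\cdot\frac{b + c + a}{2}\cdot\frac{1}{bc}}.$$

Similar formulae hold for $\cos\dfrac{\beta}{2}$ and $\cos\dfrac{\gamma}{2}$. If one introduces the perimeter $2s$ of the triangle,
so that $a + b + c = 2s$ or $s = (a + b + c)/2$,
one obtains $s - a = \tfrac{1}{2}(b + c - a)$, $s - b = \tfrac{1}{2}(c + a - b)$, $s - c = \tfrac{1}{2}(a + b - c)$ and hence

$$\cos\frac{\alpha}{2} = \sqrt{\frac{s(s - a)}{bc}}, \qquad \cos\frac{\beta}{2} = \sqrt{\frac{s(s - b)}{ca}}, \qquad \cos\frac{\gamma}{2} = \sqrt{\frac{s(s - c)}{ab}}.$$

Similarly, by substituting the values for $\cos\alpha$, $\cos\beta$, $\cos\gamma$ given by the cosine rule into the formulae
$\sin\dfrac{\alpha}{2} = \sqrt{\dfrac{1 - \cos\alpha}{2}}$, $\sin\dfrac{\beta}{2} = \sqrt{\dfrac{1 - \cos\beta}{2}}$ and $\sin\dfrac{\gamma}{2} = \sqrt{\dfrac{1 - \cos\gamma}{2}}$ one obtains the relationships

$$\sin\frac{\alpha}{2} = \sqrt{\frac{(s - b)(s - c)}{bc}}, \qquad \sin\frac{\beta}{2} = \sqrt{\frac{(s - c)(s - a)}{ca}}, \qquad \sin\frac{\gamma}{2} = \sqrt{\frac{(s - a)(s - b)}{ab}}.$$

The half-angle formulae are obtained by division of corresponding equations:

$$\tan\frac{\alpha}{2} = \sqrt{\frac{(s - b)(s - c)}{s(s - a)}}, \qquad \tan\frac{\beta}{2} = \sqrt{\frac{(s - c)(s - a)}{s(s - b)}}, \qquad \tan\frac{\gamma}{2} = \sqrt{\frac{(s - a)(s - b)}{s(s - c)}}.$$

For practical calculations it is advisable to calculate all three angles $\alpha$, $\beta$, $\gamma$ from the three sides
$a$, $b$, $c$. The known angle sum of a triangle can then be used as a check.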


The four main cases for the solution of a triangle 


In a triangle the following data can be given: two angles and one side; two sides and one angle 
that is either opposite one of the two sides or included between them; three sides. The method of 
solution for these cases will be given. 


I. Given two angles and a side. Since the angle sum of a triangle is $180^\circ$, the third angle is also
known. By means of the sine rule the remaining sides can be calculated; from $c$, $\alpha$, $\beta$, for example,
it follows that $\gamma = 180^\circ - (\alpha + \beta)$ and $a = c\sin\alpha/\sin\gamma$, $b = c\sin\beta/\sin\gamma$.


Example: A force $F = 130$ units is to be decomposed into two components $F_1$ and $F_2$ in such a
way that $F_1$ makes an angle $\delta = 18^\circ$ with $F$ and the two components make an angle $\varepsilon = 65^\circ$
with one another (Fig.). The diagonal $F = |AC|$ of the parallelogram ABCD is given. The position
of the point B is determined by the angles $\delta = 18^\circ$ and $\omega = \varepsilon - \delta = 47^\circ$. In the triangle ABC it
follows that:

$$F_1 = \frac{\sin\omega}{\sin\varepsilon}\,F = \frac{\sin 47^\circ}{\sin 65^\circ}\,F, \qquad F_1 = 104.905 \ \text{units},$$

$$F_2 = \frac{\sin\delta}{\sin\varepsilon}\,F = \frac{\sin 18^\circ}{\sin 65^\circ}\,F, \qquad F_2 = 44.325 \ \text{units}.$$
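
The decomposition can be checked with a short sketch. The function name `decompose_force` is an assumption made for illustration; the angles are in degrees, and the final check rebuilds the resultant from the two components by the cosine rule.

```python
import math


def decompose_force(force, delta_deg, eps_deg):
    """Split `force` into F1 (at angle delta to it) and F2, with angle eps between F1 and F2."""
    delta = math.radians(delta_deg)
    eps = math.radians(eps_deg)
    omega = eps - delta                              # angle between the force and F2
    f1 = force * math.sin(omega) / math.sin(eps)     # sine rule in triangle ABC
    f2 = force * math.sin(delta) / math.sin(eps)
    return f1, f2


f1, f2 = decompose_force(130.0, 18.0, 65.0)
# Recombine the components: |F|^2 = F1^2 + F2^2 + 2*F1*F2*cos(eps)
resultant = math.sqrt(f1**2 + f2**2 + 2.0 * f1 * f2 * math.cos(math.radians(65.0)))
print(round(f1, 3), round(f2, 3), round(resultant, 3))
```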


II. Given two sides and the angle opposite to one of them. Let $a$, $c$ and $\gamma$ be given (Fig.); then one obtains
1. $\sin\alpha = (a/c)\sin\gamma$; 2. $\beta = 180^\circ - (\alpha + \gamma)$; 3. $b = c\sin\beta/\sin\gamma$.

11.2-3 Decomposition of the force $F$ into two components $F_1$ and $F_2$

Of course, equation 1. can hold only if $(a/c)\sin\gamma \le 1$. Because of
this condition there are several possible cases.

II (1) $a < c$, with the given angle opposite the greater side. There always exists an angle $\alpha$,
which must be smaller than $\gamma$, since it is opposite the smaller side. Moreover, the solution is unique;
although the sine function has the same value for the angles $\alpha_1$ and $\alpha_2 = 180^\circ - \alpha_1$, only $\alpha_1 < \gamma$ is
a solution of the problem.


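Example: $a = 56.9$ m, $c = 68.0$ m, $\gamma = 63^\circ 57'$.

1. $\sin\alpha = (a/c)\sin\gamma = (56.9/68.0)\sin 63^\circ 57'$; $\alpha_1 = 48^\circ 45'$; $\alpha_2 = 180^\circ - \alpha_1 = 131^\circ 15'$ is greater
than $\gamma$.
2. $\beta = 180^\circ - (\alpha_1 + \gamma)$; $\beta = 67^\circ 18'$.
3. $b = c\,\dfrac{\sin\beta}{\sin\gamma} = 68.0 \ \text{m} \cdot \dfrac{\sin 67^\circ 18'}{\sin 63^\circ 57'} \approx 69.8$ m.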


II (2) $a = c$; the triangle is isosceles and hence $\alpha = \gamma$.

II (3) $a > c$, with the given angle opposite the smaller side. Then $a$ can be so large that the condition
$\sin\alpha \le 1$ is not satisfied. II (3.1): no solution exists and no triangle can be constructed from the
given data; for example, if $c = 2$ in, $a = 5$ in, $\gamma = 75^\circ$. II (3.2): $\sin\alpha$ may be equal to 1 so that $\alpha$
is a right angle, because $\alpha_2 = 180^\circ - \alpha_1 = \alpha_1$. The solution and the construction are unique, for
example, if $a = 2$ in, $c = 1$ in, $\gamma = 30^\circ$. II (3.3): if $\sin\alpha < 1$, the angles $\alpha_1$ and $\alpha_2 = 180^\circ - \alpha_1$
can be calculated. Because $\sin\alpha > \sin\gamma$, it also follows that $\alpha_1 > \gamma$, so that $(180^\circ - \alpha_1) + \gamma < 180^\circ$
and the angle $\alpha_2$ also satisfies the geometric conditions. The problem has two solutions.

11.2-4 Solution of a triangle, given two sides and an angle opposite to one of them; a) one solution, b) two solutions


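Example: $a = 87.23$ m, $c = 65.95$ m, $\gamma = 30.42^\circ$.
1. $\sin\alpha = (87.23/65.95)\sin 30.42^\circ$; $\alpha_1 = 42.04^\circ$; $\alpha_2 = 180^\circ - \alpha_1 = 137.96^\circ$; $\alpha_1 > \gamma$, $\alpha_2 > \gamma$.
2. $\beta = 180^\circ - (\alpha + \gamma)$; $\beta_1 = 107.54^\circ$, $\beta_2 = 11.62^\circ$.
3. $b_1 = 65.95 \ \text{m} \cdot (\sin 107.54^\circ/\sin 30.42^\circ) = 124.2$ m and
$b_2 = 65.95 \ \text{m} \cdot (\sin 11.62^\circ/\sin 30.42^\circ) = 26.23$ m.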
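
The case distinction for two sides and an opposite angle can be sketched as a small routine that returns zero, one or two triangles. The name `solve_ssa` and the tolerance `1e-12` used to recognise the right-angle case are assumptions of this sketch; angles are in degrees.

```python
import math


def solve_ssa(a, c, gamma_deg):
    """Given sides a, c and the angle gamma opposite c, return a list of
    (alpha, beta, b) solutions: empty, one triangle or two triangles."""
    gamma = math.radians(gamma_deg)
    sin_alpha = a * math.sin(gamma) / c
    if sin_alpha > 1.0 + 1e-12:          # case II (3.1): no triangle exists
        return []
    if sin_alpha > 1.0 - 1e-12:          # case II (3.2): alpha is a right angle
        candidates = [90.0]
    else:
        alpha1 = math.degrees(math.asin(sin_alpha))
        candidates = [alpha1, 180.0 - alpha1]
    solutions = []
    for alpha in candidates:
        beta = 180.0 - alpha - gamma_deg
        if beta > 0.0:                   # keep only geometrically possible angles
            b = c * math.sin(math.radians(beta)) / math.sin(gamma)
            solutions.append((alpha, beta, b))
    return solutions


for alpha, beta, b in solve_ssa(87.23, 65.95, 30.42):
    print(f"alpha = {alpha:.2f} deg, beta = {beta:.2f} deg, b = {b:.2f} m")
```

Because the routine carries the angles at full floating-point precision, its side lengths can differ in the last printed digit from values obtained with angles rounded to two decimals.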


III. Given two sides and the included angle. The solution comes from the cosine rule or the tangent
formula. Given the values of $b$, $c$ and $\alpha$ in the triangle ABC, the cosine rule gives
$a^2 = b^2 + c^2 - 2bc\cos\alpha$ and from this the unique value $a = \sqrt{b^2 + c^2 - 2bc\cos\alpha}$. The angle $\beta$
can also be determined uniquely from the cosine rule, that is, from $\cos\beta = (c^2 + a^2 - b^2)/(2ca)$.
However, it is usually preferable to use the sine rule and obtain $\sin\beta = (b/a)\sin\alpha$. Of the two angles
$\beta_1$ and $\beta_2$ that satisfy this equation only one corresponds to the geometric conditions. From
$(\gamma + \beta)/2 = 90^\circ - \alpha/2$ and by the tangent formula one obtains
$\tan[(\gamma - \beta)/2] = \tan[(\gamma + \beta)/2]\,(c - b)/(c + b)$; from $(\gamma + \beta)/2$ and $(\gamma - \beta)/2$ the angles $\beta$ and $\gamma$
can be found. The third side can then be determined by the sine rule: $a = c\sin\alpha/\sin\gamma$.


Example: A cable is to be laid in a straight line through wooded country between two places
R and S. They are not visible from one another, but a point A can be found from which the distances
$d = |AR| = 2.473$ miles and $e = |AS| = 3.752$ miles and the angle $\tau = \angle RAS = 42^\circ 26' 10''$
can be measured (Fig.). What must the length $x$ of the cable be and at what angles $\varepsilon$, $\delta$ from R, S
respectively must it be laid? For comparison two methods of solution are given.

11.2-5 Length of an inaccessible side

I. Cosine rule and sine rule.
1. $x^2 = d^2 + e^2 - 2de\cos\tau$; $x^2 = 6.497313$; $x = 2.549$ miles.
2. $\sin\varepsilon = (e/x)\sin\tau$; $\varepsilon_1 = 83^\circ 20' 00''$, $\varepsilon_2 = 96^\circ 40' 00''$; correspondingly
$\delta_1 = 54^\circ 13' 50''$, $\delta_2 = 40^\circ 53' 50''$.
Since $e > x > d$, it must also follow that $\varepsilon > \tau > \delta$; this condition is satisfied only by $\delta_2$.
Therefore the solution is $x$, $\varepsilon_2$, $\delta_2$.

II. Tangent formula.
1. $\tan\dfrac{\varepsilon - \delta}{2} = \dfrac{e - d}{e + d}\tan\dfrac{\varepsilon + \delta}{2}$; $\varepsilon + \delta = 180^\circ - \tau = 137^\circ 33' 50''$;
$(\varepsilon + \delta)/2 = 68^\circ 46' 55''$; $(\varepsilon - \delta)/2 = 27^\circ 53' 18''$.
2. $\varepsilon = 96^\circ 40' 13''$, $\delta = 40^\circ 53' 37''$.
3. $x = e\,\dfrac{\sin\tau}{\sin\varepsilon} = 2.549$ miles.

The agreement between the two results is unsatisfactory. The reason for this (which was discussed
fully in the introduction) is that the sine function was used to determine an angle in the
neighbourhood of $90^\circ$: $\sin 96^\circ 40' 00'' = 0.99324$, $\sin 96^\circ 40' 10'' = 0.99323$, $\sin 96^\circ 40' 20'' = 0.99323$;
on the number of seconds nothing reliable can be said. A greater precision can be obtained in
this case if the angle $\varepsilon$ is also calculated by the cosine rule, that is, from the equation
$\cos\varepsilon = \dfrac{x^2 + d^2 - e^2}{2xd}$. One obtains the unique value $\cos\varepsilon = \dfrac{-1.464462}{2\,(2.549)\,(2.473)}$, $\varepsilon = 96^\circ 40' 14''$,
in sufficiently close agreement with the value found by the tangent formula. The solution found
from the cosine rule is therefore: $x = 2.549$ miles, $\varepsilon = 96^\circ 40' 14''$, $\delta = 40^\circ 53' 36''$.
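
The two routes to the solution can be compared in floating-point arithmetic, where the ambiguity of the sine near 90 degrees is avoided by never inverting the sine. This is a sketch only; the function names are illustrative assumptions and angles are handled in decimal degrees.

```python
import math


def solve_sas_cosine(d, e, tau_deg):
    """Third side x and the angles opposite e and d, using only the cosine rule."""
    tau = math.radians(tau_deg)
    x = math.sqrt(d * d + e * e - 2.0 * d * e * math.cos(tau))
    eps = math.degrees(math.acos((x * x + d * d - e * e) / (2.0 * x * d)))
    return x, eps, 180.0 - tau_deg - eps


def solve_sas_tangent(d, e, tau_deg):
    """Same quantities from the tangent formula (half-sum and half-difference)."""
    half_sum = (180.0 - tau_deg) / 2.0
    half_diff = math.degrees(
        math.atan((e - d) / (e + d) * math.tan(math.radians(half_sum))))
    eps, delta = half_sum + half_diff, half_sum - half_diff
    x = e * math.sin(math.radians(tau_deg)) / math.sin(math.radians(eps))
    return x, eps, delta


tau = 42.0 + 26.0 / 60.0 + 10.0 / 3600.0      # 42 deg 26 min 10 s
print(solve_sas_cosine(2.473, 3.752, tau))
print(solve_sas_tangent(2.473, 3.752, tau))
```

Both functions return the obtuse angle directly, because the inverse cosine and the inverse tangent of the half-difference are single-valued on the relevant ranges.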




IV. Given three sides. The solution comes from the cosine rule or the half-angle formulae, that
is, from either of the equations $\cos\alpha = \dfrac{b^2 + c^2 - a^2}{2bc}$, $\tan\dfrac{\alpha}{2} = \sqrt{\dfrac{(s - b)(s - c)}{s(s - a)}}$, and the
equations obtained from these by cyclic permutation. Both solutions are unique and are obtained
either from suitable combinations of the six numbers $a^2$, $b^2$, $c^2$, $2ab$, $2bc$, $2ca$ or of the four numbers
$s$, $s - a$, $s - b$, $s - c$. Therefore each of the three angles $\alpha$, $\beta$, $\gamma$ should be calculated and the value
of the sum of the angles of the triangle used as a check.


Example: Three points $R_1$, $R_2$, $R_3$ on raised ground are to be connected by radar (Fig.).
At what angles must the transmitter and receiver at each point $R_1$, $R_2$, $R_3$ be built?

$|R_1R_2| = c = 45.21$ miles; $|R_2R_3| = a = 52.46$ miles; $|R_3R_1| = b = 39.37$ miles.

11.2-6 Solution of a triangle, given three sides

Cosine rule

    a² = 2752.0516          b² + c² − a² =  841.8894
    b² = 1549.9969          c² + a² − b² = 3245.9988
    c² = 2043.9441          a² + b² − c² = 2258.1044

    α = 76°19′12″
    β = 46°49′06″
    γ = 56°51′42″
    sum 180°00′00″

Half-angle formula

                                           lg
    a = 52.46       s − a = 16.06       1.20575
    b = 39.37       s − b = 29.15       1.46464
    c = 45.21       s − c = 23.31       1.36754
    2s = 137.04     s     = 68.52       1.83582

    α = 76°19′12″
    β = 46°49′06″
    γ = 56°51′42″
    sum 180°00′00″

11.3. Further formulae and applications 


In many fields arguments are made precise with the aid of mathematical relations; for example, 
when directions and angles in plane rectilinear figures occur, then theorems of plane trigonometry 
are used. One of these fields, namely surveying, plays a special role. In this discipline the relationships 
in question rest more directly on these theorems than in other fields, and historically the require- 
ments of surveying were responsible for the development of plane trigonometry. For this reason 
the possible applications in this field are dealt with in a special section. 


Geometry 


The radius $r$ of the inscribed circle. In a triangle ABC the bisectors of the angles intersect at the
centre M of the inscribed circle. If one draws the radii through the points of contact E, F, G of the
sides of the triangle (Fig.), then six right-angled triangles are formed. They are congruent in pairs
and, in particular, the pairs of sides marked $x$, $y$, $z$, respectively, are equal. Their lengths are
$x = s - a$, $y = s - b$, $z = s - c$, where $s = (a + b + c)/2$. In the triangle AGM,
$\tan(\alpha/2) = r/x = r/(s - a)$, but by the half-angle formula for the whole triangle
$\tan\dfrac{\alpha}{2} = \sqrt{\dfrac{(s - b)(s - c)}{s(s - a)}}$. Hence

$$\frac{r}{s - a} = \sqrt{\frac{(s - b)(s - c)}{s(s - a)}}, \qquad r = (s - a)\sqrt{\frac{(s - b)(s - c)}{s(s - a)}} = \sqrt{\frac{(s - a)(s - b)(s - c)}{s}}.$$

11.3-1 Inscribed circle of a triangle

The same result would have been obtained by considering $\tan(\beta/2)$ or $\tan(\gamma/2)$.


Marking out an arc of a circle whose centre is inaccessible. Between two points A and B, whose
distance apart $e$ is known, arbitrarily many points $P_i$ are to be constructed, all lying on a circle
through A and B with given radius $r$ (Fig.). The centre of the circle is inaccessible. It is required
to find the distance $s$ from A of the points $P_i$ and the angle $\varphi$ between $AP_i$ and AB. Let P be one
of the required points. Then the triangle AMP is isosceles with base $s = 2r\sin(\sigma/2)$ subtending
an angle $\sigma$ at the centre of the circle. The angle $\angle PMB$ subtended at the centre by the chord PB
is $\varepsilon - \sigma$ and the angle at the circumference is $\varphi$. Thus, $\varphi = (\varepsilon - \sigma)/2$ or $\sigma = \varepsilon - 2\varphi$.
But the angle $\varepsilon$ in the triangle ABM can be determined from $e = 2r\sin(\varepsilon/2)$, so that
$\sin(\varepsilon/2) = e/(2r)$. Hence the distance $s$ in dependence on the angle $\varphi$ is given by:
$s = 2r\sin(\sigma/2) = 2r\sin(\varepsilon/2 - \varphi)$, where $\varepsilon/2 = \operatorname{Arcsin}(e/(2r))$.


11.3-2 Marking out an arc of a circle

Area of a triangle. From the area formula $A = ch_c/2$ and $h_c = b\sin\alpha$ it follows that
$A = (bc/2)\sin\alpha$. From the relationship $\sin\alpha = a/(2R)$, where $R$ is the radius of the circumcircle
(see Fig. 11.2-1), it follows that $A = abc/(4R)$, or from $b = 2R\sin\beta$ and $c = 2R\sin\gamma$,
$A = 2R^2\sin\alpha\sin\beta\sin\gamma$. Again, since $R = \dfrac{a}{2\sin\alpha}$, $A = a^2\,\dfrac{\sin\beta\sin\gamma}{2\sin\alpha}$.
In Fig. 11.3-1, by adding the areas of the partial triangles ABM, BCM and CAM with altitude
equal to $r$, the radius of the inscribed circle, one obtains Heron's formula:

$$A = \tfrac{1}{2}(cr + br + ar) = rs = \sqrt{s(s - a)(s - b)(s - c)}, \quad \text{where } 2s = a + b + c.$$


Example: It is required to calculate the area of a triangle with the sides $a = 345.8$, $b = 236.5$,
$c = 497.3$. Using Heron's formula one finds that $s = 539.8$, $s - a = 194.0$, $s - b = 303.3$,
$s - c = 42.5$; hence, with the help of 4-figure logarithm tables, $A = 36740$.
A rough calculation is always to be recommended; in this case by using a slide rule one obtains:

$$A = \sqrt{539.8 \cdot 194 \cdot 303.3 \cdot 42.5} \approx \sqrt{5.40 \cdot 10^2 \cdot 1.94 \cdot 10^2 \cdot 3.03 \cdot 10^2 \cdot 4.25 \cdot 10} = 10^3\sqrt{5.40 \cdot 1.94 \cdot 3.03 \cdot 42.5} = 36700.$$

The position of the decimal point was estimated as follows:

$$A \approx 10^3\sqrt{10 \cdot 3 \cdot 10 \cdot 4.2} = 10^4\sqrt{12.6} \approx 3.5 \cdot 10^4.$$
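
Heron's formula and the inradius relation $A = rs$ are short enough to sketch directly. The function names below are illustrative assumptions.

```python
import math


def heron_area(a, b, c):
    """Area of a triangle from its three sides by Heron's formula."""
    s = (a + b + c) / 2.0
    return math.sqrt(s * (s - a) * (s - b) * (s - c))


def inradius(a, b, c):
    """Radius of the inscribed circle, from A = r * s."""
    s = (a + b + c) / 2.0
    return heron_area(a, b, c) / s


print(round(heron_area(345.8, 236.5, 497.3)))   # compare with the table value 36740
```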


Isosceles triangle. If the equal sides are denoted by $a$, the base by $c$ and
the base angles by $\alpha$ (Fig.), then the area $A$ is given by:
1. $A = \tfrac{1}{2}a^2\sin\gamma$, where $\gamma = 180^\circ - 2\alpha$;
2. $A = \dfrac{c^2\sin^2\alpha}{2\sin\gamma} = \dfrac{c^2\sin^2\alpha}{2\sin 2\alpha} = \dfrac{c^2\sin^2\alpha}{4\sin\alpha\cos\alpha}$, $A = \dfrac{c^2}{4}\tan\alpha$;
3. $s = a + c/2$, $s - a = c/2$, $s - c = a - c/2$, and hence
$A = \sqrt{(a + c/2)(c/2)(c/2)(a - c/2)} = (c/2)\sqrt{a^2 - c^2/4} = (c/4)\sqrt{4a^2 - c^2}$.

11.3-3 Area of an isosceles triangle

Equilateral triangle. Each side is of length $a$.
1. $A = \tfrac{1}{2}a^2\sin 60^\circ$, $A = (a^2/4)\sqrt{3}$.
2. By Heron's formula: $s = 3a/2$, $s - a = s - b = s - c = a/2$ and
$A = \sqrt{(3a/2)\cdot(a^3/8)} = (a^2/4)\sqrt{3}$.

Regular hexagon. This polygon is composed of six equilateral triangles of side $R$ (the radius of
the circumcircle) and it therefore follows that
$A_6 = (6/4)R^2\sqrt{3}$, $A_6 = (3/2)R^2\sqrt{3}$.

Regular n-sided polygon. Its area is composed of $n$ isosceles triangles in which the equal sides
are radii of the circumcircle and the angle $\varphi_n$ at the centre of the circle included
between them is the $n$th part of the complete angle; $\varphi_n = 360^\circ/n$ (Fig.).

11.3-4 Regular n-sided polygon


In each single isosceles triangle the altitude $h_n$ bisects the side $s_n$ of the polygon and the
angle at the centre $\varphi_n$. Thus, $s_n = 2R\sin(\varphi_n/2)$, $h_n = R\cos(\varphi_n/2)$, so that

$$A_n = \frac{n}{2}\,s_n h_n = \frac{n}{2}R^2 \cdot 2\sin\frac{\varphi_n}{2}\cos\frac{\varphi_n}{2} = \frac{n}{2}R^2\sin\varphi_n = \frac{n}{2}R^2\sin\frac{360^\circ}{n}.$$
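
A short sketch makes it easy to confirm the hexagon value and to observe that the polygon area approaches the area of the circumcircle as the number of sides grows. The function name `regular_polygon_area` is an assumption for illustration.

```python
import math


def regular_polygon_area(n, R):
    """Area of a regular n-sided polygon with circumradius R: (n/2) * R^2 * sin(360 deg / n)."""
    return 0.5 * n * R * R * math.sin(2.0 * math.pi / n)


for n in (3, 4, 6, 12, 96):
    print(n, regular_polygon_area(n, 1.0))
```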

The general quadrilateral. For the general quadrilateral ABCD a formula analogous to Heron's
formula can be derived. Since a quadrilateral is determined by five data, one can take as given the
four sides and the sum of a pair of opposite angles, for example, $\alpha$ and $\gamma$ (Fig.). Denoting the
semiperimeter by $s$, $s = \tfrac{1}{2}(a + b + c + d)$, and the sum of the angles $\alpha$ and $\gamma$ by $2\varepsilon$, the areas of the
triangles ABD and BCD and the area $A_V$ of the quadrilateral are given by:

$$A_1 = \tfrac{1}{2}ad\sin\alpha; \qquad A_2 = \tfrac{1}{2}bc\sin\gamma; \qquad A_V = \tfrac{1}{2}(ad\sin\alpha + bc\sin\gamma).$$

Using the cosine rule in the two triangles one obtains
$a^2 + d^2 - 2ad\cos\alpha = f^2 = b^2 + c^2 - 2bc\cos\gamma$
or $a^2 + d^2 - b^2 - c^2 = 2(ad\cos\alpha - bc\cos\gamma)$.
Then $(4A_V)^2 + (a^2 + d^2 - b^2 - c^2)^2 = 4(a^2d^2 + b^2c^2 - 2abcd\cos 2\varepsilon)$, and finally,

$$16A_V^2 = (a + d + b - c)(a + d - b + c)(b + c + a - d)(b + c - a + d) - 16abcd\cos^2\varepsilon,$$

that is, $A_V = \sqrt{(s - a)(s - b)(s - c)(s - d) - abcd\cos^2\varepsilon}$.

11.3-5 Area of a general quadrilateral

If $\varphi$ is the angle at which the diagonals of a quadrilateral intersect at the point S, then the area
of the quadrilateral can be expressed as the sum of the areas of the four triangles ABS, BCS, CDS
and DAS, so that

$$A = \tfrac{1}{2}\bigl[|AS|\cdot|BS|\sin(180^\circ - \varphi) + |BS|\cdot|CS|\sin\varphi + |CS|\cdot|DS|\sin(180^\circ - \varphi) + |DS|\cdot|AS|\sin\varphi\bigr]$$
$$= \tfrac{1}{2}\bigl[|AS|(|BS| + |DS|) + |CS|(|BS| + |DS|)\bigr]\sin\varphi = \tfrac{1}{2}\bigl[(|AS| + |CS|)(|BS| + |DS|)\bigr]\sin\varphi,$$
$$A = \tfrac{1}{2}\,ef\sin\varphi,$$

where $e$ and $f$ are the lengths of the diagonals.
Thus, the area is equal to half the product of the diagonals and the sine of the angle between them.


Cyclic quadrilateral. In a cyclic quadrilateral the sum of a pair of opposite angles is $180^\circ$, that is,
$\alpha + \gamma = 2\varepsilon = 180^\circ$, $\varepsilon = 90^\circ$, $\cos\varepsilon = 0$.

Hence the general formula for the area simplifies to $A = \sqrt{(s - a)(s - b)(s - c)(s - d)}$.
Because the term $abcd\cos^2\varepsilon$ is never negative, the areas of all other quadrilaterals with the same
sides are less than that of the cyclic quadrilateral.


Among all quadrilaterals with sides a, b, c, d the cyclic quadrilateral has the greatest area. 
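
The area formula for the general quadrilateral, and the maximal property of the cyclic case, can be tried out with a small sketch. The function name `quadrilateral_area` is an assumption; `alpha_deg` is taken as the angle between the sides a and d and `gamma_deg` as the opposite angle between b and c, both in degrees.

```python
import math


def quadrilateral_area(a, b, c, d, alpha_deg, gamma_deg):
    """Area from four sides and the sum of two opposite angles:
    sqrt((s-a)(s-b)(s-c)(s-d) - a*b*c*d*cos^2(eps)), with 2*eps = alpha + gamma."""
    s = (a + b + c + d) / 2.0
    eps = math.radians((alpha_deg + gamma_deg) / 2.0)
    radicand = (s - a) * (s - b) * (s - c) * (s - d) - a * b * c * d * math.cos(eps) ** 2
    return math.sqrt(radicand)


square = quadrilateral_area(1.0, 1.0, 1.0, 1.0, 90.0, 90.0)    # cyclic: opposite angles sum to 180
rhombus = quadrilateral_area(1.0, 1.0, 1.0, 1.0, 60.0, 60.0)   # same sides, not cyclic
print(square, rhombus)
```

With equal sides, the rhombus with 60 degree angles has the smaller area, in agreement with the statement that the cyclic quadrilateral is the largest.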


Physics 


All physical quantities that can be represented by vectors (for example, force or velocity) require 
the use of trigonometric functions for their calculation. 


Example 1: A beam B is fixed at right angles to a wall (Fig.) and a load of $f$ lb. wt. hangs from
the free end. The beam is supported by either a) a tie T or b) a strut S making an angle $\alpha$ or $\beta$,
respectively, with the beam. Find the forces (tension or thrust) occurring in B and T or S, respectively.

The weight of the load $f$ is the resultant of two forces, one in the direction of the beam B and
the other in a) the direction of the tie T or b) the direction of the strut S. Since $f$ is perpendicular
to B, the triangles $T_1T_2T_3$ and $S_1S_2S_3$ are right-angled; a) there is a thrust $d = f\cot\alpha$ in the beam
and a tension $t = f/\sin\alpha$ in the tie; b) there is a tension $z = f\cot\beta$ in the beam and a thrust
$s = f/\sin\beta$ in the strut.


Example 2: An aircraft has an average speed $v_1 = 360$ m.p.h. and flies in the direction N $23.5^\circ$ E
from a place A to another place B distant 300 miles from A. The wind speed is $v_2 = 45$ m.p.h.
towards the direction N $18^\circ$ W. Find the course along which the aircraft must fly and the time
it takes to reach B.

With no wind the aircraft would reach B in 300/360 h = 5/6 h, that is, in 50 minutes. Because
of the side-wind it must fly in the direction N $\alpha_3$ E (Fig.). By using the parallelogram of velocities
the angle $(\alpha_3 - \alpha_1)$ in the triangle ACE can be calculated from three given quantities, namely,
the sides $v_1$ and $v_2$ and the angle $\angle AEC = \alpha = \alpha_1 + \alpha_2$. By the sine rule

$$\sin(\alpha_3 - \alpha_1) = \sin(\alpha_1 + \alpha_2)\,\frac{v_2}{v_1} \quad \text{and} \quad v = v_1\,\frac{\sin(180^\circ - \alpha_2 - \alpha_3)}{\sin(\alpha_1 + \alpha_2)};$$

since $\alpha_3 - \alpha_1 = 4.75^\circ$, $\alpha_3 = 28.25^\circ$, $v = 392.47$ m.p.h.

11.3-6 Beam: a) with a tie, b) with a strut
11.3-7 Course of an aircraft with side-wind
11.3-8 Height and distance of a lightning flash

The aircraft flies in the direction N $28.25^\circ$ E and at a speed of $v = 392.47$ miles per hour reaches
B in approximately 46 minutes.
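
The course calculation can be sketched as follows. The function name `wind_correction` and its argument names are assumptions; `alpha1_deg` is the bearing of the track east of north and `alpha2_deg` the direction of the wind west of north. Evaluating the sine-rule expressions at full precision gives a ground speed of about 392.5 m.p.h., which is cross-checked here against the sum of the components of the two velocities along the track.

```python
import math


def wind_correction(v_air, alpha1_deg, v_wind, alpha2_deg, distance):
    """Return (heading east of north in degrees, ground speed, flight time in minutes)."""
    theta = math.radians(alpha1_deg + alpha2_deg)           # angle between track and wind
    drift = math.asin(v_wind * math.sin(theta) / v_air)     # alpha3 - alpha1, by the sine rule
    heading = alpha1_deg + math.degrees(drift)
    # the third angle of the velocity triangle is 180 deg - theta - drift
    v_ground = v_air * math.sin(math.pi - theta - drift) / math.sin(theta)
    return heading, v_ground, 60.0 * distance / v_ground


heading, v_ground, minutes = wind_correction(360.0, 23.5, 45.0, 18.0, 300.0)
# Cross-check: components of air velocity and wind velocity along the track
v_check = 360.0 * math.cos(math.radians(heading - 23.5)) + 45.0 * math.cos(math.radians(41.5))
print(round(heading, 2), round(v_ground, 2), round(minutes, 1))
```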

Example 3: A lightning flash is observed at an angle $\alpha$ to the horizontal and the thunderclap is
heard $t$ seconds later at the point of observation. It then follows that the flash occurs at a distance
$e = 333t$ m and at a height $h = 333t\sin\alpha$ m (Fig.). Here 333 m/s is the speed of sound, and because
the speed of light is $c = 300\,000$ km/s, the time taken for the light to travel to the point of
observation can be neglected.


Technology 


The laws of technology are applied laws of physics. The trigonometric functions and theorems 
occur in just the same way as soon as angles play a part. 


Crankshafts. In a crankshaft the position of the big end K is a function of the angle of rotation $\varphi$
of the crank (Fig.). If $r$ is the radius of the crank and $l$ the length of the connecting rod, then by the
cosine rule:

$$l^2 = x^2 + r^2 - 2xr\cos(180^\circ - \varphi) \quad \text{or} \quad x^2 + 2rx\cos\varphi = l^2 - r^2.$$

The solution of this quadratic equation gives

$$x = -r\cos\varphi + \sqrt{r^2\cos^2\varphi + l^2 - r^2} = -r\cos\varphi + \sqrt{r^2(\cos^2\varphi - 1) + l^2},$$
$$x = -r\cos\varphi + \sqrt{l^2 - r^2\sin^2\varphi}.$$

11.3-9 Crankshaft
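
The closed form for the distance can be checked against the cosine rule it was derived from. In this sketch the function name `crank_distance` and the sample dimensions are assumptions; the crank angle is in degrees.

```python
import math


def crank_distance(r, l, phi_deg):
    """Distance x for crank radius r, rod length l and crank angle phi:
    x = -r*cos(phi) + sqrt(l^2 - r^2*sin^2(phi))."""
    phi = math.radians(phi_deg)
    return -r * math.cos(phi) + math.sqrt(l * l - (r * math.sin(phi)) ** 2)


r, l = 1.0, 3.0                      # hypothetical dimensions
for phi_deg in (0.0, 37.0, 90.0, 180.0):
    x = crank_distance(r, l, phi_deg)
    # residual of the quadratic x^2 + 2*r*x*cos(phi) - (l^2 - r^2) = 0
    residual = x * x + 2.0 * r * x * math.cos(math.radians(phi_deg)) - (l * l - r * r)
    print(phi_deg, round(x, 6), abs(residual) < 1e-9)
```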




Length of a driving belt. If two pulleys of radii R and r have axes at a distance a apart, one can
calculate the length L of the driving belt that lies taut around them both (Fig.): t² = a² − (R − r)²,
cos α = (R − r)/a or α = Arccos [(R − r)/a] in radians. From this it follows that

L = 2t + K + k = 2√[a² − (R − r)²] + R(2π − 2α) + r · 2α,
L = 2{√[a² − (R − r)²] − α(R − r) + Rπ}.
For the case r = R/2 and a = 2R one obtains L = 8.838R.
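As a check on the belt formula, the following sketch computes L = 2{√[a² − (R − r)²] − α(R − r) + Rπ} and reproduces the case r = R/2, a = 2R quoted in the text. The function name is an assumption.

```python
import math


def belt_length(R, r, a):
    """Length of an open (uncrossed) taut belt around pulleys of radii R >= r
    whose axes are a distance a apart."""
    t = math.sqrt(a * a - (R - r) ** 2)   # straight tangent section
    alpha = math.acos((R - r) / a)        # in radians, as in the text
    return 2.0 * (t - alpha * (R - r) + R * math.pi)


if __name__ == "__main__":
    R = 1.0
    print(round(belt_length(R, R / 2, 2 * R), 3))  # multiple of R for the text's case
```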


11.3-10 Length of a driving belt

11.3-11 Parallelogram of forces

The parallelogram of forces. A street lamp is suspended from two ropes of unequal length that are
inclined to the horizontal at angles α and β, respectively. If the sag of the ropes can be neglected,
the tensions S₁ and S₂ in the ropes can be calculated using the sine rule (Fig.). Bearing in mind


252 11. Plane trigonometry 


that sin (90° − x) = cos x, one obtains:
S₁ = F cos β/sin (α + β), S₂ = F cos α/sin (α + β).
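The rope tensions can be verified against the equilibrium conditions. In the sketch below F is taken to be the weight of the lamp (an assumption, since the text defines F only through the figure), and the function name and sample values are illustrative.

```python
import math


def rope_tensions(F, alpha_deg, beta_deg):
    """Tensions S1, S2 in two ropes inclined at alpha and beta to the horizontal
    that together carry a vertical load F."""
    alpha = math.radians(alpha_deg)
    beta = math.radians(beta_deg)
    denom = math.sin(alpha + beta)
    return F * math.cos(beta) / denom, F * math.cos(alpha) / denom


if __name__ == "__main__":
    S1, S2 = rope_tensions(200.0, 20.0, 35.0)
    print(round(S1, 2), round(S2, 2))
```

The result satisfies S₁ cos α = S₂ cos β (no net horizontal force) and S₁ sin α + S₂ sin β = F (the load is carried), which is what the triangle of forces expresses.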


Motion on an inclined plane. A body of weight W lies on a plane inclined at an angle α to the
horizontal. It is required to find the force F₁ in the direction of the plane (Fig.) that will move
the body up the plane with constant speed and the force F₂ in the direction of the plane that is
necessary to prevent the body from sliding down. The force of friction R is proportional to the normal
reaction N between the body and the plane; R = μN, where μ is called the coefficient of friction.
Putting μ = tan ρ, the angle of friction ρ is the inclination of a plane on which the given body just
fails to slide down. In the triangle of forces (Fig.) the force F is increased in the first case and
decreased in the second case by the force of friction R = N tan ρ to give F₁ = F + R and F₂ = F − R,
respectively. By the sine rule:

F₁/W = sin (α + ρ)/sin (90° − ρ), F₁ = W sin (α + ρ)/cos ρ
and
F₂/W = sin (α − ρ)/sin (90° + ρ), F₂ = W sin (α − ρ)/cos ρ.
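The two forces on the inclined plane can be compared with the elementary decomposition F = W sin α, N = W cos α. The sketch below assumes exactly that decomposition (it is implied by the triangle of forces but not written out in the text); names and sample values are illustrative.

```python
import math


def incline_forces(W, alpha_deg, mu):
    """Return (F1, F2): the force along the plane needed to move the body up at
    constant speed, and the force needed to keep it from sliding down."""
    alpha = math.radians(alpha_deg)
    rho = math.atan(mu)  # angle of friction, mu = tan(rho)
    F1 = W * math.sin(alpha + rho) / math.cos(rho)
    F2 = W * math.sin(alpha - rho) / math.cos(rho)
    return F1, F2


if __name__ == "__main__":
    F1, F2 = incline_forces(100.0, 30.0, 0.2)
    print(round(F1, 3), round(F2, 3))
```

When α equals the angle of friction ρ the holding force F₂ vanishes, in agreement with the definition of ρ.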


11.3-12 Motion on an inclined plane with the corresponding triangle of forces

11.3-13 Forces on a fixed pulley


Forces on a fixed pulley. To calculate the friction between a fixed pulley and the shaft one must
obtain the resultant force F_r of the load Q and the applied force F. If the rope wrapped round the
pulley subtends an angle γ at the centre, then F_r is given by the cosine rule (Fig.):

F_r = √(F² + Q² − 2FQ cos γ).

For a smooth pulley the rope tensions F and Q are equal to one another, and since 2 − 2 cos γ
= 2(1 − cos γ) = 4 sin² (γ/2), the equation can be simplified to give F_r = 2Q sin (γ/2). For γ = 180°,
F and Q are parallel and F_r = 2Q.
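A short sketch of the pulley resultant follows. It evaluates the cosine-rule expression and, for equal rope tensions, can be compared with the simplified form 2Q sin (γ/2). The function name and the numbers are assumptions for illustration.

```python
import math


def pulley_resultant(F, Q, gamma_deg):
    """Resultant of the applied force F and the load Q when the rope subtends
    an angle gamma (degrees) at the centre of the pulley."""
    gamma = math.radians(gamma_deg)
    return math.sqrt(F * F + Q * Q - 2.0 * F * Q * math.cos(gamma))


if __name__ == "__main__":
    Q = 10.0
    for gamma_deg in (60, 120, 180):
        general = pulley_resultant(Q, Q, gamma_deg)
        smooth = 2.0 * Q * math.sin(math.radians(gamma_deg) / 2.0)
        print(gamma_deg, round(general, 6), round(smooth, 6))
```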


Navigation 


To determine the position of a ship and its path on the sea the earth is regarded as a sphere. The 
calculations must therefore be based on the laws of spherical trigonometry. Methods referring 
to stars or satellites for determining position are also based on them. Smaller regions, for example 
for journeys near a coast, may be taken as plane. The location and relative distances of characteristic 
points marked on a coastal map are assumed to be known. 

The direction of motion of a ship, its course, is fixed by the angle between the direction of its 
keel and a fixed reference direction. The true course is measured from the geographic north towards 
the east up to 360°; the magnetic course is measured from the magnetic north in the same sense.
The difference between the two, that is, the angle between the 
geographical and the magnetic meridian, is called the declina- 
tion. On an iron ship with its own magnetic field the compass 
meridian deviates from the magnetic meridian by an angle 
called the deviation, which depends on the position of the 
ship and its course. The compass course so determined is, 
however, measured directly. The course is given as an angle 
between points of the compass, for example as N 35° E, read
north 35 degrees east.


: bf EX ily y é, d FE . om a orth 
53° E and a church tower Kin Seecae a 
‘ | "| : Tu! in BOL r ACCOTOIN B to 33 neuen! males 


11.3-14 Ship’s course 




a) Find the distances _ = x and |FL| = y (Fig.);
b) what course must the ship maintain if it is to pass by the lighthouse at a distance c = 4 nautical
miles = 7.408 km? (Calculation of the circle of danger).


From the given values α = 55.3°, β = 28.5°, γ = 84.7°, s = 33.25 km one obtains

δ = 180° − (α + γ) = 40°, ε = γ − β = 56.2°, x ≈ 21.49 km ≈ 11.6 nautical miles,
y ≈ 27.77 km ≈ 15 nautical miles, sin φ = c/y, φ = 15.47°.

Ship’s course: S (α + φ)° E, that is, S 70.77° E.


Trigonometric determination of heights 


In practice angles in a horizontal plane can be measured with greater precision than those in a 
vertical plane, because light does not travel in a straight line through air of variable density. In


addition to this terrestrial refraction, for distances over 200 m the curvature of the earth must also 
be taken into account. 


Schematic construction of the theodolite. The theodolite is the instrument used for measuring 
angles in surveying. The many instruments used for special applications are all based on a simple 
principle. A vertical hollow spindle AS is attached to three levelling screws Sc which rest on a base 
plate, often fixed to a tripod (Fig.). The spindle AS carries a horizontal circular disc D with a ring 
calibrated in the clockwise sense, called the limb L. The alidade A is a circular disc, free to rotate
in the spindle, having two diametrically opposite pointers and carrying a spirit level sL and two
supports Sp for the telescope axis aT. Rigidly attached to this axis are the telescope T and the vertical
circle V for measuring angles of height. The alidade is made horizontal by means of the levelling
screws and the spirit level, and in a good theodolite the axis a about which the alidade turns must 
then be vertical. The axis aT is then horizontal and the telescope T is at right angles to it. Methods 
exist to eliminate small deviations from these conditions by preliminary measurements (called 
adjustments) or to determine their magnitude in order to allow for them in the measurement proper. 
The quality of a theodolite depends essentially upon the accuracy of the two scales on the horizontal 
and vertical circles as well as on the reading device R, which could be a pointer, a vernier or a micro- 
scope. Accordingly the angles can be read, for example, to within minutes or seconds. To increase 
the precision of the angle measurement, a definite procedure is adopted and the readings repeated. 
The telescope is considered to point correctly towards the object in question if its image, or a charac- 
teristic portion of it, coincides with the cross hairs of the instrument (in the simplest case a horizon- 
tal and a vertical line). It follows from the construction of the theodolite that horizontal angles can 
also be measured even if the objects aimed at are at different heights. The reference position of the 
horizontal scale is immaterial, since the horizontal angle is always obtained as the difference of two 
readings. On the other hand, for the measurement of vertical angles the zero reference direction 
must be adjusted horizontally by using a spirit level. 


11.3-15 Theodolite (schematic) 


11.3-16 Tacheometrical levelling 
for a horizontal sighting 


Tacheometrical levelling. Tacheometry means fast measurement. It is used to determine from the
position and height of a known point P the positions and heights of a whole series of new points
merely from theodolite readings. It can, for example, be used to investigate the surface geometry
of terrain as a basis for a building project. For this purpose there are two horizontal hairs, an upper
one u and a lower one l, parallel to the horizontal cross hair m of the telescope and at equal distances
p/2 above and below it, respectively. A rod (called a levelling staff) is set up vertically at the new
point N and the images of the three hairs mark out three points L_u, L_m and L_l on the rod (Fig.).




The difference L_u − L_l between the upper and lower readings is the rod section s. Because of the
inversion of the image the greater numbers on the levelling staff lie at the top. The rod section s
appears from the instrument to have an angle of parallax ε. Depending on the distance p between
the horizontal hairs, the focal length f of the object lens and the path of the light in the telescope,
the horizontal distance a of the staff from the theodolite for a horizontal sighting can be found in
the form a = Cs, in which the instrument constant C usually has the round value 100. For the angle
of parallax ε it then follows that tan (ε/2) = s/(2a) = 1/(2C). If the line of sight from the telescope F
to the middle reading L_m is inclined at an angle α to the horizontal (Fig.), then the horizontal
distance a′ can be determined by the following trigonometric calculation:


s = |HL_u| − |HL_l| = a′ [tan (α + ε/2) − tan (α − ε/2)]
  = a′ [sin (α + ε/2) cos (α − ε/2) − sin (α − ε/2) cos (α + ε/2)] / [cos (α + ε/2) cos (α − ε/2)]
  = a′ sin ε / [cos² α cos² (ε/2) − sin² α sin² (ε/2)],

a′ = s [cos² α cos² (ε/2) − sin² α sin² (ε/2)] / [2 sin (ε/2) cos (ε/2)]
   = (s/2) cos² α / tan (ε/2) − (s/2) sin² α tan (ε/2).


11.3-17 Tacheometrical levelling for 
an inclined sighting 


Finally, the second term can be neglected because s and sin² α are small and C = 100. For the
horizontal distance this gives a′ = Cs cos² α and from this approximate value the height h = |HL_m|
is given at once by h = a′ tan α = ½ Cs sin 2α. The difference Δh between the heights of the points P
and N depends not only on h but also on the height i of the theodolite and on the length of the rod
|NL_m| = z above the point N: Δh = i + h − z.
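The reduction formulas for an inclined sighting can be sketched as follows. The code evaluates a′ = Cs cos² α, h = ½ Cs sin 2α and Δh = i + h − z, and compares the approximate distance with the two-term expression obtained before the second term is dropped (using tan (ε/2) = 1/(2C)). Function names and the sample readings are assumptions.

```python
import math


def tacheometric_reduction(s, alpha_deg, i, z, C=100.0):
    """Return (a_prime, h, delta_h) from rod section s, inclination alpha,
    instrument height i and middle reading z (all lengths in metres)."""
    alpha = math.radians(alpha_deg)
    a_prime = C * s * math.cos(alpha) ** 2
    h = 0.5 * C * s * math.sin(2.0 * alpha)
    return a_prime, h, i + h - z


def horizontal_distance_two_terms(s, alpha_deg, C=100.0):
    """a' including the small second term, with tan(eps/2) = 1/(2C)."""
    alpha = math.radians(alpha_deg)
    t = 1.0 / (2.0 * C)
    return 0.5 * s * math.cos(alpha) ** 2 / t - 0.5 * s * math.sin(alpha) ** 2 * t


if __name__ == "__main__":
    a_prime, h, dh = tacheometric_reduction(0.8, 10.0, 1.5, 1.2)
    print(round(a_prime, 3), round(h, 3), round(dh, 3))
    print(round(horizontal_distance_two_terms(0.8, 10.0), 3))
```

For these sample values the neglected term amounts to well under a millimetre, which illustrates why it is dropped.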


Calculation of heights with the help of vertical angles. In the following examples terrestrial refraction 
will be neglected, either because the sighting distance does not exceed 200m or because a lesser 
precision in the results (decimetres or even yards) is acceptable. 

Horizontal base line and vertical angles. If a horizontal base line |AB| = s in the direction to the
point G in the terrain is measured, and also the vertical angles α and β of G from the end-points A
and B (Fig.), then the height h of G can be calculated by the sine rule:
1. γ = β − α; 2. u = s sin α/sin γ and 3. h = u sin β, so that h = s sin α sin β/sin (β − α).

Inclined base line. The angle β of inclination of a base line |AB| = s lies in a vertical plane passing
through G, and α and γ are the vertical angles of G from A and B (Fig.). The difference in height
between A and G can be calculated by the sine rule. If |AG| = x, then h = x sin α, where
x = s sin ε/sin σ, ε = β + (180° − γ) and σ = γ − α. By substitution one obtains
h = s sin ε sin α/sin σ or h = s sin (γ − β) sin α/sin (γ − α).
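Both formulas for a base line lying in the vertical plane through G can be tested on constructed data. The sketch below places A, B and G at known positions in a vertical plane, derives the angles that would be observed, and recovers the height; the function names and the test geometry are assumptions made for this illustration.

```python
import math


def height_horizontal_base(s, alpha_deg, beta_deg):
    """h = s sin(alpha) sin(beta) / sin(beta - alpha); B is the nearer end-point."""
    a, b = math.radians(alpha_deg), math.radians(beta_deg)
    return s * math.sin(a) * math.sin(b) / math.sin(b - a)


def height_inclined_base(s, alpha_deg, beta_deg, gamma_deg):
    """h = s sin(gamma - beta) sin(alpha) / sin(gamma - alpha), height of G above A."""
    a, b, g = (math.radians(v) for v in (alpha_deg, beta_deg, gamma_deg))
    return s * math.sin(g - b) * math.sin(a) / math.sin(g - a)


if __name__ == "__main__":
    # Horizontal base: A at 0 m, B at 100 m, G 200 m from A and 100 m high.
    alpha = math.degrees(math.atan2(100.0, 200.0))
    beta = math.degrees(math.atan2(100.0, 100.0))
    print(round(height_horizontal_base(100.0, alpha, beta), 6))

    # Inclined base: A = (0, 0), B = (100, 20), G = (300, 150).
    s = math.hypot(100.0, 20.0)
    beta_incl = math.degrees(math.atan2(20.0, 100.0))
    alpha2 = math.degrees(math.atan2(150.0, 300.0))
    gamma = math.degrees(math.atan2(150.0 - 20.0, 300.0 - 100.0))
    print(round(height_inclined_base(s, alpha2, beta_incl, gamma), 6))
```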

Base line in an arbitrary direction relative to the line of sight. The vertical through G meets the 
horizontal plane through the horizontal base line |AB| = s in the point F. From the end-points A and


11.3-18 Trigonometrical determination of heights with a horizontal base line in a vertical plane through the summit

11.3-19 Trigonometrical determination of heights with an inclined base line in a vertical plane through the summit

11.3-20 Trigonometrical determination of heights with a horizontal base line




B the angles ∠FAB = γ and ∠FBA = δ are measured and also the angle of elevation ε of G from A
(Fig.). The plane AFG is perpendicular to the horizontal plane FAB. If |AF| = z, then h = z tan ε,
where z = s sin δ/sin σ = s sin δ/sin [180° − (γ + δ)] = s sin δ/sin (γ + δ). By substitution one
obtains: h = s sin δ tan ε/sin (γ + δ).
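For a base line in an arbitrary direction, the result h = s sin δ tan ε/sin (γ + δ) can again be checked on constructed data. In the sketch below the positions of A, B and the foot F are chosen freely (hypothetical values), the angles are derived from them, and the height is recovered.

```python
import math


def height_oblique_base(s, gamma_deg, delta_deg, epsilon_deg):
    """Height of G above the horizontal plane through the base line AB."""
    g, d, e = (math.radians(v) for v in (gamma_deg, delta_deg, epsilon_deg))
    z = s * math.sin(d) / math.sin(g + d)   # horizontal distance |AF|
    return z * math.tan(e)


if __name__ == "__main__":
    # A = (0, 0), B = (100, 0), F = (30, 40) in the horizontal plane, so |AF| = 50.
    gamma = math.degrees(math.atan2(40.0, 30.0))          # angle FAB
    delta = math.degrees(math.atan2(40.0, 100.0 - 30.0))  # angle FBA
    epsilon = 45.0                                        # elevation of G from A
    print(round(height_oblique_base(100.0, gamma, delta, epsilon), 6))
```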

If the base line |AB| = s is inclined to the horizontal at an angle ε₁
with B above A (Fig.), if α and β, respectively, are the horizontal
angles between the base line and the point G measured from A and
B, and if ε and ε₂ are the vertical angles of G measured from A
and B, respectively, then the problem can be reduced to one of
the type already solved. In the triangle AHB′ in the horizontal
plane through A the side s′ = s cos ε₁ and the angles α and β are
known. By the sine rule it follows that a′ = s′ sin α/sin (α + β);
b′ = s′ sin β/sin (α + β). From the triangle AHG lying in a
vertical plane one can then deduce that: h = b′ tan ε
= s′ sin β tan ε/sin (α + β) = s cos ε₁ sin β tan ε/sin (α + β). As
a check one has h = h₁ + h₂, where h₁ = s sin ε₁ and h₂ = a′ tan ε₂
= s′ sin α tan ε₂/sin (α + β) = s cos ε₁ sin α tan ε₂/sin (α + β).


11.3-21 Trigonometrical determination of heights with an inclined 
base line 


Surveying 

The ultimate aim of surveying is to fix uniquely any desired point of the earth’s surface. This is 
achieved by means of coordinates or in pictorial form in maps. As a first approximation the earth’s 
surface is regarded as the surface of a sphere on which the position of a point is fixed by the longitude λ
and the latitude φ, that is, by the intersection of a great circle through the north and south poles
(a meridian) with a latitude circle. The meridian through Greenwich is taken as the zero meridian
and all other meridians are measured by their angular distance λ from this one, where λ is between
0° and 180° towards the east or the west. The latitude circles are small circles parallel to the equator.
Their distance φ from the equator is measured in degrees of angle on a meridian from 0° at the equator
to 90° at the pole on each side, giving northern and southern latitudes. These coordinates are spherical 
coordinates and are determined by astronomical measurements, as will be shown in Chapter 12. 


Gauss-Krüger or transverse Mercator projection. A cylindrical or a conical surface can be cut
along a generator and rolled out, or developed, onto a plane. In this process all lengths, areas, and
angles remain unchanged; they therefore appear in the map with their true size. The surface
of a sphere is not developable. Hence, following C. F. Gauss and the one-time Director of
the Potsdam Institute of Geodesy, J. H. L. Krüger (1857-1923), lunes (meridian strips) whose
bounding meridians include an angle of 6° at the poles are mapped onto the surface of a cylinder 
that touches the lune along its mean meridian (making an angle of 3° with each bounding meridian). 
The figure shows approximately how narrow these strips which cover the whole earth are. It is 
therefore understandable that lengths and areas deviate only a little from their true values. A plane 
map of the meridian strip is then obtained by developing the cylinder. The meridian of contact m 
belongs both to the sphere and to the cylinder, and therefore appears in the plane with its true 

length. If the distance from the equator of a point on this meridian is given by the angle ξ measured in
radians then the corresponding point in the plane has coordinate x = ξR, where R is the radius of the
earth. Great circles that cut the meridian of contact at right angles and therefore have as poles the
points of intersection A₁, A₂ of the axis of the cylinder with the sphere (Fig.) give rise to lines on the
map that cut the x-axis at right angles, that is, to generators of the cylinder. Distances of a point P of
the sphere from the meridian of contact m are measured on these orthogonal great circles by the
angle η; the image point P′ has the corresponding distance y from the x-axis. The relationship between
η and y can be derived from the requirement that the Gauss-Krüger projection is angle-preserving (or
conformal,

[Figure: Gauss-Krüger coordinates]



see Chapter 23.). The conformal property is ensured if triangles on the sphere are mapped
onto similar triangles and the magnification for lines is the same in all directions. A small circle
k through the point P that cuts all great circles through the poles A₁ and A₂ at right angles is
regarded from these poles as a latitude circle of latitude η. The orthogonal great circles play the
part of meridians and the meridian of contact that of the equator. The length of the small circle is
therefore l = 2πR cos η. On the other hand, on the cylinder and in the plane L = 2πR, so that
the magnification is L/l = 1/cos η. By the conformal property the magnification is equal to dy/dη.
Consequently dy/dη = 1/cos η, or dy = dη/cos η. By integration it follows that:
y = ln tan (π/4 + η/2) = (1/M) lg tan (π/4 + η/2), where 1/M = 1/lg e = 2.3025851

(see Chapter 2.). 
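The integration step can be confirmed numerically. The sketch below compares y = ln tan (π/4 + η/2) with a midpoint-rule integration of dη/cos η, and also checks the constant 1/M. The text states the formula for a sphere of unit radius; the optional `radius` factor is an assumption added here to obtain lengths, and all names are illustrative.

```python
import math


def departure(eta, radius=1.0):
    """Closed form y = R ln tan(pi/4 + eta/2), eta in radians."""
    return radius * math.log(math.tan(math.pi / 4.0 + eta / 2.0))


def departure_numeric(eta, radius=1.0, steps=10000):
    """Midpoint-rule integration of dy = R d(eta) / cos(eta) from 0 to eta."""
    h = eta / steps
    return radius * sum(h / math.cos((k + 0.5) * h) for k in range(steps))


if __name__ == "__main__":
    eta = math.radians(3.0)  # edge of a 6 degree meridian strip
    print(departure(eta), departure_numeric(eta))
    print(1.0 / math.log10(math.e))  # the constant 1/M
```

At η = 3°, the edge of a 6° strip, y differs from η itself by only a few parts in ten thousand, which matches the remark that lengths deviate only a little from their true values.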


North directions. Fig. 11.3-23 shows an angle γ at the point P between the meridian to the
north pole N and the small circle k parallel to the meridian of contact m. This angle is called the
convergence of the meridian and gives the deviation of the grid north from the geographical north
for the point P. Geographical north (or true north) is the direction from the point P along the meridian
towards the north pole. Grid north is the direction from the image point P′ in the Gauss-Krüger
plane parallel to the x-axis. On the sphere it corresponds to the direction of the tangent at P to the
small circle k. Accordingly it is possible to fix the direction of a line in different ways. The azimuth α
of a line at one of its points P is the angle between the true north and the line, measured in the
clockwise sense from the meridian. The angle between grid north and the line measured in the same
sense is called the direction angle ν and is usually denoted in terms of two points P₁ and P₂ of the
line in the form ν = (P₁P₂), where (P₁P₂) = (P₂P₁) + 180°. For the sake of completeness it should
be mentioned that sometimes a third north direction, magnetic north, is used; it differs from true 
north by the deviation of a compass needle. With respect to magnetic north one speaks of the declina- 
tion of the line. 


Latitudes and departures. The x-value in Gauss-Krüger coordinates is measured along the meridian
of contact towards the north (or south); it is called the latitude and gives the true distance from the
equator. The y-value is called the departure. Positive y-values denote points that lie to the east of 
the meridian of contact m. In order to avoid negative y-coordinates, the contact or mean meridian 
is not given the y-coordinate 0 m, but 500000 m; at the same time the meridian strip on the sphere 
from which the map in question has been derived is given by a characteristic number in front of 
this value. This characteristic number is 1 for the meridian of contact 3° (1500000 m), 2 for 9° 
(2500000 m), 3 for 15° (3500000 m) and so on: (λ_m + 3)/6. A point that lies 65370 m east of the
meridian 3° thus has the y-coordinate 1500000 m + 65370 m = 1565 370 m. This number is called 
the right value. For a point that lies 74 250 m west of the meridian 9°, the right value is 2500000 m 
— 74250 m = 2 425 750 m. Conversely, one reaches a point with right value 4374 981 m and latitude 
5 755 899 m by going 5755899 m to the north from the equator on the 21°-meridian (because 
4-6 — 3 = 21) and then in the perpendicular direction through 500000 m — 374 981 m = 125019 m 
to the west. On topographical maps the right values and latitudes are given only in whole kilometres 
and the first two numbers are written as superscripts, for instance, as ⁴³74 and ⁵⁷55 for the last
example. Since the distortions in lengths are greatest near the bounding meridians, the coordinates 
of important points are calculated in addition by using strips of width 0.5° on each side of each 
bounding meridian. Thus, for points of a strip 1° wide, about 70 km wide at latitude 52°, every pair 
of coordinates is available both for westerly and for easterly meridian strips. 
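The bookkeeping for right values in 6° strips can be sketched in a few lines. The code reproduces the three numerical cases given in the text; the function names are assumptions, and offsets are taken as positive towards the east.

```python
def right_value(contact_meridian_deg, east_offset_m):
    """Right value for a point east_offset_m metres east (negative: west) of the
    meridian of contact of a 6 degree strip (3, 9, 15, ... degrees)."""
    if (contact_meridian_deg + 3) % 6 != 0:
        raise ValueError("contact meridian of a 6 degree strip must be 3, 9, 15, ...")
    characteristic = (contact_meridian_deg + 3) // 6
    return characteristic * 1_000_000 + 500_000 + east_offset_m


def decode_right_value(value_m):
    """Return (contact meridian in degrees, offset east of it in metres)."""
    characteristic, rest = divmod(value_m, 1_000_000)
    return 6 * characteristic - 3, rest - 500_000


if __name__ == "__main__":
    print(right_value(3, 65370))
    print(right_value(9, -74250))
    print(decode_right_value(4374981))
```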

On topographical maps and for geodesy meridian strips of width 3° instead of 6° are also used. 
The same considerations and notation hold for these; only the characteristic numbers are different. 
They are 1, 2, 3, ..., (λ_m/3) for the contact meridians 3°, 6°, 9°, ..., λ_m.


Triangulation. Only for a few points does one determine the geographical coordinates and the
azimuth of line segments between them and convert them into Gauss-Krüger coordinates. Other
characteristic points, called trigonometric points (TP), are connected with them by means of a
triangular framework of the first order in which almost all angles are measured. In the schematic
figure the geographical coordinates of the four points characterized by the north direction N are
determined and the azimuth α of a line segment between any two of them (Fig.). The length of this line
is calculated by means of a basis framework. The framework connects each such line with a basis 5 
from 4 to 10 km long on level ground, which is measured with great accuracy using free hanging 
invar wires of length 24 m at a tension equal to the weight of 5 kg (≈ 11 lb.). A mean error is achieved
of 8 mm in 10 km, or 1 in 1250000. The sides of the triangular framework of the first order are on
average 40 to 70 km long; they are marked in thick lines in the figure. Between the trigonometric
points of the first order thus determined in Gauss-Krüger coordinates the points of the framework
of the second order are superimposed, merely by angle measurements. Its sides are on average 20 km 
long, those of the framework of the third order 5 to 10 km long, and finally those of the framework 
of the fourth order 2 to 5 km long. These trigonometric points of the first to the fourth order are 
marked by a granite plate engraved with a cross and sunk in the ground, and a vertical rectangular 




11.3-24 Triangular framework 


granite pillar with a cross standing over it (Fig.). In order to be able to see the TP from a greater
distance and perform observations upon it a signal is erected over it (Fig.) which at the same time
forms a good protection from damage. Triangulation frameworks whose triangles are arranged in a
strip are called triangular chains. A triangular chain along a meridian of the earth was once used to
determine the shape of the earth. As a result of the development of radar and other methods, with
the help of electromagnetic waves, of measuring distances with greater precision, trilateration has
become possible as an alternative to triangulation. While astronomers up to now have tried to
determine the distance to a planet, usually Venus, by measuring angles from two places on the earth
separated by at most a distance equal to the diameter of the earth, they are now determining these
distances from the time taken for a radar signal to travel. The aim of this investigation is to
determine more accurately than before the astronomical unit, the distance between the earth and the
sun.

11.3-25 Trigonometric point (TP)

11.3-26 Signal


Bench mark systems. Two lines radiating from the centre of a series of concentric spheres intersect 
the surface of each sphere in two points. The greater the radius of the sphere, the longer the line 
joining the corresponding points of intersection. From this geometrical fact it follows that between 
two plumb lines, each hanging in a deep shaft, one measures different distances according to the
height at which the measurements are made. The distance between two earth radii is greater in the 
mountains than at sea level. Consequently in surveying all lengths must be calculated at the same 
height, at sea level. Hence the height of each calculated point above a zero level must be measured. 
For this purpose a network of points of known height, called bench marks, is determined. The datum 
of the Ordnance Survey is determined by ‘the approximate mean water at Liverpool’, and all 
levels on the maps of that survey are altitudes above this datum. 

To measure differences of height instruments called levels are used. The telescope axis of the level
must be exactly parallel to the axis of a delicate spirit level, and thus exactly horizontal. A backwards
reading is then taken on a levelling staff L₁ provided with a centimetre scale and held vertically at a
point R, and then a forwards reading on a staff L₂ at the point V (Fig.). The difference r − v = d
between these two readings then measures how much higher V lies than R. Centimetres are read off
on the staff and millimetres are either estimated or measured as deviations on a parallel sliding plate.

11.3-27 Geometric levelling




In the lower part of the figure a chain of measurements with the level is indicated, whereby
at each intermediate point, for example D, one backwards reading is taken and then one forwards
reading after moving the level, for example from C to E. By algebraic addition of the differences d
one obtains the difference in height between A and B. Such a chain is called levelling, or double
levelling if the measurements are repeated. With a good level the mean error of a double levelling 
of length 1 km is +0.4 mm. 
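The algebraic addition of the differences d = r − v along a levelling chain is easily sketched. The readings below are invented for illustration, and the function name is an assumption.

```python
def height_difference(readings):
    """Sum of d = r - v over a chain of set-ups.

    readings: iterable of (backwards reading r, forwards reading v) in metres.
    A positive result means the end point lies higher than the starting point.
    """
    return sum(r - v for r, v in readings)


if __name__ == "__main__":
    chain = [(1.52, 0.87), (1.10, 1.95), (2.03, 0.44)]  # hypothetical staff readings
    print(round(height_difference(chain), 3))
```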


Determination of new points. Points with known coordinates, for example, trigonometric points,
are designated fixed points. Points whose position is to be determined are new points. 

Forward section. From two fixed points F₁ and F₂,
whose distance apart s is known, a new point N is to
be determined by angle measurement (Fig.). If the
theodolite can be set up only at the fixed points, one
speaks of a forward section. If only one of these
points is accessible, but on the other hand an angle
measurement from the new point is possible, the
procedure is called a sideways section. Of course,
one usually tries to measure all three angles in the
triangle F₁F₂N in order to use the angle sum as a
check on the angle observations. Geometrically one
side and three angles of the triangle F₁F₂N are
always known, and the remaining sides s₁ and s₂ can
be calculated using the sine rule:

s₁ = s sin α/sin γ, s₂ = s sin (360° − β′)/sin γ.


Geodetically the fixed points F₁ and F₂ are given by their latitudes x₁, x₂ and right values y₁, y₂. In
the right-angled triangle HF₁F₂ (Fig.) the direction angle (F₁F₂) and the length of the line segment
F₁F₂ can be calculated from the differences y₂ − y₁ and x₂ − x₁ of the coordinates. The direction
angle is measured from grid north in a clockwise sense. In the triangles F₁NH₁ and F₂H₂N, using
the lengths of the sides s₁ and s₂ already calculated, the coordinate differences Δy₁, Δx₁, Δx₂, Δy₂
can be determined. Added to the coordinates of F₁ or F₂ these give the coordinates of N. As a
check the coordinates of the new point are twice calculated.


Example: It is given that x₁ = 2524950.98, y₁ = 5711619.35 and x₂ = 2525616.57,
y₂ = 5710664.92, and the angles α = 61°13′33″ and β′ = 328°32′15″ are measured from the given
line segment s to the new point N.

1. Angle γ = 180° − α − (360° − β′):
   β = 360° − β′ = 31°27′45″, γ = 87°18′42″.

2. Direction angle (F₁F₂):
   tan (F₁F₂) = (y₂ − y₁)/(x₂ − x₁), y₂ − y₁ = −954.43, x₂ − x₁ = +665.59,
   |F₁F₂| = (x₂ − x₁)/cos (F₁F₂) = (y₂ − y₁)/sin (F₁F₂),
   (F₁F₂) = 304°53′24″, δ = 34°53′24″,
   tan (F₁F₂) = −cot δ, cos (F₁F₂) = sin δ, sin (F₁F₂) = −cos δ,
   |F₁F₂| = 1163.6.

3. Lengths of the sides s₁ and s₂:
   s₁ = |F₁F₂| sin α/sin γ = |F₂N|, s₁ = 1021.0 = |F₂N|,
   s₂ = |F₁F₂| sin β/sin γ = |F₁N|, s₂ = 608.0 = |F₁N|.

4. From F₁ to the new point:
   (F₁N) = (F₁F₂) + α, (F₁N) = 366°06′57″ = 6°06′57″,
   x_N − x₁ = |F₁N| cos (F₁N) = Δx₁, y_N − y₁ = |F₁N| sin (F₁N) = Δy₁,
   Δx₁ = +604.53, Δy₁ = +64.78,
   x_N = 2525555.51, y_N = 5711684.13.

5. From F₂ to the new point:
   (F₂N) = (F₂F₁) + β′, (F₂F₁) = 124°53′24″, (F₂N) = 453°25′39″ = 93°25′39″,
   x_N − x₂ = |F₂N| cos (F₂N) = Δx₂, y_N − y₂ = |F₂N| sin (F₂N) = Δy₂,
   Δx₂ = −61.04, Δy₂ = +1019.21,
   x_N = 2525555.53, y_N = 5711684.13.
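The five steps of the forward section example can be followed in code. The sketch below assumes, as in the example, that the x-axis points to grid north, that the y-axis points east, that direction angles are counted clockwise from grid north, and that N lies clockwise of the line F₁F₂ as seen from F₁. The function names and the helper `dms` are illustrative assumptions.

```python
import math


def dms(degrees, minutes, seconds):
    """Convert degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60.0 + seconds / 3600.0


def forward_section(f1, f2, alpha_deg, beta_deg):
    """Return N computed from F1 and, as a check, from F2.

    alpha is the angle at F1 from F1F2 to F1N, beta the interior angle at F2
    (beta = 360 deg - beta' in the notation of the example)."""
    (x1, y1), (x2, y2) = f1, f2
    s = math.hypot(x2 - x1, y2 - y1)
    t12 = math.atan2(y2 - y1, x2 - x1)       # direction angle (F1F2)
    alpha, beta = math.radians(alpha_deg), math.radians(beta_deg)
    gamma = math.pi - alpha - beta
    s1 = s * math.sin(alpha) / math.sin(gamma)   # |F2N|
    s2 = s * math.sin(beta) / math.sin(gamma)    # |F1N|
    t1n = t12 + alpha                            # (F1N) = (F1F2) + alpha
    t2n = t12 + math.pi - beta                   # (F2N) = (F2F1) - beta
    n_from_f1 = (x1 + s2 * math.cos(t1n), y1 + s2 * math.sin(t1n))
    n_from_f2 = (x2 + s1 * math.cos(t2n), y2 + s1 * math.sin(t2n))
    return n_from_f1, n_from_f2


if __name__ == "__main__":
    n1, n2 = forward_section(
        (2524950.98, 5711619.35),
        (2525616.57, 5710664.92),
        dms(61, 13, 33),
        360.0 - dms(328, 32, 15),
    )
    print([round(v, 2) for v in n1], [round(v, 2) for v in n2])
```

Computing N from both fixed points mirrors the double calculation used as a check in the example; the small difference between the two printed x-values in the text comes from rounding of the intermediate quantities.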




Backward section. If three fixed points F₁, F₂, F₃ are given and observation is possible
only from the new point N (Fig.), then one speaks of a backward section. The new point
must be chosen in such a way that it does not lie on the circumcircle of the triangle
F₁F₂F₃. The most accurate result is obtained if N lies in the interior of this triangle. Let
the angles of the triangle to be calculated be φ₁ at F₁, φ₂ at F₂ and φ₃ at F₃, and let
the angles measured from N be ∠F₂NF₃ = ν₁, ∠F₃NF₁ = ν₂ and ∠F₁NF₂ = ν₃. A solution
for machine calculation arises from the coordinates for the centre of gravity S of the
given triangle: s_x = (x₁ + x₂ + x₃)/3, s_y = (y₁ + y₂ + y₃)/3. In these formulae the
vertices have the same weight with respect to the medians. If one gives them different weights
g₁, g₂, g₃, this gives rise to different transversals through the vertices, which can lie outside
the triangle if some of the weights have negative values. The point of intersection N of the
transversals then has the coordinates

x = (g₁x₁ + g₂x₂ + g₃x₃)/(g₁ + g₂ + g₃), y = (g₁y₁ + g₂y₂ + g₃y₃)/(g₁ + g₂ + g₃).

11.3-29 Backward section

The weights can be obtained from the mechanical concept that with respect to a transversal
through one vertex the moments of the other two vertices must be equal. Let G be the point of
intersection of the transversal F₃N with the circumcircle, G₁ and G₂, respectively, the feet of the
perpendiculars h₁ and h₂ from F₁ and F₂ to this transversal. The moments of the vertices F₁ and F₂ are
equal: hence g₁h₁ = g₂h₂ or g₁/g₂ = h₂/h₁ = (h₂/|GN|)/(h₁/|GN|), where:

h₂/|GN| = h₂/(|GG₂| + |G₂N|) = 1/(|GG₂|/h₂ + |G₂N|/h₂) = 1/(cot φ₁ − cot ν₁),
h₁/|GN| = h₁/(|GG₁| + |G₁N|) = 1/(|GG₁|/h₁ + |G₁N|/h₁) = 1/(cot φ₂ − cot ν₂).

By cyclic permutation and substitution one obtains:

g₁ : g₂ : g₃ = 1/(cot φ₁ − cot ν₁) : 1/(cot φ₂ − cot ν₂) : 1/(cot φ₃ − cot ν₃).

Since an arbitrary proportionality factor makes no difference to the coordinates of the new
point, it may be taken equal to 1. The weights are then:

g₁ = 1/(cot φ₁ − cot ν₁), g₂ = 1/(cot φ₂ − cot ν₂), g₃ = 1/(cot φ₃ − cot ν₃).


From the coordinates of the fixed points F₁, F₂, F₃ (see forward section) one finds:
x₁ = 2524950.98, x₂ = 2525616.57, x₃ = 2525555.51,
y₁ = 5711619.35, y₂ = 5710664.92, y₃ = 5711684.14,
(F₁F₂) = 304°53′24″, (F₂F₃) + φ₂ = (F₂F₁), (F₂F₃) = 93°25′39″,
(F₃F₁) + φ₃ = (F₃F₂), (F₃F₁) = 186°06′57″, (F₁F₂) + φ₁ = (F₁F₃).

Calculated            Cotangent      Measured              Cotangent
φ₂ = 31°27′45″        1.63425        ν₂ = 153°12′22″       −1.98019
φ₃ = 87°18′42″        0.0469547      ν₃ = 97°20′08″        −0.128734
φ₁ = 61°13′33″        0.549176       ν₁ = 109°27′30″       −0.353300
Sum  180°00′00″                      Sum  360°00′00″

The numerical values of the weights are g₁ = 1.10806, g₂ = 0.27667, g₃ = 5.69188 and the
coordinates of the new point are x = 2525463.25, y = 5711634.13.

Backward section has particular significance for ships and aircraft in determining their own
positions.



Hansen’s problem. If two fixed but inaccessible points F₁ and F₂ are given, for example, the tops
of two towers, then two new points N₁ and N₂ can be determined if from each of them the other
new point and the directions to the two fixed points can be observed (Fig.). With the notation of
the figure the solution is obtained using the sine rule, provided that the angles φ and ψ can be
calculated. Since the angles σ in the two triangles N₁N₂S and F₁F₂S are equal, it follows that:
(φ + ψ)/2 = (α + γ)/2 = ε₁.

11.3-30 Hansen’s problem

Half the difference of the required angles can be found in the following way, using an auxiliary
angle:

ΔF₁F₂N₁: |N₁F₁| = |F₁F₂| sin ψ / sin β,
ΔF₁N₂N₁: |N₁N₂| = |N₁F₁| sin (α + β + γ) / sin γ
                = |F₁F₂| sin (α + β + γ) sin ψ / (sin β sin γ),
ΔF₁F₂N₂: |N₂F₂| = |F₁F₂| sin φ / sin δ,
ΔF₂N₂N₁: |N₁N₂| = |N₂F₂| sin (α + γ + δ) / sin α
                = |F₁F₂| sin (α + γ + δ) sin φ / (sin α sin δ),

sin φ / sin ψ = sin α sin δ sin (α + β + γ) / [sin β sin γ sin (α + γ + δ)] = cot η.

The auxiliary angle η is known up to a multiple of 180°. By addition and subtraction and using
trigonometric relationships, remembering that cot 45° = 1, one obtains:

(sin φ − sin ψ) / (sin φ + sin ψ) = (cot η − 1) / (cot η + 1),
2 cos [(φ + ψ)/2] sin [(φ − ψ)/2] / {2 sin [(φ + ψ)/2] cos [(φ − ψ)/2]}
    = (cot 45° cot η − 1) / (cot η + cot 45°),
tan [(φ − ψ)/2] = tan [(φ + ψ)/2] cot (45° + η),

and hence the value (φ − ψ)/2 = ε₂ is known. Then φ = ε₁ + ε₂, ψ = ε₁ − ε₂. Consequently
the line segments |F₁N₁|, |N₁N₂|, and |N₂F₂| can be calculated. For the direction angles one finds:
(F₁N₂) = (F₁F₂) + φ,
(N₂F₁) = (F₁N₂) + 180°,
(N₂F₂) = (N₂F₁) + δ,
(N₂N₁) = (N₂F₁) − γ,
(F₂N₁) = (F₂F₁) − ψ,
(N₁F₂) = (F₂N₁) + 180°,
(N₁F₁) = (N₁F₂) − β,
(N₁N₂) = (N₁F₂) + α.

From the direction angles and the lengths the coordinate differences can be found (see forward
section), and hence the series of coordinates for the points F₁ → N₁ → N₂ → F₂, which must
give the known values for the coordinates of F₂.
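The determination of φ and ψ from the four observed angles can be checked numerically. The following sketch is an illustration under stated assumptions: α is taken as the angle at N₁ between F₂ and N₂, β the angle at N₁ between F₁ and F₂, γ the angle at N₂ between N₁ and F₁, and δ the angle at N₂ between F₁ and F₂, as the triangle relations above suggest. The coordinates are invented test data, and the function names are hypothetical.

```python
import math

def angle_at(p, a, b):
    """Unsigned angle (radians) at vertex p between the rays p->a and p->b."""
    ux, uy = a[0] - p[0], a[1] - p[1]
    vx, vy = b[0] - p[0], b[1] - p[1]
    return abs(math.atan2(ux * vy - uy * vx, ux * vx + uy * vy))

def hansen_angles(alpha, beta, gamma, delta):
    """Return (phi, psi) in radians from the four observed angles."""
    eps1 = (alpha + gamma) / 2.0                      # half the sum of phi and psi
    cot_eta = (math.sin(alpha) * math.sin(delta) * math.sin(alpha + beta + gamma)) / (
        math.sin(beta) * math.sin(gamma) * math.sin(alpha + gamma + delta))
    eta = math.atan2(1.0, cot_eta)                    # auxiliary angle in (0, 180 deg)
    # tan((phi - psi)/2) = tan((phi + psi)/2) * cot(45 deg + eta)
    eps2 = math.atan(math.tan(eps1) / math.tan(math.pi / 4 + eta))
    return eps1 + eps2, eps1 - eps2

# Invented configuration: the "observations" are simulated from known coordinates.
F1, F2 = (0.0, 9.0), (11.0, 10.0)
N1, N2 = (1.0, 0.0), (8.0, -1.0)
alpha = angle_at(N1, F2, N2)
beta = angle_at(N1, F1, F2)
gamma = angle_at(N2, N1, F1)
delta = angle_at(N2, F1, F2)

phi, psi = hansen_angles(alpha, beta, gamma, delta)
phi_true = angle_at(F1, F2, N2)   # angle at F1 between F2 and N2
psi_true = angle_at(F2, F1, N1)   # angle at F2 between F1 and N1
print(math.degrees(phi), math.degrees(psi))
```

Because the observations are simulated from coordinates, the recovered angles can be compared directly with the angles measured in the constructed figure.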


Polygonal arcs. In addition to points already determined trigonometrically, the coordinates of
further points can be calculated by measuring lines and angles. If P₁, P₂, P₃, ..., Pₙ are the
vertices of a polygonal arc starting at the known point P₁, then the lines s₁ = |P₁P₂|,
s₂ = |P₂P₃|, and so on, are measured with a measuring tape and at every point the angle of
deviation β₁, β₂, ... is measured. This angle is the difference between the direction of the
preceding segment and that of the following one, measured in the clockwise sense (Fig.). For the
first measurement at P₁ the direction towards another fixed point F₁ is taken as the direction of
the preceding segment. By the measurement of the angle of deviation at P₁ the polygonal arc is
connected to a known direction. The accuracy of the polygonal arc measurement can be appreciably
increased if the coordinates of the last point Pₙ are known and a further fixed point F₂ can be
sighted from Pₙ. The polygonal arc then connects the two given directions (F₁P₁) and (PₙF₂).
The direction angles can then be calculated by adding the angles of deviation:

(P₁F₁) = (F₁P₁) + 180°, (P₁P₂) = (P₁F₁) + β₁,
(P₂P₁) = (P₁P₂) + 180°, (P₂P₃) = (P₂P₁) + β₂,

and so on. The coordinate differences Δxᵢ and Δyᵢ between the point Pᵢ and the point Pᵢ₊₁ are
given by transforming from polar coordinates (PᵢPᵢ₊₁), sᵢ to Cartesian coordinates; for example,

Δx₁ = x₂ − x₁ = |P₁P₂| cos (P₁P₂) = s₁ cos (P₁P₂) and
Δy₁ = y₂ − y₁ = |P₁P₂| sin (P₁P₂) = s₁ sin (P₁P₂).

11.3-31 Polygonal arc

The signs of the coordinate differences depend on the magnitudes of the direction angles; in the
figure these are denoted by νᵢ = (PᵢPᵢ₊₁); ν₁ = (P₁P₂) lies in the first quadrant, ν₂ = (P₂P₃)
and ν₃ = (P₃P₄) in the second quadrant; thus, Δx₁ is positive, but Δx₂ and Δx₃ are negative.
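The step-by-step rule for a polygonal arc lends itself to a small loop. The sketch below is a minimal illustration, assuming direction angles in degrees and the convention Δx = s cos, Δy = s sin used above; the function name `traverse` and the sample data are hypothetical.

```python
import math

def traverse(x1, y1, dir_f1_p1_deg, betas_deg, lengths):
    """Coordinates of P1, P2, ... from deviation angles and measured lengths.

    dir_f1_p1_deg is the known direction angle (F1 P1); each beta is added to
    the reversed direction of the preceding segment, as in the text.
    """
    points = [(x1, y1)]
    incoming = dir_f1_p1_deg
    x, y = x1, y1
    for beta, s in zip(betas_deg, lengths):
        back = (incoming + 180.0) % 360.0      # e.g. (P1 F1) = (F1 P1) + 180 deg
        forward = (back + beta) % 360.0        # e.g. (P1 P2) = (P1 F1) + beta1
        x += s * math.cos(math.radians(forward))
        y += s * math.sin(math.radians(forward))
        points.append((x, y))
        incoming = forward
    return points

# Invented data: two legs, the second turning by 90 degrees.
pts = traverse(0.0, 0.0, 0.0, [180.0, 270.0], [10.0, 5.0])
print(pts)
```

If the last point Pₙ is also known, the difference between the computed and the known coordinates gives a closing error that indicates the quality of the measurements.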


12. Spherical trigonometry 


12.1. Great circles, small circles and lunes
12.2. The spherical triangle
      The main theorems for the solution of the general spherical triangle
      The basic problems for the general spherical triangle
      The right-angled spherical triangle
12.3. Applications of spherical trigonometry
      Mathematical geography
      Spherical astronomy


As its name implies, spherical trigonometry is concerned with the solution of triangles on the 
surface of a sphere. It has developed from astronomy and navigation, with the task of determining 
the positions of points and the distances between them and also angles on the celestial sphere or 
on the surface of the earth, regarded as a sphere. The basis of the Gauss-Krüger coordinates, which
are important in surveying, is also obtained from astronomical measurements.


12.1. Great circles, small circles and lunes 


Every straight line through the centre M of a sphere cuts its surface in the extremities of a diameter
whose length is twice the radius R of the sphere. Every plane perpendicular to a diameter and at
a distance l (less than R) from the centre M cuts the sphere in a circle of radius r = √(R² − l²). If
this plane contains M, then the intersection is a great circle with r = R. For l = R one obtains a
tangent plane that has only one point in common with the sphere since r = 0.

A pencil of planes may be made to pass through two points A and 
B on a sphere that do not lie on a diameter, and this cuts the sphere 
in a pencil of circles (Fig.). Among these circles there is a smallest one, 
for which the line AB is a diameter, and a largest one whose centre 
coincides with the centre of the sphere. This single circle having a 
radius equal to that of the sphere is called a great circle; all others 
are small circles. If all the planes of the pencil, together with their 
circles of intersection, are rotated about the line through A and B 
into the plane of the great circle, a family of coaxial circles through 
A and B is obtained. The smaller arc between A and B on each of these 
circles clearly is the smaller, the larger the radius r of the circles; thus 
it has its smallest value for the great circle with r = R. By means of differential geometry it can
be shown that the arc AB of the great circle is not only the shortest circular arc joining A and B,
but also the shortest of all curves on the sphere connecting A and B. It is a portion of a geodesic
line.

12.1-1 Circles through two points A and B on the sphere


Great circles. All distances between points on the sphere are measured along arcs of great circles.
On spheres of sufficiently large radius these go over arbitrarily closely into the Euclidean distance
along a straight line. From the theorems of plane geometry the length of the arc AB of the great
circle between the points A and B depends on the magnitude of the radius R and on the angle
subtended by the arc at the centre, which can be given in radians or degrees and is usually denoted
by a small Latin letter, for example, by â or a°.


Two great circles intersect in two points A and B that are the ends of a diameter. Such points, in 
which a straight line through the centre of the sphere meets its surface, are called diametrically 
opposite points, or poles. The great circle whose plane is perpendicular to the straight line AB is 
called the polar circle of A (or B). If one describes the polar circle in a definite sense, one can 
distinguish between a left-hand and a right-hand pole. 


12.1-2 Angle α between two great circles, τ tangent plane

12.1-3 Spherical circle, latitude circle


Every plane perpendicular to the diameter AB (Fig.) cuts each of the planes of two great circles
through A and B in a straight line, and the angle α between these two lines is the angle between
the planes. The tangents to the two great circles at a pole are both perpendicular to the diameter AB,
and the angle between them is also equal to α. This is the angle α between the two great circles.


Spherical circles. All points of a sphere that lie at the same distance from a point P, measured
along a great circle through P, lie on a circle called a spherical circle. The constant spherical distance
is called its spherical radius and the point P is its spherical centre or pole; for example, all latitude
circles of latitude φ are spherical circles. They have spherical radius (90° − φ) and the pole is their
spherical centre. The greatest spherical circle is the polar circle p for which the pole is the spherical
centre; its spherical radius is π/2 or 90°. The other spherical circles are small circles that are
intersections with the surface of the sphere of planes parallel to the plane of the polar circle. If the
spherical radius of a circle is r° and the radius of the sphere R (Fig.), then the radius of the circle in
the intersecting plane is ϱ = R cos (90° − r°) and the circumference is 2πϱ = 2πR cos (90° − r°).
Thus, the circumference of a latitude circle is 2πϱ = 2πR cos φ.


Lunes. Two great circles always have a pair of diametrically opposite points in common and
divide the surface of the sphere into four lunes. Each of these has two equal sides of magnitude
s = 180° (or π). The magnitude of its area depends only upon the angle α between the great circles.
The Gauss-Krüger projection uses lunes having an angle of 6°. For an angle of 90° (or π/2) the area
A₀ of the lune is a quarter of the surface area of the sphere and is therefore πR². For an angle of
α° (or α̂) the area is, by proportion, equal to A = πR²α°/90° (or 2R²α̂). Thus, a Gauss-Krüger
meridian strip has the surface area


A = πR² · 6°/90° = πR²/15 = 8 501 665 km²
(if R is taken to be 6371.221 km).
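The two formulas of this section, the circumference of a latitude circle and the area of a lune, are easy to evaluate. The sketch below is an illustration only; the function names are hypothetical, and the radius is the value quoted above.

```python
import math

R_KM = 6371.221  # radius of the earth used in the text, in km

def latitude_circle_circumference(latitude_deg, radius):
    """Circumference 2*pi*R*cos(phi) of the latitude circle at latitude phi."""
    return 2.0 * math.pi * radius * math.cos(math.radians(latitude_deg))

def lune_area(angle_deg, radius):
    """Area pi*R^2*alpha/90 of a lune with angle alpha given in degrees."""
    return math.pi * radius ** 2 * angle_deg / 90.0

strip_area = lune_area(6.0, R_KM)          # one 6-degree meridian strip
print(round(strip_area), "km^2")
print(round(latitude_circle_circumference(60.0, R_KM), 1), "km")
```

A lune of 360° covers the whole sphere, so `lune_area(360.0, R)` returns 4πR², which is a quick consistency check on the proportion.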


12.2. The spherical triangle 


If three points A, B, C lying on a sphere are such that no two of them form a pair of diametrically
opposite points and they do not all three lie on one great circle, then they determine three great
circles, each of which joins two of the points, and which also intersect in pairs in the points
Ā, B̄, C̄ diametrically opposite to the given points. By these circles the surface of the sphere is
divided into eight portions, each of which is bounded by arcs of the three great circles that are
less than π (Fig.). These regions are called spherical triangles, in particular Euler triangles to
distinguish them from triangles in which sides greater than π are possible, for example,
the triangle with sides AB, BC and CĀC̄A in the figure. This non-Euler triangle differs from the
hemisphere bounded by the great circle CĀC̄A only by the Euler triangle ABC. For this reason
only Euler triangles will be considered here. The angles α, β, γ of the triangle are the angles
between the planes of the great circles that intersect in pairs at the vertices of these angles; they
are also the angles between the tangents at the vertices to the great circles that intersect at these
points. In Euler triangles no angle exceeds π.

12.2-1 Spherical triangle


Area of a spherical triangle. Each pair of the eight spherical triangles with vertices at diametrically
opposite points is symmetrical about the centre of the sphere and therefore all their data and their
areas are equal. For example, ΔABC = ΔĀB̄C̄, or ΔĀBC = ΔAB̄C̄. Each triangle having a side
in common with the triangle ABC forms with it a lune whose area can be stated. From

ΔABC + ΔBCĀ = 2R²α̂, ΔABC + ΔCAB̄ = 2R²β̂, ΔABC + ΔABC̄ = 2R²γ̂

it follows that

3ΔABC + [ΔBCĀ + ΔCAB̄ + ΔABC̄] = 2R²(α̂ + β̂ + γ̂).

By the symmetry about the centre

ΔABC + [ΔBCĀ + ΔCAB̄ + ΔABC̄] = ΔABC + ΔBCĀ + ΔCAB̄ + ΔĀB̄C = 2πR²

(a hemisphere), and thus

2ΔABC + 2πR² = 2R²(α̂ + β̂ + γ̂),

or ΔABC = R²(α̂ + β̂ + γ̂ − π) = (πR²/180°)(α° + β° + γ° − 180°).

The excess of the sum of the angles of a spherical triangle over π (or 180°) is called the spherical
excess ε. One obtains:

ε = α + β + γ − 180°, area of the spherical triangle A = R²ε̂ = πR²ε°/180°.

It follows that in every spherical triangle with non-zero surface area the angle sum is greater than two
right angles. For example, in an Euler triangle whose vertices are poles of the opposite sides the
angle sum is 3 right angles = 3π/2 = 270°.
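The area formula can be tried on the triangle just mentioned, whose three angles are right angles and which covers one eighth of the sphere. The sketch below is illustrative; the function names are hypothetical.

```python
import math

def spherical_excess_deg(alpha_deg, beta_deg, gamma_deg):
    """Spherical excess in degrees: angle sum minus 180 degrees."""
    return alpha_deg + beta_deg + gamma_deg - 180.0

def triangle_area(alpha_deg, beta_deg, gamma_deg, radius):
    """Area R^2 * excess (excess in radians) of a spherical triangle."""
    return radius ** 2 * math.radians(spherical_excess_deg(alpha_deg, beta_deg, gamma_deg))

# Triangle with three right angles: one eighth of the sphere.
octant_area = triangle_area(90.0, 90.0, 90.0, 1.0)
print(octant_area, 4.0 * math.pi / 8.0)
```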


Polar triangle. Corresponding to each spherical triangle a three-sided solid angle can be
determined by the vectors A, B, C (of length R) from the centre M of the sphere to its vertices A, B, C.
Spheres with different radii and the same centre M cut the rays determined by the vectors A, B, C in
similar spherical triangles that have the same sides and angles. It may therefore be assumed that the
vectors are of length 1. Now P_a, P_b, P_c are the feet of the perpendiculars from a point P in the
interior of the three-sided angle to its three bounding planes. These perpendiculars determine the
polar solid angle of the given angle. The magnitudes of its sides are measured by the angles c̄, ā
and b̄; ∠P_aPP_b = c̄, ∠P_bPP_c = ā, ∠P_cPP_a = b̄ (Fig.). The plane face PP_aBP_c, for example, is
perpendicular to the faces MBC and MAB of the original solid angle, and is therefore also
perpendicular to their line of intersection B. The angle P_aBP_c is the angle β between the plane
surfaces of the sides a and c. In the quadrilateral PP_aBP_c it therefore follows that b̄ + β = 180°,
since the other two angles are right angles. Similarly one finds that c̄ + γ = 180° and
ā + α = 180°. If one chooses a point within the polar solid angle, for simplicity the point M, then
the vectors A, B, C are the perpendiculars from it to the sides ā, b̄, c̄ of the polar solid angle,
and A, B, C are the feet of these perpendiculars. Thus, the original solid angle MABC is the polar
solid angle of its polar solid angle. Its sides a, b, c are perpendicular to the line segments PP_a,
PP_b, PP_c, respectively. The angles ᾱ, β̄, γ̄ of the polar solid angle are contained in the
quadrilaterals MBP_aC, MCP_bA, MAP_cB, respectively (ᾱ = ∠BP_aC, β̄ = ∠CP_bA, γ̄ = ∠AP_cB),
and thus ᾱ + a = 180°, β̄ + b = 180°, γ̄ + c = 180°, since in each case the other two angles are
both right angles.

12.2-2 Polar solid angle PP_aP_bP_c of the three-sided solid angle MABC

If the arbitrarily chosen point P approaches the point M, then PP_a, PP_b, PP_c become
perpendiculars to the sides a, b, c, respectively, each cutting the sphere in two diametrically
opposite points. The points A′, B′, C′ are the left-hand poles of the sides of the given triangle
described in the following sense: A → B, B → C, C → A. Then the spherical triangle A′B′C′ is
called the polar triangle of the given triangle; the sides and angles of the two triangles are
connected by the relationship given above (Fig.).

12.2-3 Spherical triangle, three-sided solid angle, polar triangle and polar solid angle
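The supplementary relations between a triangle and its polar triangle can be verified with unit vectors. The sketch below is an illustration, not the construction of the text: it assumes that the poles of the sides may be taken as the normalised cross products B × C, C × A, A × B, which holds for a triangle whose vertices are ordered so that A · (B × C) is positive. The sample vertices are invented.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

def side(u, v):
    """Side (radians) between two unit vectors, i.e. the angle at the centre."""
    return math.acos(max(-1.0, min(1.0, dot(u, v))))

def angle(P, Q, R):
    """Angle at vertex P of the spherical triangle PQR, by the cosine rule for sides."""
    p, q, r = side(Q, R), side(R, P), side(P, Q)
    return math.acos((math.cos(p) - math.cos(q) * math.cos(r)) / (math.sin(q) * math.sin(r)))

# Invented triangle with positive orientation, A . (B x C) > 0.
A, B, C = unit((1.0, 0.0, 0.0)), unit((1.0, 2.0, 0.0)), unit((1.0, 1.0, 2.0))

# Assumed poles of the sides a = BC, b = CA, c = AB.
A1, B1, C1 = unit(cross(B, C)), unit(cross(C, A)), unit(cross(A, B))

side_a_polar = side(B1, C1)        # side of the polar triangle opposite A'
angle_alpha_polar = angle(A1, B1, C1)
print(math.degrees(side_a_polar + angle(A, B, C)),
      math.degrees(angle_alpha_polar + side(B, C)))
```

Both printed sums come out as 180°, in line with the relations between sides and angles stated above.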


The main theorems for the solution of the general spherical triangle 


The cosine rules for the sides and for the angles. On a sphere with centre at M and radius 1, let
the vertices of the triangle ABC be the end-points of the vectors A, B, C from M for which
|A| = |B| = |C| = 1, A·B = cos c, B·C = cos a, C·A = cos b. Also, let t_AC, t_AB; t_BA, t_BC;
t_CB, t_CA be vectors of the length 1 on the tangents to the great circles at the vertices A, B, C.
Each pair determines a tangent plane in which the corresponding angle of the triangle can be
measured;

sin α = |t_AB × t_AC|, sin β = |t_BC × t_BA|, sin γ = |t_CA × t_CB|.

In the figure the side b = AC appears in its true magnitude and the tangent plane through t_AC and
t_AB is perpendicular to the plane of the diagram. If this plane is rotated about t_AC into the
plane of the diagram, then the magnitude of the angle α° between t_AC and the rotated vector
t_(AB) is equal to the true value of the angle α. In the plane through two vectors, for example
through A and C, the tangent t_AC at one point cuts the extension of the other vector C at H₁. In
the figure the triangle AMH₁ lies in the plane of the diagram. Using the auxiliary point H₁ and the
intercept theorems one finds that |MA₁|/|MC| = |MA|/|MH₁|, where |MH₁| = 1/cos b and
|AH₁| = tan b. By vector addition one obtains MA + AH₁ = MH₁ or A + t_AC tan b = C/cos b,
t_AC tan b = C/cos b − A. Similarly in a plane through A and B one obtains
t_AB tan c = B/cos c − A. By equating the scalar product of the vectors on the left-hand side of
the two equations with that of the vectors on the right-hand side one derives

(t_AC · t_AB) tan b tan c = (C·B)/(cos b cos c) + A·A − (C·A)/cos b − (B·A)/cos c,
cos α tan b tan c = cos a/(cos b cos c) − (cos b cos c)/(cos b cos c),
cos α sin b sin c = cos a − cos b cos c.

By cyclic permutation one obtains the cosine rule for sides, when the sides and angles are
less than π (or 180°):

cos a = cos b cos c + sin b sin c cos α.

12.2-4 Vector representation of the spherical triangle

Applying this result to the polar triangle A′B′C′ one obtains
cos ā = cos b̄ cos c̄ + sin b̄ sin c̄ cos ᾱ, for example. From the relationships between triangle
and polar triangle, that is, from ā = 180° − α, b̄ = 180° − β, c̄ = 180° − γ, ᾱ = 180° − a, it
follows that −cos α = (−cos β)(−cos γ) + sin β sin γ (−cos a). Hence the cosine rule for angles
is obtained by cyclic permutation:

cos α = −cos β cos γ + sin β sin γ cos a.
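The vector derivation of the cosine rule for sides can be followed numerically. The sketch below is an illustration with invented vertices: it builds the unit tangent vectors from t_AB sin c = B − A cos c and t_AC sin b = C − A cos b and compares their scalar product with the right-hand side of the rule.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

# Invented vertices on the unit sphere.
A, B, C = unit((1.0, 0.0, 0.0)), unit((1.0, 2.0, 0.0)), unit((1.0, 1.0, 2.0))

a = math.acos(dot(B, C))
b = math.acos(dot(C, A))
c = math.acos(dot(A, B))

# Unit tangent vectors at A towards B and towards C.
t_AB = tuple((Bi - Ai * math.cos(c)) / math.sin(c) for Ai, Bi in zip(A, B))
t_AC = tuple((Ci - Ai * math.cos(b)) / math.sin(b) for Ai, Ci in zip(A, C))

cos_alpha_from_tangents = dot(t_AB, t_AC)
cos_alpha_from_rule = (math.cos(a) - math.cos(b) * math.cos(c)) / (math.sin(b) * math.sin(c))
print(cos_alpha_from_tangents, cos_alpha_from_rule)
```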


The sine rule. The relationships t_AB tan c = B/cos c − A and t_AC tan b = C/cos b − A were used
to derive the cosine rule for sides. Multiplication by cos c and cos b, respectively, leads to
t_AB sin c = B − A cos c and t_AC sin b = C − A cos b. Substituting these values in the vector
product t_AB × t_AC = sin α · A one obtains

sin b sin c · A sin α = B × C − cos b (B × A) − cos c (A × C) + cos b cos c (A × A),

where A × A = 0 and the vectors B × A and A × C are perpendicular to A. Since A·(B × A) = 0
and A·(A × C) = 0, scalar multiplication by A and subsequent cyclic permutation gives the
three relationships

sin b sin c sin α = A·(B × C), sin c sin a sin β = B·(C × A), sin a sin b sin γ = C·(A × B).

Because the triple scalar product is unaltered by cyclic permutation of the vectors, the three
right-hand sides have the same value. Equating the left-hand sides,

sin b sin c sin α = sin c sin a sin β = sin a sin b sin γ

leads to the sine rule: sin α / sin a = sin β / sin b = sin γ / sin c.
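The three products and the triple scalar product can be compared directly. The sketch below is illustrative; the vertices are invented and chosen so that A · (B × C) is positive, and the angles are obtained from the cosine rule for sides.

```python
import math

def dot(u, v):
    return sum(p * q for p, q in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def unit(v):
    n = math.sqrt(dot(v, v))
    return tuple(c / n for c in v)

def opposite_angle(x, y, z):
    """Angle opposite side x from the cosine rule for sides (all in radians)."""
    return math.acos((math.cos(x) - math.cos(y) * math.cos(z)) / (math.sin(y) * math.sin(z)))

A, B, C = unit((1.0, 0.0, 0.0)), unit((1.0, 2.0, 0.0)), unit((1.0, 1.0, 2.0))
a, b, c = math.acos(dot(B, C)), math.acos(dot(C, A)), math.acos(dot(A, B))
alpha, beta, gamma = opposite_angle(a, b, c), opposite_angle(b, c, a), opposite_angle(c, a, b)

triple = dot(A, cross(B, C))
products = [math.sin(b) * math.sin(c) * math.sin(alpha),
            math.sin(c) * math.sin(a) * math.sin(beta),
            math.sin(a) * math.sin(b) * math.sin(gamma)]
ratios = [math.sin(alpha) / math.sin(a),
          math.sin(beta) / math.sin(b),
          math.sin(gamma) / math.sin(c)]
print(triple, products, ratios)
```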


Half-angle and half-side formulae. The formulae corresponding to the half-angle formula of plane
trigonometry can be used in the same way, to calculate the angles from three given sides and,
conversely, the sides from three given angles. From the cosine rule for sides and with the help of
trigonometric relationships, with s = (a + b + c)/2:

cos² (α/2) = ½ (1 + cos α) = ½ (sin b sin c + cos a − cos b cos c) / (sin b sin c)
           = [cos a − cos (b + c)] / (2 sin b sin c)
           = sin [(a + b + c)/2] sin [(b + c − a)/2] / (sin b sin c)
           = sin s sin (s − a) / (sin b sin c),

sin² (α/2) = ½ (1 − cos α) = ½ (sin b sin c + cos b cos c − cos a) / (sin b sin c)
           = [cos (b − c) − cos a] / (2 sin b sin c)
           = sin [(a + c − b)/2] sin [(a + b − c)/2] / (sin b sin c)
           = sin (s − b) sin (s − c) / (sin b sin c).

These make use of the following facts:
cos α = (cos a − cos b cos c) / (sin b sin c),
cos φ − cos ψ = −2 sin [(φ + ψ)/2] sin [(φ − ψ)/2].

The half-angle formula is obtained by division, using the fact that tan (α/2) = sin (α/2)/cos (α/2):

tan (α/2) = √[sin (s − b) sin (s − c) / (sin s sin (s − a))].

The half-side formulae are polar to the half-angle formulae. In the polar triangle A′B′C′ the
half-angle formula gives tan² (β̄/2) = sin (s̄ − c̄) sin (s̄ − ā) / [sin s̄ sin (s̄ − b̄)]. By
substitution, using the relationships σ = ½(α + β + γ); β̄ = 180° − b, ā = 180° − α,
b̄ = 180° − β, c̄ = 180° − γ, s̄ = ½(ā + b̄ + c̄) = 270° − σ, s̄ − ā = 90° − (σ − α),
s̄ − b̄ = 90° − (σ − β), s̄ − c̄ = 90° − (σ − γ), one obtains

cot² (b/2) = cos (σ − γ) cos (σ − α) / [−cos σ cos (σ − β)].
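The half-angle formula can be compared with the cosine rule for sides on concrete numbers. The sketch below is illustrative; it uses the rounded sides of the worked example of basic problem 2a in this chapter (a = 52.5°, b = 107.8°, c = 141.01°), and the function names are hypothetical.

```python
import math

def angles_by_half_angle(a_deg, b_deg, c_deg):
    """Angles (degrees) from three sides via tan(alpha/2) and cyclic permutation."""
    a, b, c = (math.radians(v) for v in (a_deg, b_deg, c_deg))
    s = (a + b + c) / 2.0

    def opposite(x, y, z):
        t = math.sqrt(math.sin(s - y) * math.sin(s - z) / (math.sin(s) * math.sin(s - x)))
        return 2.0 * math.atan(t)

    return tuple(math.degrees(v) for v in (opposite(a, b, c), opposite(b, c, a), opposite(c, a, b)))

def angles_by_cosine_rule(a_deg, b_deg, c_deg):
    """Angles (degrees) from three sides via the cosine rule for sides."""
    a, b, c = (math.radians(v) for v in (a_deg, b_deg, c_deg))

    def opposite(x, y, z):
        return math.acos((math.cos(x) - math.cos(y) * math.cos(z)) / (math.sin(y) * math.sin(z)))

    return tuple(math.degrees(v) for v in (opposite(a, b, c), opposite(b, c, a), opposite(c, a, b)))

half = angles_by_half_angle(52.5, 107.8, 141.01)
rule = angles_by_cosine_rule(52.5, 107.8, 141.01)
print([round(v, 2) for v in half])
```

The half-angle form is unambiguous for Euler triangles, since α/2 lies between 0° and 90° and the tangent is positive there.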




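Napier’s analogies. For the complete solution by logarithms of a spherical triangle, given two
sides and the included angle or two angles and the side between them, the so-called Napier’s analogies
are available. They can be derived from the half-angle or half-side formulae, using trigonometric
relations, in particular those concerning sums and differences of trigonometric functions. It
suffices here to give one of each of the sets of three formulae that follow from one another by cyclic
permutation:

1c) tan [(a + b)/2] = tan (c/2) cos [(α − β)/2] / cos [(α + β)/2],
2c) tan [(a − b)/2] = tan (c/2) sin [(α − β)/2] / sin [(α + β)/2],
3c) tan [(α + β)/2] = cot (γ/2) cos [(a − b)/2] / cos [(a + b)/2],
4c) tan [(α − β)/2] = cot (γ/2) sin [(a − b)/2] / sin [(a + b)/2].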


For frequent use of Napier’s analogies the following mnemonic rules are suggested: all arguments 
are halved; if the tangents or cotangents have sides as arguments, then the sines and cosines have 
angles (and conversely); the function of the half-side is related to the half-sum or half-difference 
of the other two sides, and similarly for angles. It is easy to formulate other precise mnemonic 
rules. 


The basic problems for the general spherical triangle 


Unlike the plane triangle, the spherical triangle is also determined by three angles, so that there 
are six basic problems. For their solution one uses general relationships in the Euler triangle. 


Limit passage to plane trigonometry. Three points A, B, C in space that do not lie in a straight 
line, determine a plane and in it one plane triangle. They can, however, form the vertices of a spherical 
triangle on infinitely many different spheres. If these are ordered according to increasing radius R, 
then as R → ∞, the spherical triangle tends continuously to the plane triangle, each spherical
angle tending to the corresponding plane angle, and the spherical excess becomes arbitrarily small.
To the lengths ā, b̄, c̄ of the sides in the plane triangle there correspond the sides ā/R, b̄/R,
c̄/R, measured in radians, of the spherical triangle. In the formula

tan (ε̂/4) = √{tan (ŝ/2) tan [(ŝ − â)/2] tan [(ŝ − b̂)/2] tan [(ŝ − ĉ)/2]},

which was obtained by L’Huilier, the tangent of each angle may be replaced by its radian measure
because the angle is small. This gives

ε̂/4 = (1/4) √[(s̄/R) · ((s̄ − ā)/R) · ((s̄ − b̄)/R) · ((s̄ − c̄)/R)].

The area A of the spherical triangle then becomes the area given by Heron’s formula for a plane
triangle:

A = ε̂R² = (ε̂/4) · 4R² = √[s̄ (s̄ − ā)(s̄ − b̄)(s̄ − c̄)].
For large, but still finite, values of R a theorem due to LEGENDRE holds: 


Theorem of Legendre: A spherical triangle with small sides and therefore small spherical excess 
has approximately the same area as a plane triangle with sides of lengths of the same absolute value. 
Each angle of the plane triangle is smaller than the corresponding angle of the spherical triangle by 
approximately one third of the spherical excess. 


Using L’Huilier’s formula the spherical excess ε can be found for a triangle on the earth (of
radius R) with sides a = 31.075 miles, b = 37.290 miles and c = 43.505 miles (for example,
between Cambridge, Luton and Corby). The magnitudes of the sides are given in radian measure by
â = a/R, b̂ = b/R, ĉ = c/R, or in degrees by a° = 360°a/(2πR), b° = 360°b/(2πR), c° = 360°c/(2πR),
or alternatively in seconds by multiplying â by 206264.8″, since 1 radian corresponds to 206264.8″.

Side    Miles     Radians      Seconds
a       31.075    0.0078479    1618.75″
b       37.290    0.0094173    1942.46″
c       43.505    0.0109869    2266.21″

s/2 = 24′16.85″, (s − a)/2 = 10′47.47″, (s − b)/2 = 8′5.62″, (s − c)/2 = 5′23.75″

These results show that the spherical excess is ε ≈ 7.5″. By Legendre’s theorem the triangle may
be regarded as plane as long as the accuracy of the measured angles is not less than ε/3 ≈ 2.5″.
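The excess of this example can be recomputed from the tabulated radian values of the sides. The sketch below is an illustration; the function name is hypothetical, and the comparison with Heron's formula is carried out on the unit sphere, so both quantities are pure numbers.

```python
import math

ARCSEC_PER_RADIAN = 206264.8

def spherical_excess(a, b, c):
    """Spherical excess (radians) from three sides in radians, by L'Huilier's formula."""
    s = (a + b + c) / 2.0
    t = math.sqrt(math.tan(s / 2.0) * math.tan((s - a) / 2.0)
                  * math.tan((s - b) / 2.0) * math.tan((s - c) / 2.0))
    return 4.0 * math.atan(t)

# Sides of the example in radian measure, as tabulated.
a, b, c = 0.0078479, 0.0094173, 0.0109869
excess = spherical_excess(a, b, c)
excess_arcsec = excess * ARCSEC_PER_RADIAN

# Heron's formula for the plane triangle with the same side lengths (unit radius).
s = (a + b + c) / 2.0
heron_area = math.sqrt(s * (s - a) * (s - b) * (s - c))

print(round(excess_arcsec, 2), round(excess_arcsec / 3.0, 2))
```

For a triangle this small the excess and the plane area agree to several significant figures, which is the numerical content of Legendre's theorem quoted above.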

To derive the limiting form as R → ∞ of the sine and cosine rules the trigonometric functions
are expanded in convergent series. Writing ā/R = q_a, b̄/R = q_b, c̄/R = q_c, one obtains
sin q_a = q_a − q_a³/3! + ··· = q_a[1 − q_a²/6 + δ₁] and cos q_a = 1 − q_a²/2! + δ₂, where δ₁ and
δ₂ are of the order of 1/R⁴. For the sine rule of spherical trigonometry this gives:

sin α : sin β : sin γ = ā[1 − q_a²/6 + δ₁] : b̄[1 − q_b²/6 + δ₃] : c̄[1 − q_c²/6 + δ₅],

so that in the limiting case sin α : sin β : sin γ = ā : b̄ : c̄. This is the sine rule of plane
trigonometry. Similarly, for the spherical cosine rule one obtains the cosine rule of plane
trigonometry: cos q_a = cos q_b cos q_c + sin q_b sin q_c cos α or

[1 − q_a²/2 + δ₂] = [1 − q_b²/2 + δ₄][1 − q_c²/2 + δ₆]
                    + q_b q_c cos α [1 − q_b²/6 + δ₃][1 − q_c²/6 + δ₅],
−½ q_a² + δ₂ = −½ (q_b² + q_c²) + δ₄,₆ + q_b q_c cos α [1 − (q_b² + q_c²)/6 + δ₃,₅],
ā² = b̄² + c̄² − 2 b̄ c̄ cos α.
General relationships in Euler spherical triangles. Since no angle and no side can be greater than π
(or 180°), the arguments are given uniquely by the tangent, cotangent and cosine functions; on the
other hand, two values are given by the sine function. If two arguments are possible, the geometrically
correct solutions are selected from the theoretically possible ones by means of inequalities.


1. In an Euler spherical triangle the sum of the angles lies between π and 3π and the sum of the
sides between 0 and 2π:
π < α̂ + β̂ + γ̂ < 3π and 0 < â + b̂ + ĉ < 2π
or 180° < α + β + γ < 540° and 0 < a + b + c < 360°.

2. The greater angle lies opposite the greater side.


If a > b, for example, in the Napier analogy 4c) cot (γ/2) sin [(a − b)/2] = tan [(α − β)/2] sin [(a + b)/2],
that is, sin [(a − b)/2] > 0, then because sin [(a + b)/2] > 0 and cot (γ/2) > 0 in Euler triangles,
it follows that tan [(α − β)/2] > 0 also. But this means that (α − β) > 0 or α > β.


3. The sum of two sides is greater than the third. The difference between two sides is smaller than 
the third. 


Corresponding to each spherical triangle there exists a solid angle. This degenerates into a plane
circular sector when the sum of two sides is equal to the third side, and is impossible in space if the
sum is smaller than the third side. If the difference between the two sides a and b is greater than or
equal to the third side c, a − b ≥ c, then it would follow that a ≥ b + c, in contradiction to the
first part of the theorem.


4. The sum of two angles is less than the third increased by π (or 180°).


As has just been shown, in the polar triangle A′B′C′:
ā + b̄ > c̄ and ā − b̄ < c̄.

Because ā = 180° − α, b̄ = 180° − β, c̄ = 180° − γ this means that for the triangle ABC:
180° − α + 180° − β > 180° − γ, and 180° − α − 180° + β < 180° − γ,
180° + γ > α + β, and β + γ < 180° + α.


5. If the sum of two sides is greater (or less) than two right angles, then the sum of the two opposite
angles is greater (or less) than two right angles.


In Napier’s analogy 3c) cot (γ/2) cos [(a − b)/2] = tan [(α + β)/2] cos [(a + b)/2], let a + b > π,
so that cos [(a + b)/2] < 0. Because cot (γ/2) and cos [(a − b)/2] must be positive in the Euler
triangle, it follows that tan [(α + β)/2] < 0. This means that (α + β)/2 > π/2 or α + β > π.
Similarly it follows from a + b < π that α + β < π, and from a + b = π that α + β = π.


The basic problem 1a. In the spherical triangle ABC the three sides a, b, c are given and it is
required to find the three angles α, β, γ. The sum of each pair of sides must be greater than the
third and the sum of all three sides less than 360°. The solution is found by means of the cosine
rule for sides (Fig.) cos α = (cos a − cos b cos c)/(sin b sin c) or by the half-angle formulae

s = (a + b + c)/2, tan (α/2) = √[sin (s − b) sin (s − c) / (sin s sin (s − a))].

The formulae for cos β, cos γ or for tan (β/2) and tan (γ/2) are obtained by cyclic permutation.


12.2-5 Solution of a spherical triangle, given three sides, or three angles


The basic problem 1b. In the spherical triangle ABC the three angles α, β, γ are given, such that
the sum of two angles is less than the third increased by 180°, and the sum of all three angles lies
between 180° and 540°. The sides can then be calculated either by means of the cosine rule for angles:

cos a = (cos α + cos β cos γ)/(sin β sin γ), ...,

or by the half-side formulae:

2σ = α + β + γ, tan (a/2) = √[−cos σ cos (σ − α) / (cos (σ − β) cos (σ − γ))].

The basic problem 2a. If in the spherical triangle ABC two sides and the included angle, say b,
c and α, are given, the third side is found from the cosine rule for sides:

cos a = cos b cos c + sin b sin c cos α.

With this side the remaining angles β and γ can then be found from the sine rule:

sin β = sin b sin α / sin a and sin γ = sin c sin α / sin a.

From each sine function one obtains two corresponding arguments, which are supplementary.
However, using the theorem that the greater angle lies opposite the greater side, the angle β that
corresponds to the given values of the problem can be uniquely determined as the angle that is
greater or smaller than the angle α, according as the side b is greater or smaller than the side a.
The angle γ is similarly chosen so that γ ≷ α according as c ≷ a, where either the two upper or the
two lower inequalities hold simultaneously. For logarithmic calculation Napier’s analogies are
available. By 3a) and 4a) one obtains:

tan [(β + γ)/2] = cot (α/2) cos [(b − c)/2] / cos [(b + c)/2]
and tan [(β − γ)/2] = cot (α/2) sin [(b − c)/2] / sin [(b + c)/2].

From these the angles (β + γ)/2 and (β − γ)/2 and hence β and γ can be calculated. The remaining
side a is given by the sine rule sin a = sin α sin b / sin β; from the two values a of the Arcsin
function, that one is chosen that is greater or less than c, according as α is greater or less than γ.


Example: It is given that a = 52.5°; b = 107.8°; γ = 141.5° (Fig.).
By Napier’s analogies 3c) and 4c):

tan [(α + β)/2] = cot (γ/2) cos [(a − b)/2] / cos [(a + b)/2],
tan [(α − β)/2] = cot (γ/2) sin [(a − b)/2] / sin [(a + b)/2],

and by the sine rule sin c = sin γ sin a / sin α.

12.2-6 Solution of a spherical triangle with a = 52.5°, b = 107.8° and γ = 141.5°

                            lg sin       lg cot      lg cos
(a − b)/2 = −27.65°         9.6666 n                 9.9473
(a + b)/2 =  80.15°         9.9936                   9.2331
difference                  9.6730 n                 0.7142
γ/2       =  70.75°                      9.5431
                            lg tan                   lg tan
                            9.2161 n                 0.2573
                            (α − β)/2 = −9.34°       (α + β)/2 = 61.06°

lg sin γ = 9.7941
lg sin a = 9.8995
sum      = 9.6936
lg sin α = 9.8948
lg sin c = 9.7988
c₁ = 38.99°, c₂ = 141.01°; since α < γ, a < c. Solution: c₂ = 141.01°.

α = 51.72°
β = 70.40°
γ = 141.5°
α + β + γ = 263.62°
ε = 83.62°



Basic problem 2b. Two angles of the spherical triangle and the side between them are now given,
for example, β, γ and a. The problem and hence also the solution is polar to the basic problem 2a.
It is therefore sufficient to compile the formulae.


I. The cosine rule for angles gives the angle α: cos α = −cos β cos γ + sin β sin γ cos a. The sine
rule gives b and c: sin b = sin β sin a / sin α, sin c = sin γ sin a / sin α, where b ≷ a according as
β ≷ α, and c ≷ a according as γ ≷ α.

II. Napier’s analogies 1a) and 2a) give the sides b and c:

tan [(b + c)/2] = tan (a/2) cos [(β − γ)/2] / cos [(β + γ)/2]
and tan [(b − c)/2] = tan (a/2) sin [(β − γ)/2] / sin [(β + γ)/2].

The angle α is given by the sine rule sin α = sin a sin β / sin b, where α ≷ β according as a ≷ b.


Basic problem 3a. If two sides and one of the opposite angles of the spherical triangle ABC are
given, for example, a, c and γ, then there is either no solution, or one or two solutions (Fig.). The
three cases are illustrated in the figure by the three spherical circles k₁, k₂ and k₃ about the
point B with different spherical radii c. The circle k₁ does not intersect the great circle through C
on which the vertex A must lie; k₂ touches it at the point A₃, and k₃, on the other hand, cuts it
at A₁ and A₂. With the length of side BA₃ = c there is one solution, the right-angled triangle
A₃BC. With side BA₁ = BA₂ = c, however, one obtains the two triangles A₁BC and A₂BC. Because
the triangle A₁BA₂ is isosceles, it follows that α₂ = ∠BA₁A₂, or α₂ = 180° − α₁. By calculation
one obtains these two angles α₁ and α₂ from the sine rule sin α = sin a sin γ / sin c, since
sin α₁ = sin α₂. For each value of the angle α the side b and the angle β are then given uniquely
by Napier’s analogies 2b) and 4b):

tan (b/2) = tan [(c − a)/2] sin [(γ + α)/2] / sin [(γ − α)/2],
cot (β/2) = tan [(γ − α)/2] sin [(c + a)/2] / sin [(c − a)/2].

12.2-7 Solution of a spherical triangle, given two sides a and c, and one opposite angle

The analytical discussion of the possible cases is based on the relationship
sin α = sin a sin γ / sin c and the procedure is analogous to that followed in plane trigonometry
(Fig.).

I. (sin a sin γ / sin c) > 1, so that sin α > 1; no real solution.

II. (sin a sin γ / sin c) = 1, sin α = 1, α = π/2; one solution, for example, the triangle A₃BC.

III. sin α = (sin a sin γ / sin c) < 1.

III (1). sin a < sin c → sin α < sin γ; one solution, since for each given triple of values
(aᵢ, cᵢ, γᵢ), i = 1, 2, the value of α is uniquely determined, because α ≷ γ according as a ≷ c
(Fig.).

III (2). sin a = sin c → sin α = sin γ; one solution [see III (1)].

III (3). sin a > sin c → sin α > sin γ; two solutions. Either a > c → α > γ, that is, c = c₁
(acute) → γ = γ₁ (acute) and α₁, α₂ = 180° − α₁ are solutions; or a < c → α < γ, that is,
c = c₂ (obtuse) → γ = γ₂ (obtuse) and α₁, α₂ = 180° − α₁ are solutions (see Fig.).

12.2-8 For the discussion of the solutions of basic problem 3a


Basic problem 3b. The polar problem, to solve a spherical triangle ABC from two angles and a side
opposite one of them, for example, from α, γ and c, leads to the corresponding different cases. It
is therefore sufficient to give the method of calculation without further discussion:

1. sin a = sin α sin c / sin γ;
2. tan (b/2) = tan [(c − a)/2] sin [(γ + α)/2] / sin [(γ − α)/2];
3. cot (β/2) = tan [(γ − α)/2] sin [(c + a)/2] / sin [(c − a)/2].




Example: Given that c = 96.5°, α = 101.2°, γ = 102.1°; to find a, b, β (Fig.).

1. sin a = sin α sin c/sin γ

    lg sin α = 9.9916
  + lg sin c = 9.9972
               9.9888
  − lg sin γ = 9.9902
    lg sin a = 9.9986
    a_1 = 85.40°, a_2 = 94.60°

Since sin α > sin γ, case III (3) applies with α < γ and there are two solutions.

1a) Auxiliary quantities (for a = a_1):

    (c + a)/2 =  90.95°     c/2 = 48.25°     a/2 = 42.70°
    (c − a)/2 =   5.55°
    (γ + α)/2 = 101.65°     γ/2 = 51.05°     α/2 = 50.60°
    (γ − α)/2 =   0.45°

2. tan (b/2) = tan [(c − a)/2] sin [(γ + α)/2]/sin [(γ − α)/2]

    lg tan [(c − a)/2] = 8.9876
  + lg sin [(γ + α)/2] = 9.9910
  − lg sin [(γ − α)/2] = …
    lg tan (b/2)       = …
    b/2                = …

3. cot (β/2) = tan [(γ − α)/2] sin [(c + a)/2]/sin [(c − a)/2]

    lg tan [(γ − α)/2] = 7.8951
  + lg sin [(c + a)/2] = 9.9999
  − lg sin [(c − a)/2] = 8.9855
    lg cot (β/2)       = 8.9095
    β/2                = 85.34°
    β                  = 170.68°
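The three steps of basic problem 3b can be sketched in code. The following is a minimal illustration, not part of the original text: the function name and the filtering of solutions are assumptions, and it presumes γ ≠ α so that Napier's analogies do not degenerate. Because it works in floating point rather than with four-figure logarithms, its results differ slightly from the tabulated values.

```python
import math

def solve_two_angles_opposite_side(alpha_deg, gamma_deg, c_deg):
    """Solve a spherical triangle from alpha, gamma and the side c opposite gamma.

    Returns a list of (a, b, beta) tuples in degrees, one per solution.
    Hypothetical helper; assumes gamma != alpha.
    """
    alpha, gamma, c = (math.radians(x) for x in (alpha_deg, gamma_deg, c_deg))
    sin_a = math.sin(alpha) * math.sin(c) / math.sin(gamma)   # step 1: sine rule
    if sin_a > 1.0:
        return []                                              # no real solution
    a_first = math.asin(sin_a)
    # The sine is two-valued: a and 180 deg - a (the set removes a duplicate at 90 deg).
    candidates = {round(a_first, 12), round(math.pi - a_first, 12)}
    solutions = []
    for a in sorted(candidates):
        if math.isclose(a, c):
            continue                                           # analogies degenerate
        tan_half_b = (math.tan((c - a) / 2) * math.sin((gamma + alpha) / 2)
                      / math.sin((gamma - alpha) / 2))         # step 2
        cot_half_beta = (math.tan((gamma - alpha) / 2) * math.sin((c + a) / 2)
                         / math.sin((c - a) / 2))              # step 3
        if tan_half_b <= 0 or cot_half_beta <= 0:
            continue                 # b or beta would fall outside (0 deg, 180 deg)
        b = 2 * math.atan(tan_half_b)
        beta = 2 * math.atan(1 / cot_half_beta)
        solutions.append(tuple(math.degrees(x) for x in (a, b, beta)))
    return solutions

if __name__ == "__main__":
    for a, b, beta in solve_two_angles_opposite_side(101.2, 102.1, 96.5):
        print(f"a = {a:.2f} deg, b = {b:.2f} deg, beta = {beta:.2f} deg")
```

Because (γ − α)/2 is only 0.45° in this example, small rounding differences in a are strongly magnified in b and β.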


The right-angled spherical triangle 


By analogy to the procedure in plane trigonometry, the calculations in spherical trigonometry 
can also be simplified by the use of right-angled triangles. Polar to these are right-sided triangles, 
in which one vertex lies on the polar of a second vertex. However the right-sided spherical triangle 
is seldom used and need not be specially considered. 


Napier’s rules. In a spherical triangle ABC suppose that the angle γ is 90°; then c is the hypotenuse (Fig.). Since sin 90° = 1 and cos 90° = 0, the theorems for the general triangle simplify to the following forms:


Sine rule: sin a = sin α sin c/sin 90°,

1. sin a = sin α sin c,        (1) cos (90° − a) = sin α sin c,
2. sin b = sin β sin c,        (2) cos (90° − b) = sin β sin c.

Cosine rule for sides: cos c = cos a cos b + sin a sin b cos 90°,

3. cos c = cos a cos b,        (3) cos c = sin (90° − a) sin (90° − b).

Cosine rule for angles: cos α = −cos β cos 90° + sin β sin 90° cos a,

4. cos α = sin β cos a,        (4) cos α = sin (90° − a) sin β,

   cos β = −cos 90° cos α + sin 90° sin α cos b,
5. cos β = sin α cos b,        (5) cos β = sin (90° − b) sin α.

From these five relationships further ones can be found:

6. from 4.: cos a = cos α/sin β and 5.: cos b = cos β/sin α
   it follows from 3. that cos c = cot α cot β,        (6) cos c = cot α cot β,
7. from 1.: sin α = sin a/sin c and 3.: cos b = cos c/cos a
   it follows from 5. that cos β = tan a cot c,        (7) cos β = cot (90° − a) cot c,
8. from 2.: sin β = sin b/sin c and 3.: cos a = cos c/cos b
   it follows from 4. that cos α = tan b cot c,        (8) cos α = cot (90° − b) cot c,
9. from 5.: sin α = cos β/cos b and 2.: sin c = sin b/sin β
   it follows from 1. that sin a = tan b cot β,        (9) cos (90° − a) = cot (90° − b) cot β,
10. from 4.: sin β = cos α/cos a and 1.: sin c = sin a/sin α
   it follows from 2. that sin b = tan a cot α,        (10) cos (90° − b) = cot (90° − a) cot α.
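The ten relationships can be checked numerically. The sketch below is illustrative only: it builds a right-angled triangle from two legs chosen freely (the function names and the sample legs are assumptions, not from the text) and evaluates the difference between the two sides of each relationship.

```python
import math

def right_triangle_from_legs(a_deg, b_deg):
    """Return (a, b, c, alpha, beta) in radians for legs a, b and gamma = 90 deg."""
    a, b = math.radians(a_deg), math.radians(b_deg)
    c = math.acos(math.cos(a) * math.cos(b))                     # relationship 3
    alpha = math.atan2(math.sin(a), math.cos(a) * math.sin(b))   # tan alpha = tan a / sin b
    beta = math.atan2(math.sin(b), math.cos(b) * math.sin(a))    # tan beta = tan b / sin a
    return a, b, c, alpha, beta

def napier_residuals(a, b, c, alpha, beta):
    """Left side minus right side of relationships 1 to 10, in that order."""
    sin, cos, tan = math.sin, math.cos, math.tan

    def cot(x):
        return cos(x) / sin(x)

    return [
        sin(a) - sin(alpha) * sin(c),       # 1
        sin(b) - sin(beta) * sin(c),        # 2
        cos(c) - cos(a) * cos(b),           # 3
        cos(alpha) - sin(beta) * cos(a),    # 4
        cos(beta) - sin(alpha) * cos(b),    # 5
        cos(c) - cot(alpha) * cot(beta),    # 6
        cos(beta) - tan(a) * cot(c),        # 7
        cos(alpha) - tan(b) * cot(c),       # 8
        sin(a) - tan(b) * cot(beta),        # 9
        sin(b) - tan(a) * cot(alpha),       # 10
    ]

if __name__ == "__main__":
    triangle = right_triangle_from_legs(40.0, 70.0)
    print(max(abs(r) for r in napier_residuals(*triangle)))
```

All residuals should vanish up to rounding error, for acute as well as obtuse legs, provided neither leg equals 90° (where the tangent is undefined).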


Napier collected together the relationships (1) to (10) in the rules that bear his name. To formulate the rules, one visualises a triangle in which γ is a right angle. Leaving the right angle out of consideration, the remaining two angles, the hypotenuse and the complements of the sides containing the right angle are called the circular parts of the triangle. These five parts, β, c, α, (90° − b), (90° − a), are arranged around a circle in the order in which they naturally occur in the triangle (Fig.). If any one part is selected, the two on either side of it are called the adjacent parts and the remaining two the opposite parts. Napier’s rules then have the following form:

Napier’s rules. In a right-angled spherical triangle the cosine of any part is equal to the product of the cotangents of the adjacent parts, and is also equal to the product of the sines of the opposite parts.

12.2-10 Right-angled spherical triangle ABC
12.2-11 Position of the parts for Napier’s rules

When Napier’s rules are applied to Euler triangles, the argument of every trigonometric function except the sine is determined uniquely, while the argument of the sine is two-valued. The general relationships in the Euler triangle then decide whether one or two solutions exist.
Example: If it is given that a = 38.4°, α = 42.9°, then one finds two solutions. From sin b = cot α tan a, two values b_1 and b_2 = 180° − b_1 are obtained, and from cos α = cos a sin β, two values for β, which can also be calculated from cos β = sin α cos b. Finally, the hypotenuse c can be determined from cos α = cot c tan b.

    lg cot α = 0.0319        lg cos α = 9.8648        lg cos α     = 9.8648
  + lg tan a = 9.8990      − lg cos a = 9.8941      + lg cot b_1,2 = 9.7870
    lg sin b = 9.9309        lg sin β = 9.9707        lg cot c     = 9.6518
    b_1 =  58.52°            β_1 =  69.2°             c_1 =  65.85°
    b_2 = 121.48°            β_2 = 110.8°             c_2 = 114.15°

12.2-12 Solution of a right-angled spherical triangle, given a side a containing the right angle and the opposite angle α
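The example can be retraced in a few lines. This is a sketch under the assumption that a and α are both acute, as in the example; the function name is hypothetical. For each of the two values of b it evaluates β from cos β = sin α cos b and c from tan c = tan b/cos α, which is the relation cos α = cot c tan b solved for c.

```python
import math

def solve_right_triangle_leg_and_opposite_angle(a_deg, alpha_deg):
    """Right-angled spherical triangle (gamma = 90 deg) from leg a and opposite angle alpha.

    Returns a list of (b, beta, c) tuples in degrees. Hypothetical helper that
    assumes a and alpha are acute.
    """
    a, alpha = math.radians(a_deg), math.radians(alpha_deg)
    sin_b = math.tan(a) / math.tan(alpha)          # sin b = cot(alpha) tan(a)
    if not 0.0 < sin_b <= 1.0:
        return []                                  # no triangle exists
    b_first = math.asin(sin_b)
    solutions = []
    for b in (b_first, math.pi - b_first):         # the sine is two-valued
        beta = math.acos(math.sin(alpha) * math.cos(b))               # cos beta = sin alpha cos b
        c = math.atan2(math.sin(b), math.cos(b) * math.cos(alpha))    # tan c = tan b / cos alpha
        solutions.append(tuple(math.degrees(x) for x in (b, beta, c)))
    return solutions

if __name__ == "__main__":
    for b, beta, c in solve_right_triangle_leg_and_opposite_angle(38.4, 42.9):
        print(f"b = {b:.2f} deg, beta = {beta:.2f} deg, c = {c:.2f} deg")
```

Using `atan2` for c places the hypotenuse in the correct quadrant automatically, so the second solution comes out obtuse without a separate case distinction.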


Altitudes. With the help of Napier’s rules altitudes of spherical triangles can be calculated. They are measured on a great circle through a vertex perpendicular to the opposite side. The altitude gives the spherical distance of the vertex from the side. An arbitrary spherical triangle is divided by an altitude into two right-angled triangles and can be solved using Napier’s rules. In this way Napier’s analogies can, in general, be avoided. Moreover, in applications the altitude often has a direct meaning. Suppose, for instance, that in the figure of the following example the great circle through A, C and F represents the earth’s equator and B is the position of a ship. Then h is the geographical latitude of this position. Or suppose that B represents the North Pole of the earth and an aircraft or ship is moving along the great circle through A and C; then h is its shortest distance from the pole and the side CF the path to the position at which this distance is reached.


Example: In the triangle with sides c = 84°, a = 42.7° and angle γ = 135° (Fig.) the altitude h = BF to the side b lies outside the triangle because the angle γ is obtuse. By the sine rule sin α = sin γ sin a/sin c, two values α_1 and α_2 are obtained for α. Since the smaller angle lies opposite the smaller side, only α_1 can be a solution.

    lg sin γ = 9.8495
  + lg sin a = 9.8313
               9.6808
  − lg sin c = 9.9976
    lg sin α = 9.6832
    α_1 =  28.83°
    α_2 = 151.17°

12.2-13 Solution of a spherical triangle ABC, given two sides a = 42.7°, c = 84° and one opposite angle γ = 135°




In the right-angled triangles ABF and CBF the hypotenuse and an angle are given; by Napier’s rules one obtains:

1. cos α = cot c tan AF            2. cos (180° − γ) = cot a tan CF
   tan AF = cos α tan c               tan CF = cos (180° − γ) tan a

    lg cos α  = 9.9425                lg cos (180° − γ) = 9.8495
  + lg tan c  = 0.9784              + lg tan a          = 9.9651
    lg tan AF = 0.9209                lg tan CF         = 9.8146
    AF = 83.16°                       CF = 33.12°

Thus, the side b has magnitude AF − CF = 50.04°. By the sine rule applied to the triangle ABC the angle β is given by sin β = sin b sin γ/sin c:

    lg sin b = 9.8845
  + lg sin γ = 9.8495
               9.7340
  − lg sin c = 9.9976
    lg sin β = 9.7364
    β_1 =  33.02°
    β_2 = 146.98°

Because β must be less than γ, β_1 = 33.02° is the only solution.
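The altitude method of this example can be sketched as follows. The code is an illustration only and covers just the configuration of the example: γ obtuse, a and c acute with a < c, so that the foot F lies beyond C and b = AF − CF. The function name and the final cross-check against the cosine rule for sides are additions, not part of the text.

```python
import math

def solve_by_altitude(a_deg, c_deg, gamma_deg):
    """Solve the triangle from a, c and obtuse gamma via the altitude h = BF.

    Hypothetical helper; assumes gamma > 90 deg, a < c < 90 deg.
    Returns (alpha, b, beta) in degrees.
    """
    a, c, gamma = (math.radians(x) for x in (a_deg, c_deg, gamma_deg))
    # Sine rule; the acute root is taken because a < c implies alpha < gamma.
    alpha = math.asin(math.sin(gamma) * math.sin(a) / math.sin(c))
    af = math.atan(math.cos(alpha) * math.tan(c))             # cos alpha = cot c tan AF
    cf = math.atan(math.cos(math.pi - gamma) * math.tan(a))   # cos(180 deg - gamma) = cot a tan CF
    b = af - cf                                               # F lies outside the triangle
    # Sine rule again; beta < gamma, so the acute root applies.
    beta = math.asin(math.sin(b) * math.sin(gamma) / math.sin(c))
    return tuple(math.degrees(x) for x in (alpha, b, beta))

def cosine_rule_residual(a_deg, b_deg, c_deg, gamma_deg):
    """cos c - (cos a cos b + sin a sin b cos gamma); zero for a consistent triangle."""
    a, b, c, gamma = (math.radians(x) for x in (a_deg, b_deg, c_deg, gamma_deg))
    return math.cos(c) - (math.cos(a) * math.cos(b)
                          + math.sin(a) * math.sin(b) * math.cos(gamma))

if __name__ == "__main__":
    alpha, b, beta = solve_by_altitude(42.7, 84.0, 135.0)
    print(f"alpha = {alpha:.2f} deg, b = {b:.2f} deg, beta = {beta:.2f} deg")
    print(cosine_rule_residual(42.7, b, 84.0, 135.0))
```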


Isosceles spherical triangles. If two sides of the spherical triangle ABC are equal to one another, for example, if a = b, then the triangle is isosceles. Let F be the foot of the altitude h to the third side (Fig.). The calculation of the altitude h by applying Napier’s rules to the right-angled triangle AFC with b and α must give the same value of h as from triangle BFC with a and β; since a = b it follows that α = β. With a = b and α = β, it then follows that AF and BF have the same value, and likewise γ_1 = γ_2. Thus, the relationships known in plane geometry also hold here.

12.2-14 An isosceles spherical triangle

In an isosceles spherical triangle the altitude to the base bisects the base and the angle opposite 
to it. It is the perpendicular bisector of the base and a line of symmetry of the triangle. The base 
angles are equal to one another. 


A corresponding theorem holds for a spherical triangle with two equal angles. Such a triangle is 
also isosceles. 


12.3. Applications of spherical trigonometry 


Among the applications of spherical trigonometry, two call for special attention because of their 
practical importance. They are the applications to mathematical geography and to astronomy. 


Mathematical geography 


The form of the earth is, in fact, irregular and is called a geoid. However, its deviations from bodies that are amenable to mathematical calculation are small in relation to its size. The analysis of the paths of artificial earth satellites has shown that a suitable ellipsoid with three axes gives the best fit for the geoid. In fact, the difference between the two axes lying in the equatorial plane is so small that it has not so far been determined by earth measurements. Accordingly in higher geodesy the earth is regarded as a spheroid (ellipsoid of rotation). The first precise calculations were made by Friedrich Wilhelm BESSEL (1784-1846). In 1924 the ellipsoid calculated by J. HAYFORD (1868-1925) was internationally recognized. The most recent values were given by F. N. KRASOVSKII (1878-1948); they are used for work in geodesy in the USSR.


Earth ellipsoid    Equatorial radius a         Polar radius b              Flattening (a − b)/a
                   km          miles           km          miles
Hayford            6378.388    3964.194        6356.912    3950.846        1/297
Krasovskii         6378.245    3964.105        6356.863    3950.816        1/298.3

In a first approximation the earth may be regarded as a sphere of mean radius R = 6371.221 km [lg R = 3.8042227] or R = 3959.740 miles.
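As a quick plausibility check, the flattening values in the table follow directly from the tabulated radii. The sketch below merely recomputes a/(a − b); the dictionary and function names are illustrative.

```python
# Radii in km as given in the table: (equatorial radius a, polar radius b).
ELLIPSOIDS = {
    "Hayford": (6378.388, 6356.912),
    "Krasovskii": (6378.245, 6356.863),
}

def inverse_flattening(a_km, b_km):
    """Return a / (a - b), the reciprocal of the flattening (a - b) / a."""
    return a_km / (a_km - b_km)

if __name__ == "__main__":
    for name, (a_km, b_km) in ELLIPSOIDS.items():
        print(f"{name}: flattening = 1/{inverse_flattening(a_km, b_km):.1f}")
```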




Units of measurement on the earth sphere

1° on a great circle ................................................ 111.20 km = 69.111 miles
1° on the equator ................................................... 111.32 km = 69.186 miles
1 geographical mile = 1/15 equatorial degree ........................ 7.422 km = 4.613 miles
arc length of a meridian quadrant ................................... 10 002.288 km = 6 216.462 miles
mean arc length of a meridian degree ................................ 111.137 km = 69.072 miles
1 nautical mile or 1 mean minute on a longitude circle .............. 1.852 km
1 knot = 1 nautical mile/hour ....................................... 1.852 km/h


A point on the earth’s surface is determined by its longitude λ and its latitude φ, as has already been described in the derivation of the Gauss-Krüger coordinates in Chapter 11. The meridians are great circles, but the latitude circles (apart from the equator) are not; their radius ρ is given by ρ = R cos φ. Distances on the earth’s surface are measured along great circles, because they are geodesic lines and represent the shortest connections on the sphere. Bearings (or courses) are angles made with the meridian.


Determination of distance and course. If two places P_1 and P_2 on the earth are given by their longitudes λ_1, λ_2 and latitudes φ_1, φ_2, then the great circle distance between them and the angles between this circle and the meridians through P_1 and P_2 can be calculated. The formulae developed in the basic problems and Napier’s rules are available for the solution.


Example: If an aircraft flies with air speed 800 km/h = 497.2 m.p.h. from Leningrad (φ_L = 59.9° N; λ_L = 30.3° E = −30.3°) to San Francisco (φ_F = 37.8° N; λ_F = 122.4° W) by the shortest route, then its path is the arc LF of the great circle through L and F (Fig.). On each of the meridians through the two places the arc from the equator to the place is given as the geographical latitude. The meridian arc from the place to the north pole N has magnitude (90° − φ) and the two arcs LN = 90° − φ_L, FN = 90° − φ_F form with the great circle arc LF a spherical triangle in which the angle Δλ between the two meridians is known; Δλ = λ_F − λ_L = 122.4° + 30.3° = 152.7°. In the spherical triangle two sides and the included angle are given. The great circle arc g = LF is found by the cosine rule for sides:

cos g = cos (90° − φ_L) cos (90° − φ_F) + sin (90° − φ_L) sin (90° − φ_F) cos Δλ,
cos g = sin φ_L sin φ_F + cos φ_L cos φ_F cos Δλ.

12.3-1 Flight path from Leningrad L to San Francisco F (schematic)

In the following table n marks the logarithm of a negative number.

    lg sin φ_L = 9.9371        lg cos φ_L = 9.7003          lg 2π   = 0.7982
  + lg sin φ_F = 9.7874      + lg cos φ_F = 9.8977        + lg R    = 3.8042
    lg u       = 9.7245      + lg cos Δλ  = 9.9487 n      + lg g    = 1.9017
    u          = 0.5304        lg v       = 9.5467 n                  6.5041
                               v          = −0.3522       − lg 360° = 2.5563
    cos g = u + v = +0.1782                                 lg (2πRg/360°) = 3.9478
    g = 79.74°                                              2πRg/360° = 8868 km = 5511.5 miles

This arc has the length 2πRg/360° = 8868 km, where the approximate value 6371 km (3960 miles) is taken for R, and could therefore be traversed at the given speed in about 11 hours (11.08 h).
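The distance calculation of this example can be reproduced directly from the cosine rule for sides. The following sketch assumes the sign convention of the example (west longitudes positive, east longitudes negative) and the rounded radius R = 6371 km; the function and constant names are illustrative.

```python
import math

R_KM = 6371.0  # approximate mean earth radius used in the example

def great_circle_arc_deg(phi1_deg, lam1_deg, phi2_deg, lam2_deg):
    """Great-circle arc g in degrees between two places, by the cosine rule for sides."""
    phi1, phi2 = math.radians(phi1_deg), math.radians(phi2_deg)
    dlam = math.radians(lam2_deg - lam1_deg)
    cos_g = (math.sin(phi1) * math.sin(phi2)
             + math.cos(phi1) * math.cos(phi2) * math.cos(dlam))
    cos_g = max(-1.0, min(1.0, cos_g))     # guard against rounding just outside [-1, 1]
    return math.degrees(math.acos(cos_g))

def arc_length_km(arc_deg, radius_km=R_KM):
    """Length of an arc of arc_deg degrees on a sphere: 2 pi R g / 360."""
    return 2.0 * math.pi * radius_km * arc_deg / 360.0

if __name__ == "__main__":
    g = great_circle_arc_deg(59.9, -30.3, 37.8, 122.4)   # Leningrad, San Francisco
    distance = arc_length_km(g)
    print(f"g = {g:.2f} deg, distance = {distance:.0f} km, "
          f"flight time = {distance / 800.0:.2f} h")
```

The result agrees with the logarithmic calculation to within the accuracy of four-figure tables. For very short distances the cosine rule becomes numerically ill-conditioned, which is not an issue for an arc of this size.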

The angles α and β in the spherical triangle are given by the sine rule:

sin α = cos φ_L sin Δλ/sin g,
sin β = cos φ_F sin Δλ/sin g.

    lg cos φ_F = 9.8977        lg cos φ_L = 9.7003
  + lg sin Δλ  = 9.6615      + lg sin Δλ  = 9.6615
  − lg sin g   = 9.9930      − lg sin g   = 9.9930
    lg sin β   = 9.5662        lg sin α   = 9.3688
    β = 21.61°                 α = 13.52°

The aircraft leaves Leningrad on a course N 21.61° W and arrives at San Francisco on a course S 13.52° W. Each meridian cuts the flight path at a different angle. The course angle steadily increases by 144.87° from N 21.61° W to the final course S 13.52° W. At one point H of the flight path the aircraft is flying due west. It is then at its nearest point to the North Pole. The point H is the foot of the perpendicular h from the pole to the side LF. The altitude h divides the triangle LNF into two right-angled triangles. In the triangle LNH the distance h from the pole and the angle λ_1 = ∢LNH can be determined. On the meridian λ_H = 40.78° W the aircraft is flying due west; it is then at its closest position to the pole, 1183 km away from it. It crosses the latitude circle of Leningrad later at the point B, at the same angle as in Leningrad; its course at this point is therefore S 21.61° W. The meridian of the point B is given by λ_B = λ_H + λ_1 = 40.78° + 71.08° = 111.86° W; thus, the geographical coordinates of the point B are φ_B = 59.9° N and λ_B = 111.86° W. The arc BH = LH can be found from the right-angled triangle LNH using Napier’s rules.
cos (90° − φ_L) = cot λ_1 cot β,            cot λ_1 = tan β sin φ_L;
cos (90° − h) = sin (90° − φ_L) sin β,      sin h = cos φ_L sin β;
cos β = tan LH tan φ_L,                     tan LH = cot φ_L cos β.

    lg tan β   = 9.5978        lg sin β   = 9.5662        lg cot φ_L = 9.7632
  + lg sin φ_L = 9.9371      + lg cos φ_L = 9.7003      + lg cos β   = 9.9684
    lg cot λ_1 = 9.5349        lg sin h   = 9.2665        lg tan LH  = 9.7316
    λ_1 = 71.08°               h = 10.64°                 LH = 28.33°

    lg 2πR         = 4.6024
  − lg 360°        = 2.5563
    lg (2πR/360°)  = 2.0461

    lg (2πR/360°) = 2.0461          lg (2πR/360°) = 2.0461
  + lg h          = 1.0269        + lg LH         = 1.4522
                    3.0730                          3.4983
    h = 1183 km ≈ 735 miles         LH = 3150 km ≈ 1958 miles
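The courses and the point of closest approach to the pole can be retraced in the same way. The sketch below follows the text's method (sine rule, then Napier's rules in the triangle LNH). It assumes that both course angles are acute, as they are here, and again takes west longitudes as positive; all names are illustrative.

```python
import math

def courses_and_closest_approach(phi_l_deg, lam_l_deg, phi_f_deg, lam_f_deg,
                                 radius_km=6371.0):
    """Course angles at L and F and the point H nearest the pole on the arc LF.

    Hypothetical helper; assumes the angles alpha (at F) and beta (at L) are acute.
    Returns a dict with angles in degrees and the pole distance in km.
    """
    phi_l, phi_f = math.radians(phi_l_deg), math.radians(phi_f_deg)
    dlam = math.radians(lam_f_deg - lam_l_deg)
    g = math.acos(math.sin(phi_l) * math.sin(phi_f)
                  + math.cos(phi_l) * math.cos(phi_f) * math.cos(dlam))
    beta = math.asin(math.cos(phi_f) * math.sin(dlam) / math.sin(g))    # angle at L
    alpha = math.asin(math.cos(phi_l) * math.sin(dlam) / math.sin(g))   # angle at F
    h = math.asin(math.cos(phi_l) * math.sin(beta))             # sin h = cos phi_L sin beta
    lam_1 = math.atan2(1.0, math.tan(beta) * math.sin(phi_l))   # cot lam_1 = tan beta sin phi_L
    lh = math.atan(math.cos(beta) / math.tan(phi_l))            # tan LH = cot phi_L cos beta
    return {
        "alpha": math.degrees(alpha),
        "beta": math.degrees(beta),
        "h": math.degrees(h),
        "lambda_1": math.degrees(lam_1),
        "lambda_H": lam_l_deg + math.degrees(lam_1),
        "LH": math.degrees(lh),
        "h_km": radius_km * h,
    }

if __name__ == "__main__":
    for key, value in courses_and_closest_approach(59.9, -30.3, 37.8, 122.4).items():
        print(f"{key} = {value:.2f}")
```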


Only at the point B, that is, after travelling a distance of LB = 6300 km or after a flight time of 7 h 52 min 30 s (7.875 h), does the aircraft turn for the first time towards more southerly latitudes. The aircraft could also have reached the point B by flying along the latitude circle φ = 59.9° N that goes through Leningrad. This course would have cut all the meridians at right angles. However, the path b from L to B would have been longer, since it would not have been along a geodesic (great circle as shortest connection) but along a loxodrome (curve of constant bearing). The radius of the latitude circle is ρ = R cos φ_L. The arc b subtends an angle Δλ at the centre of this circle, and thus b = 2πR cos φ_L Δλ/360°. One obtains b = 8516 km (5293 miles) instead of 6300 km (3915 miles) along the geodesic, the difference being 2216 km (1378 miles). It would have taken the aircraft about 2 h 45 min longer to fly along the latitude circle.

    lg (2πR/360°) = 2.0461
  + lg cos φ_L    = 9.7003
  + lg Δλ         = 2.1838
    lg b          = 3.9302
    b = 8516 km = 5293 miles

A body that describes the same path, but with the speed v = 8 km/s (4.97 m.p.sec.) of an artificial earth satellite, needs 787.5 s = 13 min 7.5 s for the arc LB and reaches San Francisco after 1108.5 s or 18 min 28.5 s, if friction is neglected. Its path cuts the equator Q in two points E_1 and E_2 which, as intersections of two great circles, lie on a diameter of the sphere. If the intersection of the meridian of San Francisco with the equator Q is denoted by C, then the spherical triangle E_1CF is right-angled at C. In this triangle the angle 13.52° at F and the side CF = φ_F are known, and the side CE_1 can be found by Napier’s rules from tan E_1C = sin φ_F tan 13.52°.

    lg tan 13.52° = 9.3811
  + lg sin φ_F    = 9.7874
    lg tan E_1C   = 9.1685
    E_1C = 8.39°

The point E_1 has coordinates φ_E1 = 0, λ_E1 = λ_F + 8.39° = 130.79° W and consequently E_2 has coordinates φ_E2 = 0, λ_E2 = (130.79° + 180°) W = 310.79° W or λ_E2 = 49.21° E.


Loxodromes. The advantage for a ship or an aircraft of travelling along a geodesic to reach its destination in the shortest time is offset by the disadvantage that the course must be altered throughout the journey, strictly speaking at every instant. A curve that cuts all the meridians at the same bearing angle α is called a loxodrome. A latitude circle is a loxodrome for the bearing α = 90°, a meridian one for α = 0°. In the general case, for an arbitrary angle α, there is a curve for which a transcendental function gives the relationship between the latitude φ and the longitude λ at every point.

If one considers two neighbouring points A and B (Fig.) with the coordinates (λ, φ) and (λ + Δλ, φ + Δφ) on a loxodrome l, and the latitude circle with radius ρ = R cos φ through the point A, then the arcs AC = R cos φ Δλ, CB = R Δφ and AB = Δs form a right-angled triangle ABC. This, however, is not a spherical triangle (only R Δφ lies along a great circle), but it may be regarded as plane if Δφ and Δλ are chosen sufficiently small. The following relationships can be read off in this triangle:

tan α = Δλ cos φ/Δφ,  Δλ/Δφ = tan α/cos φ  and  Δs cos α = R Δφ,  Δs/Δφ = R/cos α.




In the limit as Δφ → 0 these equations tend to two differential equations dλ/dφ = tan α/cos φ and ds/dφ = R/cos α, in which the variables can easily be separated. By integration the first yields the equation of the loxodrome:

dλ = tan α (dφ/cos φ),  λ = tan α [ln tan (π/4 + φ/2) + C],
λ_2 − λ_1 = tan α [ln tan (π/4 + φ_2/2) − ln tan (π/4 + φ_1/2)].

12.3-2 Derivation of the loxodrome

The second gives the arc length s of the loxodrome:

ds = (R/cos α) dφ,  s = (R/cos α) (φ_2 − φ_1).


As a closer examination of the first equation shows, the loxodrome circulates about the pole of the great circle that cuts the initial meridian at right angles at the starting point, and spirals around it in ever smaller windings infinitely often, without reaching it (an asymptotic point). The change in the latitude during one rotation becomes steadily smaller (Fig.).

12.3-3 Loxodrome

From the second equation, the flight path of an aircraft from a point on the equator where it cuts the meridian at the course angle α to the point at which it reaches latitude φ has the length s = Rφ/cos α. It is therefore the longer, the greater the angle α. For a flight to the north pole (φ = π/2) at a constant bearing α = 60°, for example, it follows that s = 2Rπ/2 (since cos 60° = 1/2), while the shortest route along a meridian is of length s = Rπ/2. Thus, the path along the loxodrome is twice as long.
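The two integrated equations translate directly into code. The following sketch evaluates the longitude difference and the arc length of a loxodrome; the function names are illustrative, and latitudes must stay strictly below 90° in the longitude formula because ln tan (π/4 + φ/2) grows without bound at the pole.

```python
import math

def log_tan_term(phi_rad):
    """ln tan(pi/4 + phi/2), the function appearing in the loxodrome equation."""
    return math.log(math.tan(math.pi / 4.0 + phi_rad / 2.0))

def loxodrome_longitude_difference_deg(alpha_deg, phi1_deg, phi2_deg):
    """lambda_2 - lambda_1 in degrees along a loxodrome of constant bearing alpha."""
    alpha = math.radians(alpha_deg)
    phi1, phi2 = math.radians(phi1_deg), math.radians(phi2_deg)
    return math.degrees(math.tan(alpha) * (log_tan_term(phi2) - log_tan_term(phi1)))

def loxodrome_length(radius, alpha_deg, phi1_deg, phi2_deg):
    """Arc length s = (R / cos alpha) (phi_2 - phi_1), with latitudes in radians."""
    alpha = math.radians(alpha_deg)
    return radius / math.cos(alpha) * (math.radians(phi2_deg) - math.radians(phi1_deg))

if __name__ == "__main__":
    R = 6371.0  # km, approximate mean radius
    print(loxodrome_length(R, 60.0, 0.0, 90.0))   # bearing 60 deg, equator to pole
    print(loxodrome_length(R, 0.0, 0.0, 90.0))    # meridian quadrant
    print(loxodrome_longitude_difference_deg(60.0, 0.0, 60.0))
```

The first two printed lengths stand in the ratio 2 : 1, in line with the statement that the 60° loxodrome to the pole is twice as long as the meridian route.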


Determination of position by a fix. By means of a fix, the position of a ship, aircraft, or other body 
is determined from the directions from which signals are received that are propagated in straight 
lines and are as a rule not optical. Geometrically this is based on the same scheme as forward and 
backward sections in plane trigonometry. There are two kinds of fix: in one the directions of signals 
emitted from the object to be located are determined by two fixed ground stations and from these 
the coordinates of the object are calculated, and in the other the signals are emitted by two known 
ground stations and their directions observed and the position calculated at the object itself. 

In practice radio signals are almost always used. Whereas in surveying the precision of the angles can be increased by repeated measurements and the most probable value for the result calculated by the method of least squares, the fix rests upon a single measurement of the directions, which are of lesser precision. For this reason physical properties of the waves employed, for example, interference, oscillations, or other methods such as radar, are also made use of. Above all, one is almost always concerned with a moving object. The position must therefore be determined by means of tables, by graphical methods, or by electronic apparatus, so that the result is available while the object of the fix is still in the neighbourhood of the position required. Thus, in practice, making a fix has become a physical and technical problem. It is sufficient to describe here a simple graphical procedure.


A graphical method of making a fix. In order to obtain a clear picture, let the basis b be 60°. Let the measured angles β_1 and β_2 between the basis and the direction of the new point C at the points B_1 and B_2 be β_1 = 60° and β_2 = 110°. The plane projection of the earth can be chosen in such a way that the image of the great circle through B_1 and B_2 is a circle (Fig.); ∢B_1MB_2 = 60°. The projection of the great circle through B_1 (and its diametrically opposite point B_1′) and also through the required point C is an ellipse, whose major axis is |B_1B_1′| and whose minor axis is therefore the perpendicular to B_1B_1′ through M. The length of the semi-minor axis is the projection of the radius R = |MG′| of the sphere, and is given as the intersection of two planes: firstly, the plane of the great circle through B_1CB_1′ that is inclined at an angle β_1 to the plane of the diagram Π, and secondly, the plane through M that is perpendicular to Π and B_1B_1′. If G is the projection of G′, then △MG′G is a right-angled triangle whose hypotenuse |MG′| = R and angle G′MG = β_1 are known, where MG is perpendicular to B_1B_1′. In the figure this triangle is folded into △MGG_0 in the plane of the diagram, and gives the length |MG| of the semi-minor axis (in general G_0 ≠ H_0). From the semi-major axis MB_1 and the semi-minor axis MG every point on the ellipse (the projection of the great circle through B_1, G and B_1′) can be constructed with arbitrary precision. For the great circle through the points B_2 and B_2′, whose plane is inclined at an angle β_2 to the plane of the diagram, a similar result holds. Thus, one makes the following sequence of constructions: the perpendicular at M to B_2B_2′; marking out of the angle β_2; the intersection H_0 of its free arm with the circle of radius R; the perpendicular from H_0 to the perpendicular at M to B_2B_2′ gives H; then |MH| gives the position and magnitude of the semi-minor axis of the required ellipse. The intersection C of the two ellipses is the required point. To determine the true values of the sides s_1 = B_2C and s_2 = B_1C their great circles need only be rotated about a diameter into the plane of the diagram. The projection of each point of each circle moves on a perpendicular to the axis of rotation. Thus, C moves to C_1 or C_2 and ∢B_1MC_1 = s_2, ∢B_2MC_2 = s_1.

12.3-4 Graphical method of making a fix

The angle of inclination γ of the planes of the two great circles indicated can also be deduced from the figure. The two planes intersect in the straight line CMC′. The polar circle with C as pole cuts both great circles at right angles, in the points D and E. The arc DE corresponds to the angle γ. By rotating this polar circle about a diameter into the plane of the diagram the true magnitude of the angle γ can be read off: ∢E_0MD_0 = γ.


Spherical astronomy 


Apart from the method of the fix, the positions of ships and aircraft are, even today, found by 
means of the stars. It was once the only method of navigation on the high seas. Explorers in un- 
known lands relied on them alone. The necessary measurements were made with the compass, the 
theodolite, a mirror sextant or similar angle-measuring instrument and an accurate clock. Later 
wireless telegraphy was used to transmit time signals to check the clocks. Knowledge of the most 
important constellations is enough for an approximate orientation. For precise determination of 
position one must know data concerning the position of easily located stars and the motion of the 
sun, the planets, the moon and Jupiter’s moons, and the astronomical coordinate systems in which 
positions in the heavens are given. The data from spherical astronomy that are important for the 
purposes of navigation appear in the nautical and astronomical almanacs; the astronomical co- 
ordinate systems that are indispensable for navigation are the horizontal and the equatorial systems. 

Like all astronomical coordinate systems, these are based on the fact that the starry sky appears 
to an observer as a portion of a gigantic sphere called the celestial sphere. The position of each 
point on it can be fixed by two numerical coordinates (corresponding to longitude and latitude 
on the earth’s surface). Any great circle with its poles (pole and polar) is suitable as a reference 
system for these two coordinates. One angle is measured on this circle in a prescribed sense from a 
fixed point; the second is measured on a perpendicular great circle through the point whose position 
is to be fixed and the pole of the basic circle. 


The horizontal system. To an observer O on the sea or in flat country the night sky appears as a hemisphere bounded by the horizon H (Fig.). Mathematically the (apparent) horizon is the circle in which a tangent plane to the earth at the point of observation cuts the celestial sphere. In relation to the distances of most of the stars the radius of the earth is negligibly small. The apparent horizon therefore coincides with the true horizon, which is the intersection of the celestial sphere with a plane through the centre of the earth, parallel to the tangent plane. The poles of the horizon are the zenith Z vertically above the observer and the nadir Na diametrically opposite to the zenith. To the observer the motion of a star appears to follow a path that begins on the horizon (it is said to rise at A), ascends to a peak, the upper culmination or transit point C, and then falls again to descend at D and finally sinks below the horizon and passes through the lower culmination point lC. The great circle through the culmination points of all stars is called the celestial meridian m.

There also exist stars, the circumpolar stars cS, whose path lies entirely above the horizon. During one day all stars describe a small circle on the celestial sphere. Their paths are parallel to one another. The centres of these circles all lie on a straight line that forms the axis of the celestial sphere. This cuts the celestial sphere in two points, the celestial North Pole P_N and the celestial South Pole P_S. This apparent circular motion of the stars is a consequence of the rotation of the earth about its axis, and the celestial poles remain at rest because the earth’s axis points towards them. The direction from the observer to the celestial pole is parallel to the earth’s axis; for an observer at the North Pole of the earth it is therefore perpendicular to the horizon and for an observer on the earth’s equator it is horizontal. If one imagines that the tangent plane slides along an earth’s meridian from the equator to the North Pole, then the altitude of the celestial pole increases steadily from 0° to 90° and is always equal to the geographical latitude. The altitudes h of a star St are measured on the great circles through zenith and nadir, the verticals V, which are perpendicular to the horizon, and vary from 0° on the horizon to +90° at the zenith and −90° at the nadir. Measurement of the altitude of the celestial pole gives the geographical latitude φ of the observer. The intersections of the vertical through the celestial pole with the horizon are called the north point N and the south point S diametrically opposite to it. If the observer looks towards the north point, then to his right, at right angles to the line of vision, is the east point E and to his left, the west point W. These four points are called the cardinal points and their directions are the celestial directions north, south, east and west. They can be determined by dropping a perpendicular from the celestial North Pole or by determination of the vertical on which an arbitrary fixed star culminates; this bisects the angle between two verticals on which the fixed star has the same altitude.

Together with the altitude h, the azimuth a serves as second coordinate. This is measured at the position of the observer as the angle between the meridian plane and the vertical plane of the star St, and varies from 0° at the south point to 360°, in the sense west, north, east of the (apparent) daily motion of the star. Consequently the azimuth appears also as an arc on the horizon and as an angle at the zenith point. In place of the altitude the complementary angle is often measured, the zenith distance z of the star from the zenith: h + z = 90°.


12.3-5 Horizontal system
12.3-6 Equatorial system


Equatorial systems. Because all stars move on parallel circles about the celestial pole, their distance from each of these circles must remain constant. As circle of reference one chooses the great circle among them, which is polar to the celestial pole. It is called the celestial equator Q, because it is the line of intersection of the plane of the earth’s equator with the celestial sphere. On the sphere it runs approximately through the constellation Pisces, the upper one of the three stars in Orion’s belt and the star Altair in the constellation Aquila. The equator cuts the horizon at the west point and at the east point (Fig.) and is inclined at an angle (90° − φ) to the horizon H. The altitude of a star St above the equator is called its declination δ; it is measured on a great circle called the hour circle, which passes through the celestial poles P_N and P_S and is therefore perpendicular to the equator. The hour angle t is taken as the second coordinate. This is the angle between this circle and the celestial meridian m on which the star culminates. This meridian passes through the south and north points, the zenith Z and nadir Na, and through the celestial poles P_N and P_S. It represents the intersection of the plane of the observer’s meridian with the celestial sphere. The hour angle is measured from the meridian in the sense of the apparent daily motion of the stars and takes values from 0° to 360°, or from 0 h to 24 h. Thus, the west point W has an hour angle of 90° or 6 h.

This first equatorial or hour angle system is independent of the geographical latitude of the observer’s position, because the declination is referred to the equator. The zero direction from which the hour angle is measured is, however, determined by the meridian of the observer and thus depends on his geographical longitude. The hour angle of the same star at the same time is, for example, greater for Moscow than for London, since the star culminates about 2 h 29.2 min = 149.2 min earlier in Moscow than in London because of the rotation of the earth. Since 24 h corresponds to an angle of 360°, that is, 4 min to 1°, the difference between the hour angles is 149.2/4 = 37.3°. Thus, Moscow lies further east than London by Δλ = 37.3°. To make the second coordinate in the equatorial system independent of the position of the observer, one selects a reference point on the celestial equator. This point, denoted by ♈, is called the vernal equinox (or first point of Aries). As a point of the equator it takes part in the apparent rotation of the celestial sphere. The angle measured from it along the equator in the opposite sense to that of the apparent rotation is therefore constant. It is called the right ascension α. Right ascension α and declination δ are the coordinates of the second equatorial system (or right ascension system). The approximate position of the vernal equinox is found by extending the hour circle from the pole star (P_N) through the right-hand end of the W-shaped constellation Cassiopeia to meet the celestial equator.


The relations between the horizontal and the hour angle systems. If one combines the two astronomical 
systems (Fig.), the horizon and the equator intersect at the east and west points. Through the star 
St pass the hour circle and the vertical. The path of the star runs parallel to the equator; it reaches 
its upper and lower culmination points at C and /C, respectively. A is the point at which it ascends, 
D is the point at which it descends. The altitude of the celestial pole P_N above the horizontal plane
is the geographical latitude of the position O of the observer; arc NP_N = φ. The figure arises as the
orthogonal projection of the figure Equatorial systems on the plane of the meridian m through
N, P_N, Z, C, Q and S. The hour circle of the vernal equinox ♈ is not indicated, but the vertical of
the star St through zenith Z and nadir Na is shown. The points E and A lie behind the points W
and D and are therefore not visible. The angles φ, (90° − φ) and δ appear with their true magnitude.

Culmination altitude. If a star St culminates at C, it reaches its greatest altitude h_max and at the
same time its smallest zenith distance z_min at that point. Since the equator Q makes an angle (90° − φ)
with the horizon, it follows that φ = δ + z_min and the culmination altitude is given by h_max + z_min
= 90°, or h_max = 90° − φ + δ. Consequently from the observed culmination altitude h_max of a
star one can determine either the latitude φ for a known declination δ or, conversely, the declination δ
for a known latitude φ.
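
The relation h_max = 90° − φ + δ can be tried out numerically. The following is only a minimal
sketch in Python; the function names are illustrative and do not come from the text, and all angles
are assumed to be given in degrees.

```python
def culmination_altitude(latitude_deg, declination_deg):
    """Altitude at upper culmination: h_max = 90 - phi + delta (degrees)."""
    return 90.0 - latitude_deg + declination_deg


def latitude_from_culmination(h_max_deg, declination_deg):
    """Solve h_max = 90 - phi + delta for the latitude phi."""
    return 90.0 + declination_deg - h_max_deg


def declination_from_culmination(h_max_deg, latitude_deg):
    """Solve h_max = 90 - phi + delta for the declination delta."""
    return h_max_deg + latitude_deg - 90.0
```

For instance, a culmination altitude of 35° observed for a declination of −10.21° corresponds to a
latitude of about 44.79°.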

The nautical triangle. For the general position of the star St the two systems are connected by
means of the nautical triangle with vertices at the star St, the celestial pole P_N and the zenith Z.
It contains the following elements: the sides StZ = 90° − h (zenith distance), StP_N = 90° − δ,
ZP_N = 90° − φ and the angles at the vertices Z and P_N. Both the azimuth a with vertex Z and
the hour angle t with vertex P_N are measured from the meridian m in the sense of the daily rotation
of the stars. Because the position of St represented in the figure is after its culmination, the angles
appearing in the triangle have magnitudes ∠StZP_N = 180° − a and ∠ZP_NSt = t. If the portion
of the celestial sphere lying behind the meridian plane were represented in the figure, then the star
St would be in a position before culmination, W would be replaced by E and D by A, and the angles
in the nautical triangle would be given by ∠StZP_N = a − 180° and ∠ZP_NSt = 360° − t.
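
Applying the cosine rule for sides to the nautical triangle links the altitude h with φ, δ and t:
sin h = sin φ sin δ + cos φ cos δ cos t. The sketch below evaluates this relation in Python; the
function name is an assumption made for illustration, and angles are taken in degrees. Because
cos t = cos (360° − t), the same expression serves before and after culmination.

```python
import math


def altitude_deg(latitude_deg, declination_deg, hour_angle_deg):
    """Altitude h from the cosine rule for sides in the nautical triangle.

    Sides: 90 - h, 90 - phi, 90 - delta; the angle at the pole is t (or 360 - t).
    """
    phi = math.radians(latitude_deg)
    delta = math.radians(declination_deg)
    t = math.radians(hour_angle_deg)
    sin_h = (math.sin(phi) * math.sin(delta)
             + math.cos(phi) * math.cos(delta) * math.cos(t))
    return math.degrees(math.asin(sin_h))
```

For t = 0 the function reproduces the culmination altitude 90° − φ + δ.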


The path of the sun. When the sun is at the vernal equinox, day and night are of equal length; it
rises at 6 o'clock in the morning at the east point, moves across the sky approximately along the
celestial equator, and sets at 6 o'clock in the evening at the west point. Its right ascension α and
declination δ, however, unlike those of the fixed stars, are not constant. The right ascension
increases steadily, and the declination increases from December 22 to June 22. Because of the
increasing right ascension (Fig.) the sun reaches the celestial meridian every day later than the
vernal equinox. In the course of a year this time delay grows to one full day. Whereas the vernal
equinox and all the fixed stars culminate 366 times, the sun culminates only 365 times. Because of
the increasing declination of the sun, its ascending point A and descending point D shift to the
north, to A₁ and D₁.

12.3-7 Horizontal and first equatorial systems


The days become longer until the summer solstice. The sun then has its greatest declination of
δ = 23°26′ (tropic of Cancer). After this the declination decreases, is zero at the autumnal
equinox, at the winter solstice is −23°26′ (tropic of Capricorn), and at the vernal equinox it is
again zero. Altogether the apparent path of the sun in the sky is not a circle, as for the other
fixed stars, but a spiral of 365 windings described twice, occupying a zone of width 2 · 23°26′.
While each fixed star, almost without exception, has the same neighbouring stars throughout the
course of a year (with noticeable deviations in only a few cases), the sun wanders through
13 constellations, which were reduced to 12 on account of the duodecimal system. These
constellations lie on a great circle in the neighbourhood of the apparent yearly path of the sun,
called the ecliptic. The constellations are Aries, the Ram ♈; Taurus, the Bull ♉; Gemini, the
Twins ♊; Cancer, the Crab ♋; Leo, the Lion ♌; Virgo, the Virgin ♍; Libra, the Scales ♎; Scorpio,
the Scorpion ♏; Sagittarius, the Archer ♐; Capricornus, the Goat ♑; Aquarius, the Watercarrier ♒,
and Pisces, the Fishes ♓. The ecliptic cuts the equator at an angle of ε = 23°26′, at the vernal
equinox and its diametrically opposite point, the autumnal equinox.

12.3-8 Apparent motion of the sun in the course of …


12.3-9 The northern sky


This apparent motion of the sun along the ecliptic is the consequence of the motion of the earth 
about the sun. The 12 constellations lie in the plane of the earth’s orbit about the sun. A coordinate 
system with the ecliptic as polar circle has the plane of the earth’s orbit as reference plane. The star 
clusters of the Milky Way lie on a new great circle that forms the reference plane of the galactic 
system. It is the most suitable coordinate system to describe the distribution of the stars of the 
Milky Way (Fig.). 


The calculation of time. The measurement of time intervals requires clocks that are controlled 
and calibrated by processes that are as nearly as possible constant, mostly periodic. The rotation 
of the earth about its axis has proved to be very uniform; a fixed star or the first point of Aries ♈
on its apparent path on the celestial equator can serve as the pointer of a very accurate clock. How- 
ever, observations with quartz and atomic clocks have shown that this rotation is not completely 
uniform. The length of a day varies because of tidal friction and changes irregularly through mass 
displacements and other processes inside the earth, as well as through meteorological processes on 
its surface. 

The calculation of time is based, by international agreement, on the duration of the tropical 
rotation of the earth about the sun. The tropical year denotes the time between two successive 
passages of the sun through the first point of Aries. However, this period is also variable, but the 
variation is very small (a few seconds in 1000 years) and its magnitude is known. By choice of a 
definite period that is valid for a given point in time, a definite tropical year is chosen as a normal 
year. The time based on this is for purposes of calculation absolutely uniform and is called Ephemeris 
or Newtonian time, because it was used in astronomy for the calculation of the coordinates of the 
heavenly bodies called the Ephemerides. Accordingly the second, s, is fixed as the fraction
1/31 556 925.9747 of the tropical year for 1900, January 0, 12 o'clock Ephemeris time; according
to the calendar, 1900 January 0 is December 31, 1899.


Sidereal time. The time interval between two successive culminations of the first point of Aries ♈
is the sidereal day. It is subdivided into 24 h* (sidereal hours) each of 60 min* (sidereal minutes),
each of these of 60 s* (sidereal seconds). The sidereal day begins with the culmination of the vernal
equinox. The hour angle of the vernal equinox, expressed in time units and denoted by t_♈, is the
sidereal time. It is the same for all places on the same earth meridian (the local sidereal time), and
is greater for places further east and smaller for places further west. From the local sidereal times
t₁ and t₂ of two places at the same moment, the difference in longitude Δλ = (λ₂ − λ₁) of the two
places can be calculated. A sidereal time difference Δt = (t₂ − t₁) of 24 h* corresponds to a longitude
difference of Δλ = 360°, so that one sidereal hour corresponds to 15°, one sidereal minute to 15′
and one sidereal second to 15″. Conversely, a longitude difference of one degree corresponds to
24 h*/360 = 1 h*/15 = 4 min*. Thus, when the vernal equinox culminates in New York
(λ₁ = 73.5° W), the sidereal time in Rome (λ₂ = 12.3° E) is already (73.5 + 12.3) · 4 min*
= 85.8 · 4 min* = 343.2 min* = 5 h* 43.2 min*.
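
The conversion between longitude difference and sidereal time difference (4 min* per degree) is
easily mechanized. The following Python sketch is an illustration only; the function names and the
sign convention (east longitudes positive, west longitudes negative) are assumptions, not part of
the text.

```python
def sidereal_time_difference_min(longitude_west_place_deg, longitude_east_place_deg):
    """Sidereal minutes by which the eastern place is ahead of the western place.

    Longitudes are signed: east positive, west negative (assumed convention).
    One degree of longitude corresponds to 4 sidereal minutes.
    """
    return 4.0 * (longitude_east_place_deg - longitude_west_place_deg)


def split_hours_minutes(total_min):
    """Split a number of minutes into whole hours and remaining minutes."""
    hours = int(total_min // 60)
    return hours, total_min - 60 * hours


# New York (73.5 W) and Rome (12.3 E), as in the text above
difference = sidereal_time_difference_min(-73.5, 12.3)
hours, minutes = split_hours_minutes(difference)
```

With the values of the text the difference comes out as 343.2 min*, that is, 5 h* 43.2 min*.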


The solar day. Because the life of man is based to a large extent on the course of the sun, not only
the vernal equinox, but also the sun is used as a time pointer. For this purpose it has essential
disadvantages. Whereas the vernal equinox remains fixed relative to the equator, the sun's annual
course is round the ecliptic, and its speed is not constant because of the non-uniform motion of the
earth on its Kepler ellipse around the sun. For this reason, in addition to the true sun, a fictitious
body called the mean sun has been introduced, whose right ascension α_m increases uniformly
from 0° to 360° in the course of a year. The hour angle t_m of this mean sun determines a mean
solar time or, simply, mean time, in contrast to the hour angle t_t of the true sun, which determines
true solar time. The difference between the two is called the equation of time (E.T.): thus,
E.T. = t_t − t_m. The figure shows the position of the true sun when the mean sun culminates. For
example, if the equation of time is negative, so that t_m > t_t, then the mean sun is hurrying ahead
of the true sun and thus already culminates when the true sun is still east of the meridian.

12.3-10 Equation of time, t_t − t_m = E.T., in the course of a year

The ratio of the length of a sidereal day to that of a mean solar day can be obtained from the fact
that the tropical year contains 366.2422 sidereal days, but 365.2422 mean solar days. One obtains:
24 h mean solar time = 24 h* 3 min* 56.55536 s*;
24 h* = 23 h 56 min 4.09058 s mean solar time; 1 h mean solar time = 1.002737909 h*;
1 h* = 0.997269567 h mean solar time.
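
The conversion factors quoted above follow directly from the two day counts of the tropical year.
A short Python sketch, with illustrative function names that are not taken from the text, makes
the arithmetic explicit.

```python
SIDEREAL_DAYS_PER_TROPICAL_YEAR = 366.2422
MEAN_SOLAR_DAYS_PER_TROPICAL_YEAR = 365.2422


def solar_to_sidereal_hours(solar_hours):
    """Convert an interval of mean solar time into sidereal time (hours)."""
    return solar_hours * SIDEREAL_DAYS_PER_TROPICAL_YEAR / MEAN_SOLAR_DAYS_PER_TROPICAL_YEAR


def sidereal_to_solar_hours(sidereal_hours):
    """Convert an interval of sidereal time into mean solar time (hours)."""
    return sidereal_hours * MEAN_SOLAR_DAYS_PER_TROPICAL_YEAR / SIDEREAL_DAYS_PER_TROPICAL_YEAR


def to_hms(hours):
    """Split decimal hours into (whole hours, whole minutes, seconds)."""
    total_seconds = hours * 3600.0
    h = int(total_seconds // 3600)
    m = int((total_seconds - 3600 * h) // 60)
    s = total_seconds - 3600 * h - 60 * m
    return h, m, s
```

Calling to_hms(solar_to_sidereal_hours(24)) gives 24 h* 3 min* and about 56.555 s*, and
to_hms(sidereal_to_solar_hours(24)) gives 23 h 56 min and about 4.09 s.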


Time zones. Of course, both true and mean time are local times, and only places lying on the same 
earth meridian have the same local time. This fact of nature, so inconvenient for modern traffic, 
is made tolerable by dividing the earth’s surface into zones bounded by meridians of longitude at 
intervals 15° apart, and using the local mean time appropriate to the central meridian at all places 
within a zone. The local mean time of the meridian of Greenwich, λ = 0°, is called Universal Time
or Greenwich Mean Time (G.M.T.); that of the meridian λ = −15°, or λ = 15° E, is Mid-European
Time (M.E.T.).


Example 1: On the morning of November 18 a ship lies at latitude φ = 54°57′ N. The altitude
of the sun is observed to be h = 9°15′. The ship's chronometer gives G.M.T. = 8 h 58 min 20 s, the
declination of the sun is δ = −19°12′ and the equation of time is +14 min 50 s. On which meridian
does the ship lie?

In the nautical triangle zenith Z, pole P_N, sun S the three sides are known: ZS = 90° − h
= 80°45′, ZP_N = 90° − φ = 35°03′, SP_N = 90° − δ = 109°12′. For the difference t′ between
the hour angle t and 360° the half-angle formula gives

    sin (t′/2) = √{ sin (s − [90° − φ]) sin (s − [90° − δ]) / [sin (90° − φ) sin (90° − δ)] },

where s is half the sum of the three sides. The logarithmic calculation (each logarithm written
with 10 added) runs as follows:

    90° − h        =  80°45′
    90° − φ        =  35°03′
    90° − δ        = 109°12′
    2s             = 225°00′
    s              = 112°30′

    s − [90° − φ]  =  77°27′      log sin          =  9.98950
    s − [90° − δ]  =   3°18′      log sin          = +8.76015
                                  sum              =  8.74965
    90° − φ        =  35°03′      log sin          = −9.75913
    90° − δ        = 109°12′      log sin          = −9.97515
                                  2 log sin (t′/2) =  9.01537
                                  log sin (t′/2)   =  9.50768

    t′/2 = 18°46′34″
    t′   = 37°33′08″ = 2 h 30 min 13 s

Thus, the observation fixes the time as 2 h 30 min 13 s before the culmination of the true sun,
that is, at 12 h − 2 h 30 min 13 s = 9 h 29 min 47 s true local time. But the local mean time is
14 min 50 s less, or 9 h 14 min 57 s. The difference between this and G.M.T. is
9 h 14 min 57 s − 8 h 58 min 20 s = 16 min 37 s, or (16 min 37 s)/4 in degrees = 4°9′15″.
The ship's position is λ = 4°9′15″ E and φ = 54°57′ N.
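
The computation of Example 1 can be repeated without logarithm tables. The Python sketch below
applies the same half-angle formula directly; the helper names are illustrative assumptions, and
small differences in the last digits compared with the tabular calculation are to be expected.

```python
import math


def dms(degrees, minutes=0.0, seconds=0.0):
    """Sexagesimal value (degrees or hours) as a decimal number; non-negative input only."""
    return degrees + minutes / 60.0 + seconds / 3600.0


def pole_angle_deg(zenith_distance, colatitude, polar_distance):
    """Angle at the pole of the nautical triangle from its three sides (degrees).

    Half-angle formula: sin(t'/2) = sqrt(sin(s-b) sin(s-c) / (sin b sin c)),
    where b and c are the sides adjacent to the pole and s is the half-sum of the sides.
    """
    a, b, c = (math.radians(v) for v in (zenith_distance, colatitude, polar_distance))
    s = (a + b + c) / 2.0
    sin_half = math.sqrt(math.sin(s - b) * math.sin(s - c) / (math.sin(b) * math.sin(c)))
    return math.degrees(2.0 * math.asin(sin_half))


phi = dms(54, 57)        # latitude
h = dms(9, 15)           # observed altitude of the sun
delta = -dms(19, 12)     # declination of the sun

t_prime = pole_angle_deg(90 - h, 90 - phi, 90 - delta)
true_local_time = 12.0 - t_prime / 15.0               # hours, morning observation
mean_local_time = true_local_time - dms(0, 14, 50)     # subtract the equation of time
gmt = dms(8, 58, 20)
longitude_east_deg = (mean_local_time - gmt) * 15.0
```

The values obtained are t′ ≈ 37.55° and a longitude of about 4.16° E, in agreement with the
result 4°9′15″ E found above.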


Example 2: On a ship travelling on a calm sea north of the equator the sun's altitude of h₁ = 21.7°
is measured at 18 h 50 min G.M.T. The declination of the sun is obtained from the nautical almanac
as δ₁ = −10.15° and the equation of time as E.T. = +15 min 3 s. After steaming for 15.2 nautical
miles on the great circle determined by the course N 67.5° W, the ship observes that the sun
culminates at an altitude of h₂ = 35° with a declination δ₂ = −10.21° (Fig.). What are the
coordinates of the two positions of observation? The culmination altitude h₂ of the sun satisfies:

h_max = h₂ = 90° − φ₂ + δ₂, or φ₂ = 90° + δ₂ − h₂, that is, φ₂ = 44.79°.

12.3-11 Schematic representation for Example 2 of the sky (left) and of the earth's surface (right),
Q equator, m₀ meridian of Greenwich

The two observation points P₁ and P₂, together with the North Pole N determine a spherical
triangle P₁NP₂ on the earth's surface. With a course angle α = 67.5° at the point P₁, the ship
has travelled a distance P₁P₂ = 15.2 nautical miles = 15.2 · 1.852 km between the observation
points, and this corresponds to an arc s ≈ 0.253° of a great circle (one nautical mile corresponds
to 1′ of arc). The side P₂N opposite the course angle α is 90° − φ₂ = 45.21°; the sine rule,
sin Δλ = sin s sin α / sin (90° − φ₂), gives Δλ = 0.329°. In the same triangle Napier's analogy 2a)
gives:

tan [(90° − φ₁)/2] = tan [(90° − φ₂ − s)/2] sin [(α + Δλ)/2] / sin [(α − Δλ)/2],

and hence 90° − φ₁ = 45.3°, φ₁ = 44.7°.

In the nautical triangle ZP_NS₁ of the first observation point the three sides ZS₁ = 90° − h₁,
ZP_N = 90° − φ₁ and P_NS₁ = 90° − δ₁ are known. By the cosine rule for sides the difference t′
between 360° and the hour angle t can be calculated:

cos t′ = (sin h₁ − sin φ₁ sin δ₁)/(cos φ₁ cos δ₁).

One obtains t′ = 45.13° = 3.01 h = 3 h 0 min 36 s. At the first observation it was
12 h − 3 h 0 min 36 s = 8 h 59 min 24 s true local time, or 8 h 44 min 21 s local mean time, since
t_m = t_t − E.T. Relative to the local mean time of Greenwich the time difference is
18 h 50 min − 8 h 44 min 21 s = 10 h 05 min 39 s, or 10.094 h. Consequently the difference in
longitude is 10.094 · 15° = 151.41°. Thus, Greenwich lies east of P₁ and the longitude of P₁ is
λ₁ = 151.41° W and that of P₂ is λ₂ = λ₁ + Δλ = 151.74° W.
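
The chain of steps in Example 2 (culmination altitude, sine rule, Napier's analogy, cosine rule
for sides) can be followed in a short Python sketch. The variable and helper names are
illustrative assumptions; the arc s is taken as 1′ per nautical mile. Because no intermediate
value is rounded, the last digits differ slightly from those of the text.

```python
import math


def sind(x):
    return math.sin(math.radians(x))


def cosd(x):
    return math.cos(math.radians(x))


def tand(x):
    return math.tan(math.radians(x))


h1, delta1 = 21.7, -10.15      # first observation
h2, delta2 = 35.0, -10.21      # second observation (culmination)
course = 67.5                  # course angle alpha at P1
arc_s = 15.2 / 60.0            # 15.2 nautical miles as an arc in degrees

# Latitude of P2 from the culmination altitude
phi2 = 90.0 + delta2 - h2

# Longitude difference from the sine rule in the triangle P1 N P2
dlam = math.degrees(math.asin(sind(arc_s) * sind(course) / sind(90.0 - phi2)))

# Colatitude of P1 from Napier's analogy
half_side = math.degrees(math.atan(
    tand((90.0 - phi2 - arc_s) / 2.0)
    * sind((course + dlam) / 2.0) / sind((course - dlam) / 2.0)))
phi1 = 90.0 - 2.0 * half_side

# Hour angle difference t' from the cosine rule for sides
cos_t = (sind(h1) - sind(phi1) * sind(delta1)) / (cosd(phi1) * cosd(delta1))
t_prime = math.degrees(math.acos(cos_t))

true_local_time = 12.0 - t_prime / 15.0                  # hours
mean_local_time = true_local_time - (15.0 + 3.0 / 60.0) / 60.0
gmt = 18.0 + 50.0 / 60.0
lam1_west = (gmt - mean_local_time) * 15.0               # west longitude of P1
lam2_west = lam1_west + dlam                             # west longitude of P2
```

The sketch yields φ₂ = 44.79°, φ₁ ≈ 44.69°, Δλ ≈ 0.33° and west longitudes of about 151.4° and
151.7°, consistent with the rounded values obtained above.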


13. Analytic geometry of the plane 


13.1. Plane coordinate systems
      Parallel coordinate systems
      Polar coordinates
      Changing from one coordinate system to another
13.2. Point and line
      Segment and ratio of division
      Equations of a line
      Incidence of point and line
13.3. Several lines
      Point and angle of intersection
      Triangle and polygon
13.4. The circle
      Equations of a circle
      Circle and line
      Two circles
13.5. The conics
      Conics as intersections of a circular cone with planes
      Equations of the parabola
      Equations of the ellipse
      Equations of the hyperbola
      Conic and line
      Normal and polar of a conic
      Two conics
      Common vertex equation of the conics
      Polar equations of the conics
      Discussion of the general equation of the second degree


The main idea of analytic geometry is that geometric investigations can be carried out by means 
of algebraic calculations. This method has proved extraordinarily fruitful. The fusion of geometric 
and algebraic thinking, together with functional thinking, provides an important help to man’s 
understanding of the exploration and comprehension of objective reality. At the same time the 
method is particularly attractive mathematically and gives rise to important elements in the training 
of the mind. The birth of the method of analytic geometry, and the consequent growth of the methods 
of the differential and integral calculus, characterize the transition to modern mathematics. The 
year of birth can be taken to be 1637, when DESCARTES (1596-1650) published his Discours 
de la Méthode anonymously, to avoid a dispute with the church. In this work, which is also significant
for the history of philosophy, the third part, entitled La Géométrie, systematically expounds
the fundamental principle of analytic geometry. Shortly before, FERMAT (1601-1665) had also 
worked out the method of analytic geometry, but his treatise Ad locos planos et solidos isagoge 
(Introduction to planar and spatial geometric loci) was not published until 1679. Since the ‘Geo- 
metry’ of Descartes had also the better notation, the development of the method of analytic geometry 
is usually attributed to Descartes. Its present form was, however, developed a long time after
Descartes, particularly by EULER (1707-1783). For example, DESCARTES did not use two axes, and
only since the time of Euler, to whom a large part of the modern notation is due, have far- 
reaching conclusions been drawn from the equations of geometric loci, while DESCARTES and 
FERMAT generally regarded their investigations as ending when the equation had been set up. 


13.1. Plane coordinate systems 


The fusion of geometric and algebraic thinking is attained by regarding geometric figures as 
sets of points and by assigning numerical quantities to each point, which distinguish it from other 
points. A curve or a line is then the carrier of a totality of points whose numerical quantities satisfy
certain relations, which are called the equations of the figure, for example, the equation of an ellipse
or a line. The graph of a linear equation in two variables is always a line, and that of a quadratic 
equation is a conic. The foundation of this construction of analytic geometry is the correspondence 
between points and numbers, which must be one-to-one. On a line, or more generally a curve, one 
number is sufficient to fix a point uniquely, on a plane or a surface a number pair, in space a number 
triple; conversely, a point on a curve uniquely determines one number, on a surface a number pair 
and in space a number triple. These numbers are called coordinates. They can be obtained in dif- 
ferent ways; coordinate systems are the means of fixing them. 


The number line. On a line the position of any point P is uniquely determined if a zero point O
and a unit segment u = O1 are given on it. The integral multiples of the unit segment are obtained
by repeatedly laying off u either from O beyond 1 in the positive direction or from 1 beyond O in
the negative direction (Fig.). The end-points of the multiples correspond to the whole numbers,
positive or negative. The point P is either an end-point, or it lies between two of the end-points,
say n and n + 1; there is always a real number x such that |x| times u is the distance |OP| of the
point P from the origin O. One has n ≤ x < n + 1 for positive x, and −n′ ≥ x > −n′ − 1 for
negative x. The number x is the coordinate of the point P. Conversely, any real number x uniquely
determines a point P of the number line by means of the equation m(OP) = xu, where
m(OP) = |OP| if x > 0 and m(OP) = −|OP| if x < 0.

13.1-1 The number line


Parallel coordinate systems 


Oblique parallel coordinates. To fix the position of a point in a plane two non-parallel number
lines, with origins O and O′ and unit segments u = O1 and u′ = O′1′, are needed, because the plane
has two dimensions. The lines are always arranged so that their zero points coincide, O = O′;
they are called the axes of the coordinate system, and are usually called the x-axis, or axis of
abscissae, and the y-axis, or axis of ordinates (Latin abscindere, to cut off, ordinare, to order). If the
axes enclose an angle α < 180°, then in a right-handed system the notation for the axes is chosen so that
a rotation of the +x-axis in the mathematically positive sense (anticlockwise) through the angle α leads
to the +y-axis; in a left-handed system the opposite sense of rotation holds. The plane is divided by the
coordinate axes into four regions. These quadrants are numbered I, II, III and IV, in the same sense of
rotation as that of the coordinate system. If a point P lies in one of these quadrants, then a line can be
drawn through it parallel to each coordinate axis, meeting the other axis in one point, the x-axis in P′
and the y-axis in P″ (Fig.). The coordinates x and y of these points on the number lines are the
coordinates of the point P. Different points lead to different number pairs (x, y). Conversely, for
any number pair (a, b) there are two points P₁, P₂ on the coordinate axes such that m(OP₁) = au
and m(OP₂) = bu′. The lines through these points P₁, P₂ parallel to the other coordinate axes intersect
in one point P, whose coordinates are a and b. Even if the point P lies on one of the coordinate axes,
a number pair is necessary to determine its position as a point of the plane. It then coincides with
P′ or P″, while the point on the other axis coincides with the origin, and so its coordinate is zero.
Points on the axis of abscissae have coordinates (x, 0), and points on the axis of ordinates have
coordinates (0, y). In the number pair that characterizes the point P, the x-coordinate or abscissa
always appears in the first place, and the y-coordinate or ordinate in the second place. To the origin
there correspond the coordinates (0, 0).
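
How an oblique number pair fixes a point can be illustrated by converting it to rectangular
coordinates that share the x-axis and the origin. The Python sketch below is only an illustration
under stated assumptions: the function names, the parameter names and the default unit segments
u = u′ = 1 are not taken from the text.

```python
import math


def oblique_to_cartesian(x, y, axis_angle_deg, u=1.0, u_prime=1.0):
    """Rectangular coordinates of the point with oblique coordinates (x, y).

    The x-axis is common to both systems; the oblique y-axis makes the angle
    axis_angle_deg (between 0 and 180 degrees) with it. The point is reached by
    going x units u along the x-axis and y units u_prime parallel to the y-axis.
    """
    alpha = math.radians(axis_angle_deg)
    big_x = x * u + y * u_prime * math.cos(alpha)
    big_y = y * u_prime * math.sin(alpha)
    return big_x, big_y


def cartesian_to_oblique(big_x, big_y, axis_angle_deg, u=1.0, u_prime=1.0):
    """Inverse conversion: oblique coordinates of a point given rectangularly."""
    alpha = math.radians(axis_angle_deg)
    y = big_y / (u_prime * math.sin(alpha))
    x = (big_x - y * u_prime * math.cos(alpha)) / u
    return x, y
```

For an axis angle of 90° and equal unit segments the two systems coincide, which is the Cartesian
case.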

In each quadrant the coordinates have definite signs. For the given numbering of the quadrants,
the corresponding signs are shown in the adjoining table.

    Point lies in quadrant     I     II    III    IV
    x                          +     −     −      +
    y                          +     +     −      −

13.1-2 Oblique coordinate system

Rectangular parallel coordinates, Cartesian coordinates. In a Cartesian coordinate system the
coordinate axes are perpendicular to one another, and the same unit of length is chosen on the two
axes. Also, the two parallels through a point P by means of which the corresponding coordinates
are found are perpendicular to one another and to the coordinate axes. This rectangular coordinate
system is used in the majority of cases, and as a rule a right-handed system, but in surveying a
left-handed system is used (see Chapter 11.).


Example: In the figure the point P₁ has the coordinates x₁ = +2 and y₁ = +3. If a point
P₂ is to be drawn with the coordinates x₂ = −3/2 and y₂ = +5/4, it can only have the given
position. The origin has the coordinates (0, 0).

13.1-3 Rectangular parallel coordinates

13.1-4 Polar coordinates of the point P: φ = 45° and ρ = 4


Polar coordinates 


The polar coordinate system. A polar coordinate system is determined by a fixed point O, the 
origin or pole, and a zero direction or axis through it, on which positive lengths can be laid off and 
measured, as on a number line. An arbitrary point P of the plane can then be fixed firstly by the
angle φ through which the axis must be rotated in the mathematically positive sense so as to pass
through P, and secondly by the positive distance ρ of the point P from the pole, measured along the
number line (Fig.). The angle φ is called the argument, phase, or amplitude; it can take values from
0° up to 360°; the length |OP| = ρ is called the radius; it can only take non-negative values. For the
point O itself, ρ = 0 and φ is indeterminate.


Changing from one coordinate system to another 


The same geometric figure, say a circle, can be described in two different coordinate systems
C₁ and C₂, for example, in a Cartesian system and a polar coordinate system. For the same geometric
properties two equations f₁(x, y) = 0 and f₂(ξ, η) = 0 are found. Instead of deriving each of the
two functions from the geometric data, one can calculate one function from the other by means
of the properties of the coordinate systems and their relative position. One then talks of a
transformation from one system to the other. The equations of the transformation must obviously state
how to calculate the coordinates (ξ, η) of a point in C₂ from the coordinates (x, y) of the same
point in C₁, and conversely. If the equations of transformation are x = t₁(ξ, η), y = t₂(ξ, η), and
their inverses ξ = τ₁(x, y), η = τ₂(x, y), then the equations f₁(x, y) = 0 and f₂(ξ, η) = 0 that describe
the geometric figure are transformed into one another.


Transformation from polar coordinates to Cartesian and vice versa. For simplicity it may be assumed
that the pole of the polar coordinate system coincides with the origin of the Cartesian coordinate
system, and its axis with the x-axis. Then if a point P has polar coordinates (ρ, φ) and Cartesian
coordinates (x, y), the trigonometric relations give x = ρ cos φ, y = ρ sin φ, and it can be seen from
the unit circle, in particular, that all possible combinations of signs of x and y can be obtained in
the various quadrants by making φ take all values from zero to 2π (Fig.).

13.1-5 Relation between Cartesian and polar coordinates


Example 1: If P₁ has rectangular parallel coordinates (3, 4), then ρ₁ = √(3² + 4²) = √25 = 5;
cos φ₁ = 3/5 = 0.6, sin φ₁ = 4/5 = 0.8; from the trigonometric tables, φ₁ = 53.13°. Hence P₁
has polar coordinates ρ₁ = 5 and φ₁ = 53.13°.

Example 2: P₂ has polar coordinates ρ₂ = 3, φ₂ = 120°. Then the rectangular parallel
coordinates of P₂ are x₂ = 3 cos 120°, y₂ = 3 sin 120°, and so from the values of the trigonometric
tables, x₂ = −3/2, y₂ = (3/2) √3.

Example 3: In polar coordinates the equation of a circle with centre at the pole and radius r
is given by ρ = r, 0 ≤ φ < 2π. Without any further geometric considerations, the equation of
the circle in Cartesian coordinates can be obtained by substitution from the equations of
transformation: ρ = √(x² + y²) = r, or x² + y² = r².
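
The two directions of the transformation can be checked numerically. The following Python sketch
is an illustration with assumed function names; it uses the standard library function atan2 to
select the correct quadrant instead of reading φ from tables.

```python
import math


def to_polar(x, y):
    """Polar coordinates (rho, phi in degrees, 0 <= phi < 360) of the point (x, y).

    For the pole itself phi is indeterminate; this sketch then returns phi = 0.
    """
    rho = math.hypot(x, y)
    phi = math.degrees(math.atan2(y, x)) % 360.0
    return rho, phi


def to_cartesian(rho, phi_deg):
    """Cartesian coordinates from x = rho cos(phi), y = rho sin(phi)."""
    phi = math.radians(phi_deg)
    return rho * math.cos(phi), rho * math.sin(phi)


rho1, phi1 = to_polar(3.0, 4.0)        # Example 1
x2, y2 = to_cartesian(3.0, 120.0)      # Example 2
```

The results agree with Examples 1 and 2: ρ₁ = 5, φ₁ ≈ 53.13°, and x₂ = −3/2, y₂ = (3/2) √3.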

Parallel displacement of a system of rectangular parallel coordinates. Two different Cartesian
coordinate systems C₁ with coordinates x and y and C₂ with coordinates ξ and η are related in
such a way that corresponding axes are parallel to one another and the origin O₂ of C₂ has
coordinates (a, b) in C₁ (Fig.). The same point P then has coordinates (x, y) in C₁ and (ξ, η) in C₂,
where x = a + ξ, y = b + η, or ξ = x − a, η = y − b.

These transformation formulae always hold, irrespective of the quadrant in which the origin of
the new system happens to lie; for example, if a and b are both positive, the displacement is upwards
and to the right; if a and b are both negative, it is downwards and to the left.

13.1-6 Two parallel rectangular coordinate systems displaced relative to one another

13.1-7 Transformation of the equation of a line


Example 1: The (x, y)-system is to be transformed so that the origin of the (ξ, η)-system parallel
to it is at the point (4, −2.5), that is, a = 4, b = −2.5. The transformation equations are x = 4 + ξ,
y = −2.5 + η.

Example 2: In the (x, y)-coordinate system there is a curve (a straight line) whose equation
is y = 2x − 1.2 (Fig.). If one puts x = 0, one obtains its point of intersection with the y-axis,
whose coordinates are (0, −1.2). Let this point be the origin of a (ξ, η)-coordinate system, whose
axes are parallel to those of the (x, y)-system. Since a = 0 and b = −1.2, the transformation
equations are x = ξ, y = η − 1.2. They hold for every point of the plane. In the (ξ, η)-system the
curve (line) therefore has the equation η − 1.2 = 2ξ − 1.2, or η = 2ξ.

It can be seen that in this case the form of the equation is simplified by the transformation.
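
The effect of the parallel displacement in Example 2 can be verified pointwise. In the Python
sketch below the function names are illustrative assumptions; (a, b) denotes the new origin
expressed in the old system.

```python
def to_new_system(x, y, a, b):
    """Coordinates (xi, eta) in the displaced system: xi = x - a, eta = y - b."""
    return x - a, y - b


def to_old_system(xi, eta, a, b):
    """Coordinates (x, y) in the original system: x = a + xi, y = b + eta."""
    return a + xi, b + eta


def line_y(x):
    """The line y = 2x - 1.2 of Example 2."""
    return 2.0 * x - 1.2


# Every point of the line satisfies eta = 2 xi in the system with origin (0, -1.2)
samples = [to_new_system(x, line_y(x), 0.0, -1.2) for x in (-2.0, 0.0, 0.5, 3.0)]
```

Each pair (ξ, η) in samples satisfies η = 2ξ up to rounding, as found in Example 2.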


Rotation of a system of rectangular parallel coordinates. Suppose that the (x, y)-system of
rectangular parallel coordinates is rotated (keeping the origin fixed) in the mathematically positive
sense through an angle φ into a (ξ, η)-system. Let a point P have the coordinates (x, y) in the old
system, and (ξ, η) in the new system (Fig.).

For any angle φ the projection of the ξ-coordinate on the x-axis has the value OC = ξ cos φ. The
η-axis is inclined to the x-axis at an angle φ + π/2, and so the projection of the η-coordinate on
the x-axis is CA = η cos (φ + π/2) = −η sin φ, by a theorem of trigonometry. Hence, in the sense
of vector addition: x = OA = OC + CA = ξ cos φ − η sin φ.

13.1-8 Rotation of the coordinate system

The inclination of the ξ-axis to the y-axis is −(π/2 − φ) for any angle φ; that of the η-axis to
the y-axis is φ. Hence, for the projection of the ξ-coordinate on the y-axis, OD = ξ cos (φ − π/2)
= ξ sin φ and for the projection of the η-coordinate on the y-axis, DB = η cos φ; hence the
y-coordinate satisfies the transformation equation y = OB = OD + DB = ξ sin φ + η cos φ.

The formulae for ξ and η are obtained by rotating the (ξ, η)-system through an angle −φ, that is,
ξ = x cos φ + y sin φ, η = −x sin φ + y cos φ.


Example: What are the coordinates of the point P(2, 4) in the coordinate system resulting from
a rotation through 30°? The old coordinates are x = 2, y = 4; since sin φ = 1/2, cos φ = (1/2) √3,
ξ = 2 · (1/2) √3 + 4 · 1/2 = 2 + √3, η = −2 · 1/2 + 4 · (1/2) √3 = −1 + 2 √3.

Hint. By parallel displacement of the coordinate system the absolute term of the equation of a
curve can be eliminated, as in the example above. By means of a rotation it is always possible to
remove the mixed term xy from an equation that is quadratic in the variables x and y (see
Discussion of the general equation of the second degree). Here one is using the transformation to
principal axes. Suppose, for example, that the equation x² + xy + y² − 3 = 0 is given. Then, if the
equations of the rotation through 45°,

x = ξ cos 45° − η sin 45° = (ξ − η) · (1/2) √2,
y = ξ sin 45° + η cos 45° = (ξ + η) · (1/2) √2

are substituted into the equation, the new equation is

1/2 (ξ² − 2ξη + η²) + 1/2 (ξ² − η²) + 1/2 (ξ² + 2ξη + η²) − 3 = 0, or 3ξ² + η² − 6 = 0.
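
Both the rotation example and the removal of the mixed term can be checked numerically. The
Python sketch below is an illustration with assumed function names; it implements the rotation
formulae x = ξ cos φ − η sin φ, y = ξ sin φ + η cos φ and their inverses.

```python
import math


def new_coordinates(x, y, angle_deg):
    """Coordinates (xi, eta) of the point (x, y) after rotating the axes by angle_deg."""
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    return x * c + y * s, -x * s + y * c


def old_coordinates(xi, eta, angle_deg):
    """Coordinates (x, y) in the original system from (xi, eta) in the rotated one."""
    c, s = math.cos(math.radians(angle_deg)), math.sin(math.radians(angle_deg))
    return xi * c - eta * s, xi * s + eta * c


def conic(x, y):
    """Left-hand side of x^2 + xy + y^2 - 3 = 0."""
    return x * x + x * y + y * y - 3.0


def conic_rotated(xi, eta):
    """Left-hand side of 3 xi^2 + eta^2 - 6 = 0 (after rotation through 45 degrees)."""
    return 3.0 * xi * xi + eta * eta - 6.0


xi, eta = new_coordinates(2.0, 4.0, 30.0)   # the point P(2, 4) of the example

# The rotated expression equals twice the original one at corresponding points
checks = []
for p, q in [(0.0, 0.0), (1.0, -2.0), (math.sqrt(2.0), 0.0), (0.3, 2.1)]:
    x, y = old_coordinates(p, q, 45.0)
    checks.append(2.0 * conic(x, y) - conic_rotated(p, q))
```

Here xi and eta reproduce 2 + √3 and −1 + 2 √3, and every entry of checks vanishes up to
rounding, so the two equations describe the same curve.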


13.2. Point and line 


Segment and ratio of division 


Length of a segment. The length of a segment, that is, the 
distance between its two end-points, is measured in pure geome- 
try by aruler, but in analytic geometry it is calculated from the 
coordinates of its end-points. If the end-points P, and P, of the 
segment (Fig.) have rectangular parallel coordinates P;(x, , y;) 
and P2(x2, y2), then the length of the segment P,P, is found 
by means of the theorem of Pythagoras. 


13.2-1 Distance between two points. 
Length of a segment 


Examples: 1. Given $P_1(1, 8)$, $P_2(4, 2)$: $|P_1P_2| = \sqrt{(4 - 1)^2 + (2 - 8)^2} = \sqrt{3^2 + (-6)^2} = \sqrt{45} \approx 6.71$.
2. Given $P_3(-3, -2)$, $P_4(-6, -1)$: $|P_3P_4| = \sqrt{(-6 + 3)^2 + (-1 + 2)^2} = \sqrt{10} \approx 3.16$.
3. By how much is the direct route from $P_1$ to $P_4$ shorter than the detour from $P_1$ through $P_2$ and $P_3$ to $P_4$? One finds that $|P_1P_4| = \sqrt{130} \approx 11.40$ and
$|P_1P_2| + |P_2P_3| + |P_3P_4| = \sqrt{45} + \sqrt{65} + \sqrt{10} \approx 6.71 + 8.06 + 3.16 = 17.93$,
that is, the difference is $17.93 - 11.40 = 6.53$.
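The three examples can be reproduced with a few lines of Python. This is a sketch; the helper name `distance` is an illustrative choice, not part of the text.

```python
import math

def distance(p, q):
    """Length of the segment pq by the theorem of Pythagoras."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

P1, P2, P3, P4 = (1, 8), (4, 2), (-3, -2), (-6, -1)

direct = distance(P1, P4)
detour = distance(P1, P2) + distance(P2, P3) + distance(P3, P4)

print(round(distance(P1, P2), 2), round(distance(P3, P4), 2))
print(round(direct, 2), round(detour, 2), round(detour - direct, 2))
```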


Ratio of division of a segment. If a point $P$ lies on the segment $P_1P_2$ and does not coincide with $P_2$, then $P$ divides the segment $P_1P_2$ in the ratio $P_1P : PP_2 = \lambda$; $\lambda$ is called the ratio of division of the point $P$ with respect to the segment $P_1P_2$. Here $P_1P$ denotes not only the length $|P_1P|$ of the segment, but also the direction from $P_1$ to $P$, that is, $P_1P = -PP_1$. The same holds for $PP_2$. Therefore, if $P$ lies between $P_1$ and $P_2$, then $P_1P$ and $PP_2$ have the same direction and hence the same sign, and so $\lambda$ is positive. The signs of the two segments depend, as for any number line, on the direction chosen as positive on the line, that is, on the orientation of the line. If the orientation of the line is changed, then the segments $P_1P$ and $PP_2$ have reversed, but still equal signs. The orientation of the line does not change the ratio of division $\lambda$ and can therefore be ignored. If $P$ lies outside the segment $P_1P_2$, then $P_1P$ and $PP_2$ have opposite signs, and so $\lambda$ is negative.




A more precise investigation shows that each position of the point $P$ on the line can be characterized by one value of the ratio of division $\lambda$ (Fig.). In fact, it can be seen that $\lambda$ increases monotonically as $P$ moves along the segment $P_1P_2$ from $P_1$, since in $\lambda = P_1P/PP_2$ the numerator always increases and the denominator decreases. For $P = P_1$, $\lambda = 0$; for the mid-point $M$ of the segment $P_1P_2$, $\lambda_M = +1$; as $P$ approaches the point $P_2$ arbitrarily closely, $\lambda$ increases beyond any finite value.

If $P$ is an outer point of division of the segment $P_1P_2$, then the difference of the lengths of $P_1P$ and $PP_2$ is always $|P_1P_2|$, and it has less influence on the ratio $P_1P : PP_2 = \lambda$ the greater the segments $P_1P$ and $PP_2$ are, that is, if $P$ is sufficiently distant from the segment $P_1P_2$, then $\lambda$ differs arbitrarily little from $-1$. It makes no difference whether $P$ moves away in the direction $P_1P_2$ or in the direction $P_2P_1$. One says briefly: at the improper or infinitely distant point $P_\infty$ of the line the ratio of division has the value $\lambda = -1$. If $P$ approaches the point $P_1$ in the direction of $P_1P_2$, then $|PP_2| = |PP_1| + |P_1P_2| > |PP_1|$, and so the absolute value of the ratio of division is always less than 1, that is, $\lambda$ increases from $-1$ to $0$. If $P$ approaches the point $P_2$ in the direction of $P_2P_1$, then $|P_1P| = |P_1P_2| + |P_2P| > |P_2P|$, that is, $|\lambda| = |P_1P| : |PP_2| > 1$, and so $|PP_2| \to 0$ or $|\lambda| \to \infty$ as $P \to P_2$; $\lambda$ therefore decreases monotonically from $-1$ to $-\infty$ as $P$ moves, as an outer point of division, towards $P_2$ in the direction $P_2P_1$. For $P = P_2$, $\lambda$ is not defined, but $\lambda$ converges to $+\infty$ when $P$ is an inner point and to $-\infty$ when $P$ is an outer point.
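The behaviour of $\lambda$ along the line can be tabulated. The sketch below assumes a parametrization that is not used in the text: $P = P_1 + t\,(P_2 - P_1)$, so that the directed segments are $P_1P = t\,|P_1P_2|$ and $PP_2 = (1 - t)\,|P_1P_2|$ and hence $\lambda = t/(1 - t)$. The name `division_ratio` is illustrative.

```python
def division_ratio(t):
    """Ratio of division lambda = P1P : PP2 for the point P = P1 + t*(P2 - P1).

    t = 0 is P1, t = 1 is P2; 0 < t < 1 gives the inner points of division.
    """
    if t == 1:
        raise ValueError("the ratio of division is not defined for P = P2")
    return t / (1 - t)

# Far outside, just before P1, at P1, at the mid-point, close to P2 from
# both sides, and far outside in the other direction.
for t in (-1000.0, -1.0, 0.0, 0.5, 0.999, 1.001, 2.0, 1000.0):
    print(t, division_ratio(t))
```

Under this assumption the printed values show $\lambda$ close to $-1$ for distant points, rising through $0$ at $P_1$ and $+1$ at the mid-point, and large in absolute value on either side of $P_2$.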


13.2-2 Value of the ratio of division $\lambda$ as $P$ describes a line

13.2-3 Ratio of division and the equation of a line


Equations of a line 


Direction of a line. An oriented line $l$ makes with the $+x$-axis an angle $\angle(x, l) = \varphi$, that is, the $+x$-axis moves into the direction of the line by means of a rotation about its point of intersection $S$ with the line $l$ in the mathematically positive sense (in a right-handed system, the opposite sense in a left-handed system). The line makes an angle $\angle(y, l) = \varphi - \pi/2$ with the $+y$-axis. Suppose that two points $P_1$ and $P_2$ are given on the line, so that $P_1P_2$ is positive. Lines are drawn through each of these points parallel to each coordinate axis (Fig.), cutting the axes in $P_{1x}$, $P_{2x}$ and $P_{1y}$, $P_{2y}$. The projections of the segment $P_1P_2$ on the axes are then given for any angle $\varphi$ by the formulae $P_{1x}P_{2x} = P_1P_2\cos\varphi$ and $P_{1y}P_{2y} = P_1P_2\cos(\varphi - \pi/2) = P_1P_2\sin\varphi$. If $(x_1, y_1)$ and $(x_2, y_2)$ are the coordinates of $P_1$ and $P_2$, then $x_1 + P_{1x}P_{2x} = x_2$, $x_2 - x_1 = P_{1x}P_{2x}$ and $y_1 + P_{1y}P_{2y} = y_2$, $y_2 - y_1 = P_{1y}P_{2y}$; since $|P_1P_2| = +\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}$,

$$\cos\varphi = \frac{x_2 - x_1}{\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}}, \qquad \sin\varphi = \frac{y_2 - y_1}{\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}}.$$

The angle is thus determined by the coordinates of the points $P_1$ and $P_2$; it can take values between $0$ and $2\pi$. However, in all cases where the orientation of the line $l$ need not be taken into account, it is sufficient to determine the angle $\varphi$ from the value of its tangent:

$$m = \tan\varphi = (y_2 - y_1)/(x_2 - x_1), \qquad \varphi = \tan^{-1}[(y_2 - y_1)/(x_2 - x_1)].$$

It is best to take the principal value of the inverse tangent function, that is, the value of $\varphi$ in the interval $-\pi/2 < \varphi < +\pi/2$. The value $m$ is called the gradient or slope of the line $l$.

Equation of a line. If a point $P$ divides the segment $P_1P_2$ in the ratio $\lambda = P_1P : PP_2$, then the equations found above for the segments $P_1P$ and $PP_2$ hold, that is,
$x - x_1 = P_1P\cos\varphi$ and $y - y_1 = P_1P\sin\varphi$, $x_2 - x = PP_2\cos\varphi$ and $y_2 - y = PP_2\sin\varphi$.




Hence, for the ratio of division,

$$\lambda = P_1P : PP_2 = \frac{x - x_1}{\cos\varphi} : \frac{x_2 - x}{\cos\varphi} = (x - x_1) : (x_2 - x)$$

or

$$\lambda = P_1P : PP_2 = \frac{y - y_1}{\sin\varphi} : \frac{y_2 - y}{\sin\varphi} = (y - y_1) : (y_2 - y).$$

If $\lambda$ takes all values between $-\infty$ and $+\infty$, then the point $P$ describes the line $l$; if the coordinates $(x, y)$ of a point $P$ satisfy the equation $(x - x_1) : (x_2 - x) = (y - y_1) : (y_2 - y)$, then the point $P$ lies on the line. The coordinates $(x, y)$ are often called the current coordinates. By corresponding addition, the equation $a : b = c : d$ can be put into the form $a : (a + b) = c : (c + d)$; hence the equation of the line is $(x - x_1) : (x_2 - x_1) = (y - y_1) : (y_2 - y_1)$. By interchanging the inner terms, one obtains the two point form of the equation:

$$\frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1}.$$

This emphasizes that two points $P_1$ and $P_2$ completely determine the line in the coordinate system. In the point-direction form, a point $P_1$ and the gradient $m = (y_2 - y_1) : (x_2 - x_1) = (y - y_1) : (x - x_1)$ determine the position of the line.

The right-hand side $(y_2 - y_1)/(x_2 - x_1)$ of the two point form corresponds to the meaning of $\tan\varphi$ and can take all values of this trigonometric function, according to the relative position of $P_1$ and $P_2$. Of particular interest are the special cases of the quotient for $(y_2 - y_1) = 0$ and $(x_2 - x_1) = 0$. In the first case, since $\tan\varphi = 0$, the line determined by the points $P_1$ and $P_2$ is parallel to the x-axis, and $\varphi = 0^\circ$ or $\varphi = 180^\circ$. From the line equation it follows that $(y - y_1)/(x - x_1) = 0$, $y - y_1 = 0$, $y = y_1$; that is, for any value of the x-coordinate of the point $P$ that describes the line, its y-coordinate has the constant value $y = y_1$. From the line equation it again follows that the line is parallel to the x-axis. In the second case it follows from the equation $\tan\varphi = \infty$ that $\varphi = 90^\circ$ or $\varphi = 270^\circ$ and similarly from the identically transformed equation $(x - x_1)/(y - y_1) = (x_2 - x_1)/(y_2 - y_1) = 0$ or $x - x_1 = 0$; for any y-coordinate of the current point $P$ its x-coordinate must always have the constant value $x = x_1$; that is, the line runs parallel to the y-axis. For the x- and y-axes themselves, since $y_1 = 0$ and $x_1 = 0$, respectively, the equations are $y = 0$ and $x = 0$.

From the equations $\lambda = (x - x_1)/(x_2 - x)$ and $\lambda = (y - y_1)/(y_2 - y)$ the coordinates $(x, y)$ of the point $P$ that divides the segment $P_1P_2$ in the ratio $\lambda$ can be calculated; for example, $\lambda x_2 - \lambda x = x - x_1$ or $x = (x_1 + \lambda x_2)/(1 + \lambda)$, and in the same way $y = (y_1 + \lambda y_2)/(1 + \lambda)$. For the mid-point $M$ of the segment $P_1P_2$ one has $x = (x_1 + x_2)/2$, $y = (y_1 + y_2)/2$, since $\lambda = +1$.


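Example 1: Direction angle $\varphi$ of the segment $P_1P_2$.
a) Given $P_1(2, 3)$ and $P_2(7, 8)$:
$\cos\varphi = (7 - 2)/\sqrt{(7 - 2)^2 + (8 - 3)^2} = 5/\sqrt{5^2 + 5^2} = 1/\sqrt2$;
$\sin\varphi = (8 - 3)/(5\sqrt2) = 1/\sqrt2$. Direction angle $\varphi = 45^\circ$.
b) Given $P_1(-1, -2)$ and $P_2(0, 8)$:
$\cos\varphi = (0 + 1)/\sqrt{(0 + 1)^2 + (8 + 2)^2} = 1/\sqrt{101}$; $\sin\varphi = (8 + 2)/\sqrt{101} = 10/\sqrt{101}$;
$\tan\varphi = 10$. Direction angle $\varphi \approx 84.29^\circ$.
c) Given $P_1(2, -3)$ and $P_2(-3, 5)$:
$\cos\varphi = (-3 - 2)/\sqrt{(-3 - 2)^2 + (5 + 3)^2} = -5/\sqrt{89}$;
$\sin\varphi = 8/\sqrt{89}$; $\tan\varphi = -8/5 = -1.6$. Second quadrant: $\varphi = 180^\circ - 58^\circ = 122^\circ$.

Example 2: Find the coordinates of the point $T$ that divides the segment joining $P_1(3, -2)$ and $P_2(-5, 4)$ in such a way that $P_1T : TP_2 = 2 : 3$. Since $\lambda = 2/3$, it follows that

$$x_T = \frac{x_1 + \lambda x_2}{1 + \lambda} = \frac{3 + \tfrac23(-5)}{1 + \tfrac23} = \frac{9 - 10}{5} = -\tfrac15, \qquad y_T = \frac{y_1 + \lambda y_2}{1 + \lambda} = \frac{-2 + \tfrac23\cdot 4}{1 + \tfrac23} = \frac{-6 + 8}{5} = \tfrac25;$$

$T(-\tfrac15, \tfrac25)$.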


13.2-4 The line through $P_1(-4, -2)$ and $P_2(5, -4)$

[...] is bisected ($\lambda = +1$) at the point [...]

Example 3: Find the equation of the line joining the points $P_1(-4, -2)$ and $P_2(5, -4)$ (Fig.). By substituting the given values $x_1 = -4$, $y_1 = -2$, $x_2 = 5$, $y_2 = -4$, one obtains the equation $(y + 2)/(x + 4) = (-4 + 2)/(5 + 4)$, which can be simplified to $y = -(2/9)x - 26/9$. The gradient $m$ has the value $m = -2/9 = -0.2222\ldots = \tan\varphi$. The value of $\varphi$ is $\varphi_1 = -12.53^\circ$ or $\varphi_2 = 180^\circ - 12.53^\circ = 167.47^\circ$; the principal value is $-12.53^\circ$. A point $P_3$ on the line at a distance $3|P_1P_2|$ from $P_2$ in the direction $P_1P_2$ divides the segment $P_1P_2$ in the ratio $\lambda = P_1P_3/P_3P_2 = 4|P_1P_2|/(-3|P_1P_2|) = -4/3$. It therefore has the coordinates

$$x_3 = \frac{x_1 - \tfrac43 x_2}{1 - \tfrac43} = \frac{-4 - \tfrac43\cdot 5}{-\tfrac13} = 12 + 20 = 32$$

and

$$y_3 = \frac{y_1 - \tfrac43 y_2}{1 - \tfrac43} = \frac{-2 - \tfrac43(-4)}{-\tfrac13} = -16 + 6 = -10.$$

Example 4: To find the equation of the line that passes through the point $P_1(3, 4)$ and has the direction angle $\alpha = 60^\circ$. Since $x_1 = 3$, $y_1 = 4$, $\tan\alpha = m = \tan 60^\circ = \sqrt3$, the equation of the line is $y - 4 = \sqrt3(x - 3)$ or $y = x\sqrt3 + 4 - 3\sqrt3$.
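Examples 1 to 3 can be checked with the sketch below. It is written in Python with illustrative function names; `math.atan2` is used so that the quadrant of the direction angle follows from the signs of $x_2 - x_1$ and $y_2 - y_1$, which is an implementation choice and not the procedure of the text.

```python
import math

def direction_angle_deg(p1, p2):
    """Direction angle of the oriented segment p1 -> p2, in degrees in [0, 360)."""
    return math.degrees(math.atan2(p2[1] - p1[1], p2[0] - p1[0])) % 360.0

def divide(p1, p2, lam):
    """Point dividing the segment p1 p2 in the ratio lam = P1P : PP2 (lam != -1)."""
    if lam == -1:
        raise ValueError("lam = -1 corresponds to the improper point of the line")
    return ((p1[0] + lam * p2[0]) / (1 + lam), (p1[1] + lam * p2[1]) / (1 + lam))

def slope_intercept(p1, p2):
    """(m, c) of y = m*x + c through p1 and p2; the line must not be parallel to the y-axis."""
    m = (p2[1] - p1[1]) / (p2[0] - p1[0])
    return m, p1[1] - m * p1[0]

print(direction_angle_deg((2, 3), (7, 8)))     # Example 1a
print(direction_angle_deg((2, -3), (-3, 5)))   # Example 1c
print(divide((3, -2), (-5, 4), 2 / 3))         # Example 2
print(slope_intercept((-4, -2), (5, -4)))      # Example 3, compare -2/9 and -26/9
print(divide((-4, -2), (5, -4), -4 / 3))       # Example 3, outer point of division
```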


Cartesian normal form of the equation of a line. In the last example the equation of the line was simplified to the form $y = ax + b$. This can be done generally for the equation of any line, provided that it is not parallel to the y-axis. From the point-direction form, for example, one obtains

$$y - y_1 = m(x - x_1) = mx - mx_1, \quad\text{or}\quad y = mx + c, \quad\text{where } c = y_1 - mx_1.$$

As was shown above, lines parallel to the y-axis have the equation $x = x_1$. Since $m = \tan\varphi$, $mx_1$ is obviously the difference of the ordinates of the given point $P_1$ and the point of intersection $S$ of the line with the y-axis; $c = (y_1 - mx_1)$ is therefore the ordinate of $S$. This is confirmed by putting $x = 0$ into the normal form, which gives $y_S = c$.

If one takes the line $y = mx$, and gives $x$ the value 1 (Fig.), then $y = m$; the line $y = mx + c$ is then obtained from this line by a parallel displacement of $c$ in the direction of the y-axis. Hence, if one lays off a length $m$ from the point $(1, 0)$ along the line parallel to the y-axis at a distance $+1$, then the line from the origin to the end-point of this segment is the line $y = mx$. The line parallel to it through the end-point of the segment of length $c$ laid off from the origin along the y-axis is the required line (Fig.).


13.2-5 Derivation of the Cartesian nor- 
mal form from the point-direction form 


13.2-6 Three examples of the normal form y = mx +c 


Intercept form of the equation of a line. A line that does not pass through the origin and is not parallel to either coordinate axis can be fixed in the coordinate system if the intercepts $a$ and $b$ that it cuts on the axes are given. If it cuts the axes in the points $P_1(a, 0)$ and $P_2(0, b)$, then the two-point form of the equation $(y - 0)/(x - a) = (b - 0)/(0 - a)$ can be expressed as $y/b = [x/(-a) + 1]$ or $x/a + y/b = 1$ (Fig.).

Examples: 1. The intercepts are $a = 4$, $b = -2$. The equation of the line is $x/4 - y/2 = 1$.

2. The general intercept equation can also be put into the normal form. From the equation $x/a + y/b = 1$ one obtains $y = -(b/a)x + b$, that is, $m = -b/a$, $c = b$.

3. At what points does the line $y = -4x/3 + 8$ cut the axes? From the formula of Example 2 it follows immediately that $b = 8$, and therefore $a = 6$.
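The conversion between the intercept form and the Cartesian normal form in Examples 2 and 3 amounts to two one-line formulas. A sketch in Python, with illustrative function names:

```python
def intercepts_to_normal(a, b):
    """x/a + y/b = 1  ->  (m, c) with y = m*x + c; requires a != 0."""
    return -b / a, b

def normal_to_intercepts(m, c):
    """y = m*x + c  ->  intercepts (a, b); requires m != 0 and c != 0,
    i.e. a line that is not parallel to the x-axis and misses the origin."""
    return -c / m, c

print(intercepts_to_normal(4, -2))      # Example 1: a = 4, b = -2
print(normal_to_intercepts(-4 / 3, 8))  # Example 3: y = -4x/3 + 8
```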




The Hessian normal form of the equation of a line. This form of the equation is named after Otto Hesse (1811-1874). The $(x, y)$-plane is divided by an oriented line $l$ into two half-planes, of which the one that lies to the left of $l$ as it is described in the sense of the orientation is called positive. There is then a line $n$ through the origin normal to $l$, oriented so that the angle $(l, n)$, in the sense of rotation of the coordinate system, is $+90^\circ$ (Fig.). By means of this normal $n$, the distance $p$ of the line $l$ from $O$ can be determined. If the perpendicular from the origin $O$ to the line $l$ cuts it in $L$, then $OL = p$. In the figure, this distance $p$ is positive; if $O$ were to lie in the positive half-plane, then the distance $OL$ would be negative; if $O$ and $L$ coincide, then $p$ has the value zero.

If the normal, and therefore the distance $p$, makes an angle $\varphi$ with the x-axis, $\angle(x, n) = \varphi$, then the angle $(x, l)$ that the line $l$ makes with the x-axis is obtained by rotating the normal through $-\pi/2$, that is, $\angle(x, l) = \angle(x, n) - \pi/2 = \varphi - \pi/2$. The angle $(y, n)$ that the normal makes with the y-axis is given by $\angle(y, n) = \varphi - \pi/2$. The angle $\varphi = \angle(x, n)$ can take all values between $0$ and $2\pi$. If a point $P$ has the rectangular coordinates $x = OR$, $y = RP$, and the distance $d = QP$ from the line $l$, then there are two vector paths from the origin $O$ to the point $P$, namely $OL + LQ + QP$ and $OR + RP$. Their projections on the normal must be equal in magnitude and sense:

$$p + 0 + d = x\cos\varphi + y\cos(y, n) = x\cos\varphi + y\sin\varphi \quad\text{or}\quad d = x\cos\varphi + y\sin\varphi - p.$$

For points $P$ in the positive half-plane the distance $d$ from the line $l$ is positive, and for points in the negative half-plane the distance is negative.

Points $P$ that lie on the line have the distance $d = 0$ from it; the equation of the line is therefore

$$x\cos\varphi + y\sin\varphi - p = 0.$$

The sign of $p$ in the equation depends on the orientation, as described above. The two parallels to $l$ at the distances $\pm\delta$ have the equations

$$x\cos\varphi + y\sin\varphi - (p \pm \delta) = 0.$$

For $\delta = -p$ one of the parallels passes through the origin, its equation is $x\cos\varphi + y\sin\varphi = 0$ or $y = -x\cot\varphi = x\tan(\varphi - \pi/2) = mx$, hence assumes the Cartesian normal form. If, however, $\delta > p$, then one of the parallels lies on the other side of the origin, and $p' = p - \delta$ takes a negative value.

13.2-8 Hessian normal form of the equation of a line


Example 1: If the line $l$ has the distance $p = 3$ from the origin and if the direction of the normal $n$ is determined by the angle $\varphi = 30^\circ$ (Fig.), then the equation of the line in the Hessian normal form is $x\cos 30^\circ + y\sin 30^\circ - 3 = 0$ or $x\cdot\tfrac12\sqrt3 + y\cdot\tfrac12 - 3 = 0$; in the Cartesian normal form it is $y = -x\sqrt3 + 6$. The distances of the points $P_1(3, 7)$ and $P_2(-1, -3)$ from the line $l$ are $d_1 = 3\cdot\tfrac12\sqrt3 + 7\cdot\tfrac12 - 3 \approx 2.60 + 0.50 = 3.10$ and $d_2 = -\tfrac12\sqrt3 - \tfrac32 - 3 \approx -(0.87 + 4.5) = -5.37$.
The two parallels $p_2$ and $p_3$ at the distances $\delta = \pm 6$ from $l$ have the following equations: $x\cdot\tfrac12\sqrt3 + y\cdot\tfrac12 - 9 = 0$ and $x\cdot\tfrac12\sqrt3 + y\cdot\tfrac12 + 3 = 0$; in the second equation, $p$ has a negative value. The parallel with distance $p_4 > 0$ and normal $n' = -n$ has the equation $-x\cdot\tfrac12\sqrt3 - y\cdot\tfrac12 - 3 = 0$ or $x\cos 210^\circ + y\sin 210^\circ - 3 = 0$.

Example 2: A line $h$ (Fig.) cuts the x- and y-axes in the points $P_1 = (-5, 0)$ and $P_2 = (0, +8)$ and makes an angle $(x, h)$ with the x-axis for which $\tan(x, h) = 8/5 = 1.6$, that is, $\angle(x, h) = 58^\circ$. Since $\angle(x, h) = \varphi - \pi/2$, $\varphi = 58^\circ + 90^\circ = 148^\circ$. In the Hessian normal form $x\cos 148^\circ + y\sin 148^\circ - p = 0$, $p$ is obtained by projecting the segment $OP_1$ or $OP_2$ on the normal $n$:

$p = OP_1\cos 148^\circ = (-5)(-\sin 58^\circ) = 5\cdot 0.8480 = 4.24$
or $p = OP_2\cos(148^\circ - 90^\circ) = 8\cos 58^\circ = 8\cdot 0.5299 \approx 4.24$.

The Hessian normal form is therefore $-x(0.85) + y(0.53) - 4.24 = 0$.
The distance $d$ of the point $P_3(6, 5)$ from the line $h$ is given by $d = -6\times 0.85 + 5\times 0.53 - 4.24 = -5.09 + 2.65 - 4.24 = -6.68$. This distance is larger than $p$ by $p_1 = 6.68 - 4.24 = 2.44$. The parallel $h_1$ to $h$ through $P_3$ therefore has the equation $-x(0.85) + y(0.53) + 2.44 = 0$, and the parallel $h_2$ with $n_2 = -n$ has the equation $x\cos(148^\circ + 180^\circ) + y\sin(148^\circ + 180^\circ) - 2.44 = 0$ or $x(0.85) - y(0.53) - 2.44 = 0$.
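The signed distances in both examples follow directly from $d = x\cos\varphi + y\sin\varphi - p$. The following Python sketch (the function name `hesse_distance` is illustrative) uses the rounded angle $148^\circ$ of Example 2, so its results agree with the text only to the two decimals given there.

```python
import math

def hesse_distance(phi_deg, p, point):
    """Signed distance d = x*cos(phi) + y*sin(phi) - p of a point from the line."""
    phi = math.radians(phi_deg)
    x, y = point
    return x * math.cos(phi) + y * math.sin(phi) - p

# Example 1: phi = 30 degrees, p = 3
print(hesse_distance(30, 3, (3, 7)), hesse_distance(30, 3, (-1, -3)))

# Example 2: phi = 148 degrees, p = 8*cos(58 degrees)
p = 8 * math.cos(math.radians(58))
print(p, hesse_distance(148, p, (6, 5)))
```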




13.2-9 Example 1 of a line in Hessian normal form: $\varphi = 30^\circ$, $p = 3$

13.2-10 Example 2 of a line in Hessian normal form


The general form of the equation of a line. The general equation of a line is $Ax + By + C = 0$, where $A$, $B$, $C$ are arbitrary real numbers, except that $A$ and $B$ are not both zero. Its graph is always a straight line. If either $A$ or $B$ is zero, say $A = 0$, $B \ne 0$, then the equation $By + C = 0$, $y = -C/B$, represents a line parallel to the x-axis at a distance $y = -C/B$; if $B = 0$, $A \ne 0$, the line is parallel to the y-axis at a distance $x = -C/A$. If $A$ and $B$ are both non-zero, then the equation $y = -(A/B)x - (C/B)$ is the Cartesian normal form with gradient $m = -A/B$ and intercept $c = -C/B$ on the y-axis. If $C = 0$ the line passes through the origin.

The half-plane that contains only those points $P(x, y)$ whose coordinates give positive values to the linear function $Ax + By + C = f(x, y)$ is called positive. The equation $y = (8/5)x + 8$ considered in Example 2 corresponds to the linear function $5y - 8x - 40$. For the point $P_3(6, 5)$ it has the value $25 - 48 - 40 = -63$, and so $P_3$ lies in the negative half-plane.

The line is oriented so that it is described in the positive sense when the positive half-plane lies to the left of it. If the equation $Ax + By + C = 0$ is multiplied by $\varepsilon/\sqrt{A^2 + B^2}$, where $\varepsilon = \pm 1$:

$$\frac{\varepsilon A x}{\sqrt{A^2 + B^2}} + \frac{\varepsilon B y}{\sqrt{A^2 + B^2}} + \frac{\varepsilon C}{\sqrt{A^2 + B^2}} = 0,$$

it is thereby normalized, that is, the sum of the squares of the coefficients of $x$ and $y$ is 1:

$$\left(\frac{\varepsilon A}{\sqrt{A^2 + B^2}}\right)^2 + \left(\frac{\varepsilon B}{\sqrt{A^2 + B^2}}\right)^2 = 1.$$

These coefficients can then be interpreted as the values of the cosine and sine of an angle $\varphi$ (Fig.). If one puts

$$\frac{\varepsilon A}{\sqrt{A^2 + B^2}} = \cos\varphi, \qquad \frac{\varepsilon B}{\sqrt{A^2 + B^2}} = \sin\varphi, \qquad \frac{\varepsilon C}{\sqrt{A^2 + B^2}} = -p,$$

then the equation is in the Hessian normal form; only for $\varepsilon = +1$ the distance $d$ of a point $P$ of the positive half-plane is positive.

13.2-11 Normalizing the equation $3y - 2x - 4 = 0$
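The normalization can be written out as a short routine. This is a sketch in Python; the names `normalize` and `signed_distance` and the keyword `eps` (standing for $\varepsilon$) are illustrative assumptions. The second call uses the linear function $5y - 8x - 40$ and the point $P_3(6, 5)$ from the text.

```python
import math

def normalize(A, B, C, eps=1):
    """Return (cos_phi, sin_phi, p) so that x*cos_phi + y*sin_phi - p = 0
    describes the same line as A*x + B*y + C = 0; eps is +1 or -1."""
    r = math.hypot(A, B)
    if r == 0:
        raise ValueError("A and B must not both be zero")
    return eps * A / r, eps * B / r, -eps * C / r

def signed_distance(A, B, C, point, eps=1):
    """Distance of a point from the line, signed according to the normalization."""
    cos_phi, sin_phi, p = normalize(A, B, C, eps)
    return point[0] * cos_phi + point[1] * sin_phi - p

print(normalize(-2, 3, -4))                  # the line 3y - 2x - 4 = 0
print(signed_distance(-8, 5, -40, (6, 5)))   # the line 5y - 8x - 40 = 0
```

With $\varepsilon = +1$ the second value equals $-63/\sqrt{89} \approx -6.68$, the same distance that was obtained for $P_3$ from the Hessian normal form.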




Incidence of point and line 


One speaks of the incidence between a point and a line if the point lies on the line or the line passes through the point. How can one establish incidence analytically? In the equation of a line, for example, $y = 2x - 7$, $x$ and $y$ are the coordinates of an arbitrary point $P(x, y)$ lying on the line. If $P_1(4, 1)$ is one of these points, then the equation must be satisfied in particular for $x = x_1$, $y = y_1$. The coordinates $x_1$ and $y_1$ are said to satisfy the equation of the line $y = 2x - 7$; in this case, $1 = 2\cdot 4 - 7$. By contrast, the coordinates of the point $P_2(2, 4)$ do not satisfy the equation of the line, since $4 \ne 2\cdot 2 - 7$. Obviously these considerations remain correct irrespective of which form is taken for the equation of the line.


A point $P_1(x_1, y_1)$ lies on the line if and only if its coordinates $x_1$ and $y_1$ satisfy the equation of the line.


Examples: 1. The point $P(2, 3)$ does not lie on the line $2x - y/4 + 8 = 0$, since $2\cdot 2 - 3/4 + 8 \ne 0$.

2. The line $x/2 + y/3 - 17 = 0$ does not pass through the origin, since $0/2 + 0/3 - 17 \ne 0$.

3. The point $P_1(57, 88)$ lies on the line $y - 8 = 2\cdot(x - 17)$, since $88 - 8 = 2\cdot(57 - 17)$.

4. The line through the points $P_1(0, 3/2)$ and $P_2(2, 5/2)$ has the equation $(y - 3/2)/(x - 0) = (5/2 - 3/2)/2$ or $y = x/2 + 3/2$. It cuts the x-axis in the point $S$ whose ordinate is $y_0 = 0$. Its abscissa is then $x_0 = -3$. The point $S(-3, 0)$ is the intersection of the line $y = x/2 + 3/2$ with the x-axis; $x_0 = -3$ is the zero of the function.

5. If a point $P_1$ with the abscissa $x_1 = 5$ is to lie on the line $y = 2x/3 - 2$, then its ordinate $y_1$ must have the value $y_1 = 2x_1/3 - 2 = 10/3 - 2 = 4/3$.

6. To find the line through the point $P_1(6, 4)$ that has the distance $d = 3$ from the point $P_2(3, -5)$. The line can be constructed geometrically by means of the circle of Thales on the segment $P_1P_2$ as diameter (Fig.).

13.2-12 The lines $l_1$ and $l_2$ through the point $P_1$ have distances $d_1$ and $d_2$ from the point $P_2$

Two lines $l_1$ and $l_2$ are determined in this way; their distance, according to the orientation of the perpendicular drawn from the origin, is either positive ($d_1 = +3$) or negative ($d_2 = -3$). At the same time it should be noted that the given distance $d$ must be smaller than the length of the segment $P_1P_2$ if a solution is to be possible. It is advisable to take the equation of the line in the Hessian normal form $x\cos\varphi + y\sin\varphi - p = 0$, and to determine the three numbers $\cos\varphi$, $\sin\varphi$ and $p$ so that the line passes through the point $P_1(6, 4)$ and has the distance $d = \pm 3$ from the point $P_2(3, -5)$:

$$6\cos\varphi + 4\sin\varphi - p = 0, \qquad 3\cos\varphi - 5\sin\varphi - p = \pm 3.$$

Subtracting the second equation from the first gives $3\cos\varphi + 9\sin\varphi = \mp 3$, that is, $\cos\varphi = \mp 1 - 3\sin\varphi$. Substituting this into $\cos^2\varphi + \sin^2\varphi = 1$ gives

$$1 \pm 6\sin\varphi + 9\sin^2\varphi + \sin^2\varphi = 1, \qquad \sin\varphi\,(10\sin\varphi \pm 6) = 0,$$

with the solutions

$$\sin\varphi_1 = \mp\tfrac35, \quad \cos\varphi_1 = \pm\tfrac45, \quad p_1 = \pm 2\tfrac25; \qquad \sin\varphi_2 = 0, \quad \cos\varphi_2 = \mp 1, \quad p_2 = \mp 6.$$

The choice of $p_1 = +2\tfrac25$ and $p_2 = +6$ fixes the orientation of the two lines $l_1$ and $l_2$; putting $\cos\varphi_1 = +4/5$, $\sin\varphi_1 = -3/5$, $p_1 = +2\tfrac25$ and $\cos\varphi_2 = +1$, $\sin\varphi_2 = 0$, $p_2 = +6$, one obtains the required equations $4x/5 - 3y/5 - 2\tfrac25 = 0$ and $x - 6 = 0$, or, in the Cartesian normal form, $y = 4x/3 - 4$ and $x = 6$. If the coordinates $x = 0$, $y = 0$ are substituted into the two functions $f_1(x, y) = -3y/5 + 4x/5 - 2\tfrac25$ and $f_2(x, y) = x - 6$ one sees that the origin $O$ lies in the negative half-plane for both lines.
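Example 6 can also be solved numerically. The sketch below does not follow the elimination used in the text; it assumes instead that the unit normal $(\cos\varphi, \sin\varphi)$ must satisfy $(P_2 - P_1)\cdot(\cos\varphi, \sin\varphi) = d$, which follows from subtracting the two conditions, and solves this for $\varphi$ with `math.acos`. The function name is illustrative.

```python
import math

def lines_at_distance(p1, p2, d):
    """Lines through p1 whose signed distance from p2 is d.

    Each line is returned as (cos_phi, sin_phi, p) for the Hessian normal
    form x*cos_phi + y*sin_phi - p = 0. The list is empty if |d| exceeds
    the length of the segment p1 p2.
    """
    vx, vy = p2[0] - p1[0], p2[1] - p1[1]
    r = math.hypot(vx, vy)
    if abs(d) > r:
        return []
    base = math.atan2(vy, vx)      # direction of the segment p1 -> p2
    delta = math.acos(d / r)       # angle between that direction and the normal
    lines = []
    for phi in (base + delta, base - delta):
        c, s = math.cos(phi), math.sin(phi)
        lines.append((c, s, p1[0] * c + p1[1] * s))
    return lines

for c, s, p in lines_at_distance((6, 4), (3, -5), 3):
    print(round(c, 4), round(s, 4), round(p, 4))
```

For $d = +3$ one result is the line $4x/5 - 3y/5 - 12/5 = 0$; the other is the line $x = 6$ with the opposite orientation ($\cos\varphi = -1$, $p = -6$), since the sign of $d$ is fixed in advance here.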


13.3. Several lines 


In plane geometry it is well known that the relative position of two lines in a plane can be described 
by means of the concepts parallel and distance, or point of intersection and angle. In analytic 
geometry the corresponding characterization of two lines can be read off from their equations. 




In the Cartesian normal form of two lines $y = m_1x + c_1$ and $y = m_2x + c_2$, their directions are characterized by their gradients $m_1$ and $m_2$. The lines are parallel if and only if $m_1 = m_2$. If, on the other hand, they are given in the Hessian normal form, $x\cos\varphi_1 + y\sin\varphi_1 - p_1 = 0$ and $x\cos\varphi_2 + y\sin\varphi_2 - p_2 = 0$, then they are parallel if the coefficients of the linear terms are equal to within a common factor $\kappa$: $\cos\varphi_1 = \kappa\cos\varphi_2$ and $\sin\varphi_1 = \kappa\sin\varphi_2$. Since these coefficients can be obtained by normalization from the general linear equations, the same condition holds for two parallel lines given in the form $A_1x + B_1y + C_1 = 0$ and $A_2x + B_2y + C_2 = 0$: $A_1B_2 - A_2B_1 = 0$.

If in a linear equation, say $Ax + By + C = 0$, $x$ and $y$ are interpreted as coordinates of a given point, then the coefficients $A$, $B$, $C$ are parameters, and the equation means that from the set of all points $(x, y)$ a subset is picked out: it consists of those points whose coordinates satisfy the condition given by the ratios $A : B : C$ of the parameters. It has been shown that this subset has a line as carrier. Conversely, $A$, $B$, $C$, where $A$ and $B$ are not both zero, can be regarded as homogeneous coordinates. Then $x$, $y$ are parameters, which pick out from the set of all lines $(A, B, C)$ or $(\cos\varphi, \sin\varphi, p)$ the subset of those that have the point $(x, y)$ as carrier. They form a pencil of lines. If the relations are given that must hold between the point coordinates of a point $P$ on a line dividing the segment between two other points $P_1$ and $P_2$ of the line in a given ratio, then one can immediately find the relations between the line coordinates that must hold if the two lines are parallel.


Point and angle of intersection 


Determination of the point of intersection of two lines. Required are the coordinates $x_0$ and $y_0$ of the point of intersection $P_0(x_0, y_0)$ of two lines $A_1x + B_1y + C_1 = 0$ and $A_2x + B_2y + C_2 = 0$ given in the general form. $P_0$ as the point of intersection of the two lines must lie on both, hence its coordinates $x_0$ and $y_0$ must satisfy the equations of the two lines. The point of intersection is therefore obtained by solving the system of equations

$$A_1x_0 + B_1y_0 + C_1 = 0,$$
$$A_2x_0 + B_2y_0 + C_2 = 0,$$

which can be done according to the usual rules (see Chapter 4.). If the system has a solution $(x_0, y_0)$, this gives the coordinates of the point of intersection. If it has no solution because the equations are incompatible, then the lines are parallel. If the system has infinitely many solutions, because the equations are linearly dependent, then the two lines coincide.


Examples: 1. At which point do the two lines $-3x + 3y - 6 = 0$ and $2x + 3y + 9 = 0$ intersect? The following scheme shows the solution of the system of equations.

$$-3x_0 + 3y_0 - 6 = 0, \qquad 2x_0 + 3y_0 + 9 = 0.$$

Subtracting the second equation from the first gives $-5x_0 - 15 = 0$, $x_0 = -3$; substituting into the first equation gives $-3(-3) + 3y_0 - 6 = 0$, $y_0 = -1$.
The lines intersect at the point $P_0(-3, -1)$ (Fig.).
2. To find the point of intersection $P_0$ of two lines given in the normal form, say $y = -3x + 14$, $y = -x - 1$, one can most conveniently solve the system by equating the right-hand sides. In the given example, the lines intersect at $P_0(7.5, -8.5)$.
3. The lines $3x + y - 7 = 0$ and $2x - y - 3 = 0$ intersect at the point $P_0(2, 1)$.
4. The lines $2x - 3y + 5 = 0$ and $3y - 2x + 2 = 0$ are parallel. For each line the origin belongs to the positive half-plane. In the Hessian normal form the system of equations becomes

$$2x_0/\sqrt{13} - 3y_0/\sqrt{13} + 5/\sqrt{13} = 0,$$
$$-2x_0/\sqrt{13} + 3y_0/\sqrt{13} + 2/\sqrt{13} = 0.$$

The lines are at a distance $7/\sqrt{13}$ from one another.
5. The equations $0.8x + 0.4y - 1.2 = 0$ and $2x + y - 3 = 0$ represent the same line; the first equation is obtained from the second by dividing by $5/2$. In the system of equations the two equations are dependent on one another. The lines coincide.


13.3-1 Determination of the point of intersection of two lines




6. The lines $y = 2x - 8$ and $y = 2x + 12$ are parallel, since $m_1 = 2 = m_2$. Similarly the lines $x/4 + y/6 = 1$ and $x/2 + y/3 = 1$ are parallel, since their Cartesian normal forms are $y = -3x/2 + 6$ and $y = -3x/2 + 3$.

7. To find the equation of the line through the point $P(2, -1)$ that is parallel to the line $y = 2x - 3$. For the required line a point and the gradient $m = 2$ are given; using the point-direction form, one obtains the equation of the line $y + 1 = 2(x - 2)$.
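The three cases (one point of intersection, parallel lines, coincident lines) can be distinguished with the determinant $A_1B_2 - A_2B_1$. The following Python sketch uses Cramer's rule rather than the elimination scheme of Example 1; the function name, the returned labels and the tolerance `tol` (needed because decimal coefficients such as 0.8 are not exact in floating point) are illustrative assumptions.

```python
def intersect(line1, line2, tol=1e-12):
    """Classify two lines A*x + B*y + C = 0 given as (A, B, C) triples.

    Returns ("point", (x0, y0)), ("parallel", None) or ("coincident", None).
    """
    A1, B1, C1 = line1
    A2, B2, C2 = line2
    det = A1 * B2 - A2 * B1
    if abs(det) > tol:
        # Cramer's rule for A1*x + B1*y = -C1, A2*x + B2*y = -C2
        return "point", ((B1 * C2 - B2 * C1) / det, (A2 * C1 - A1 * C2) / det)
    if abs(A1 * C2 - A2 * C1) <= tol and abs(B1 * C2 - B2 * C1) <= tol:
        return "coincident", None
    return "parallel", None

print(intersect((-3, 3, -6), (2, 3, 9)))         # Example 1
print(intersect((3, 1, -14), (1, 1, 1)))         # Example 2 in general form
print(intersect((2, -3, 5), (-2, 3, 2)))         # Example 4
print(intersect((0.8, 0.4, -1.2), (2, 1, -3)))   # Example 5
```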


The angle of intersection of two lines. The angle $\psi = \angle(l_1, l_2)$ at which two lines $l_1$ and $l_2$ intersect is obtained most simply from the Hessian normal form; from $x\cos\varphi_1 + y\sin\varphi_1 - p_1 = 0$ and $x\cos\varphi_2 + y\sin\varphi_2 - p_2 = 0$ it follows immediately that $\psi = \angle(l_1, l_2) = \varphi_2 - \varphi_1$.

From the Cartesian normal form the angle $\psi$ is obtained, by the restriction to the principal value of the inverse tangent function, only to within an additive constant $\pm\pi$. If $y = m_1x + c_1$ and $y = m_2x + c_2$ are the equations of the lines, then $m_1 = \tan\alpha_1$ and $m_2 = \tan\alpha_2$, where $\alpha_1 = \angle(x, l_1)$, $\alpha_2 = \angle(x, l_2)$, and so $\angle(x, l_1) + \angle(l_1, l_2) = \angle(x, l_2)$, or $\angle(l_1, l_2) = \psi = \alpha_2 - \alpha_1$. By the addition theorem for the tangent function it follows that

$$\tan\psi = \frac{\tan\alpha_2 - \tan\alpha_1}{1 + \tan\alpha_1\tan\alpha_2} = \frac{m_2 - m_1}{1 + m_1m_2}.$$

By interchanging the lines one obtains $\psi' = \angle(l_2, l_1) = \alpha_1 - \alpha_2 = -\psi$ or $\psi' = \pi - \psi$.

This condition includes the condition for the lines to be parallel: from $\psi = 0$ it follows that $m_1 = m_2$, which was found earlier. For $\psi = \pi/2$ one obtains the condition for the two lines to be perpendicular. Since $\tan\psi = \infty$, the denominator must be zero, that is, $1 + m_1m_2 = 0$.

Examples: 1. The lines $y - 2 = 5(x - 13)$ and $y = -x/5 + 18$ are perpendicular, since in the Cartesian normal form their equations are $y = 5x - 63$ and $y = -x/5 + 18$, that is, $m_1 = 5$ is the negative of the reciprocal of $m_2 = -1/5$, or $m_2 = -1/m_1$.

2. To find the line through the point $P(1, 1)$ perpendicular to a given line. The given line has the gradient $m_1 = -2/3$, so the required line has the gradient $m_2 = -1/m_1 = +3/2$. The point-direction form of the equation is $(y - 1)/(x - 1) = +3/2$ or $2(y - 1) = 3(x - 1)$, that is, $y = 3x/2 - 1/2$.

3. The lines $y = -2x + 16$ and $y = -3x/5 + 3/5$ cut at an angle $\psi = 32.48^\circ$, since $m_1 = -2$ and $m_2 = -3/5$ and so $\tan\psi = (-3/5 + 2)/(1 + 2\times 3/5) = 7/11 = 0.6364$; the given value is obtained from the tangent table. If $\alpha_1 = -63.43^\circ$ had been obtained from $m_1 = -2 = \tan\alpha_1$ and $\alpha_2 = -30.96^\circ$ from $m_2 = -0.6 = \tan\alpha_2$, then $\psi$ would have been $\alpha_2 - \alpha_1 = -30.96^\circ + 63.43^\circ = 32.47^\circ$.

4. The lines $x/4 + y/5 = 1$ and $x/3 - y/2 = 1$ have the normal forms $y = -5x/4 + 5$ and $y = 2x/3 - 2$ (Fig.). Since $m_1 = -5/4$ and $m_2 = +2/3$, the angle of intersection $\psi$ is given by $\tan\psi = (2/3 + 5/4)/[1 - (2/3)(5/4)] = (8 + 15)/(12 - 10) = 23/2 = 11.5$; $\psi = 85.03^\circ$.

Check: $\tan\alpha_1 = -5/4$, $\alpha_1 = -51.34^\circ$; $\tan\alpha_2 = +2/3$, $\alpha_2 = +33.69^\circ$; $\psi = \alpha_2 - \alpha_1 = 85.03^\circ$.
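The angle formula and its inverse (the gradient of a line that cuts a given line at a given angle) are sketched below in Python. The function names are illustrative; the result is a principal value, so it is determined only up to a multiple of $180^\circ$, and the perpendicular case is returned as $90^\circ$ by convention.

```python
import math

def intersection_angle_deg(m1, m2):
    """Angle psi from line 1 to line 2 with tan(psi) = (m2 - m1)/(1 + m1*m2),
    returned as a principal value in degrees; 90.0 for perpendicular lines."""
    denom = 1 + m1 * m2
    if math.isclose(denom, 0.0, abs_tol=1e-12):
        return 90.0
    return math.degrees(math.atan((m2 - m1) / denom))

def gradient_at_angle(m1, psi_deg):
    """Gradient m2 of a line cutting a line of gradient m1 at the angle psi,
    from m2 = (m1 + tan(psi))/(1 - m1*tan(psi))."""
    t = math.tan(math.radians(psi_deg))
    return (m1 + t) / (1 - m1 * t)

print(intersection_angle_deg(5, -1 / 5))        # Example 1
print(intersection_angle_deg(-2, -3 / 5))       # Example 3
print(intersection_angle_deg(-5 / 4, 2 / 3))    # Example 4
```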


13.3-2 Graphical representation of the equations $x/4 + y/5 = 1$ and $x/3 - y/2 = 1$, and the angle of intersection of the lines


13.3-3 Determination of the angle-bisectors 




If the equation of a line is to be found that passes through a given point and cuts a given line at a given angle $\psi$, then $\tan\psi$ and $m_1$ are given; by solving the formula one obtains the required value $m_2 = (m_1 + \tan\psi)/(1 - m_1\tan\psi)$ and hence the required equation of the line by using the point-direction form.


The equations of the angle bisectors. Two intersecting lines $l_1$ and $l_2$ have two angle bisectors $b_1$ and $b_2$ (Fig.). They are defined as the locus of those points that have the same distance from the two lines. The Hessian normal form is recommended for this. The lines are given by

$$x\cos\varphi_1 + y\sin\varphi_1 - p_1 = 0 \quad\text{and}\quad x\cos\varphi_2 + y\sin\varphi_2 - p_2 = 0.$$

They determine four sectors; one of them belongs to the two positive half-planes. It is bisected by $b_1$. In this sector each point of $b_1$ has positive distances $d_1$ from $l_1$ and $d_2$ from $l_2$. In the vertically opposite sector the distances of a point $P$ of $b_1$ from $l_1$ and $l_2$ are both negative. In the equations $d_1 = \varepsilon_1(x\cos\varphi_1 + y\sin\varphi_1 - p_1)$ and $d_2 = \varepsilon_2(x\cos\varphi_2 + y\sin\varphi_2 - p_2)$, $\varepsilon_1 = \pm 1$ and $\varepsilon_2 = \pm 1$ have the same sign; it follows from $d_1 = d_2$ that the equation of the bisector $b_1$ is $x(\cos\varphi_1 - \cos\varphi_2) + y(\sin\varphi_1 - \sin\varphi_2) - (p_1 - p_2) = 0$. On the bisector $b_2$ of the remaining two sectors each point has positive distance from one line and negative distance from the other, that is, $\varepsilon_1 = -\varepsilon_2$ or $d_1 = -d_2$; it follows that $b_2$ has the equation $x(\cos\varphi_1 + \cos\varphi_2) + y(\sin\varphi_1 + \sin\varphi_2) - (p_1 + p_2) = 0$.

Example: The lines $x + y - 2 = 0$ and $7x + y - 32 = 0$ are given in the Hessian normal form by $x/\sqrt2 + y/\sqrt2 - 2/\sqrt2 = 0$ and $7x/(5\sqrt2) + y/(5\sqrt2) - 32/(5\sqrt2) = 0$. The equations of the two angle bisectors are

$$x\left(\frac{1}{\sqrt2} \mp \frac{7}{5\sqrt2}\right) + y\left(\frac{1}{\sqrt2} \mp \frac{1}{5\sqrt2}\right) - \left(\frac{2}{\sqrt2} \mp \frac{32}{5\sqrt2}\right) = 0,$$

or $x(5 \mp 7) + y(5 \mp 1) - (10 \mp 32) = 0$; in Cartesian normal form these are $y = x/2 - 11/2$ and $y = -2x + 7$.
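The bisector equations can be formed mechanically once both lines are normalized. In the Python sketch below each general equation is simply divided by $\sqrt{A^2 + B^2}$ without choosing the sign $\varepsilon$, so which of the two results is called $b_1$ and which $b_2$ depends on the signs of the given coefficients; this simplification and the function names are assumptions of the sketch.

```python
import math

def bisectors(line1, line2):
    """Angle bisectors of two intersecting lines given as (A, B, C) triples
    for A*x + B*y + C = 0; each result is again an (A, B, C) triple."""
    def normalized(line):
        A, B, C = line
        r = math.hypot(A, B)
        return A / r, B / r, C / r

    a1, b1, c1 = normalized(line1)
    a2, b2, c2 = normalized(line2)
    difference = (a1 - a2, b1 - b2, c1 - c2)   # equal signed distances
    total = (a1 + a2, b1 + b2, c1 + c2)        # opposite signed distances
    return difference, total

def to_cartesian(line):
    """(m, c) of y = m*x + c for A*x + B*y + C = 0 with B != 0."""
    A, B, C = line
    return -A / B, -C / B

first, second = bisectors((1, 1, -2), (7, 1, -32))
print(to_cartesian(first), to_cartesian(second))
```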


Triangle and polygon 


The area of a triangle. If $P_1(x_1, y_1)$, $P_2(x_2, y_2)$ and $P_3(x_3, y_3)$ are the vertices of the triangle (Fig.), then its area $A$ is known to be $A = \tfrac12|P_1P_2|\cdot h_3 = \tfrac12|P_1P_2|\cdot|P_1P_3|\cdot\sin\alpha$, where $h_3$ is the distance of $P_3$ from $P_1P_2$ and $\alpha$ is the angle between the segments $P_1P_2$ and $P_1P_3$ in the sense of rotation of the coordinate system. If this sense is the mathematically positive, then $P_3$ lies to the left of the segment $P_1P_2$; if the perimeter of the triangle is described in the sequence $P_1 \to P_2 \to P_3$, then the triangle lies to the left and the sign of the sine function is positive. For the opposite sense of rotation of the angle the triangle lies to the right; the oriented area of the triangle is counted as negative.

By the parallel displacement $x' = x - x_1$, $y' = y - y_1$ one can go over to a coordinate system in which $P_1$ is the origin; in this system, $\rho_2, \varphi_2$ and $\rho_3, \varphi_3$ are the polar coordinates of $P_2$ and $P_3$; then

$$2A = \rho_2\rho_3\sin(\varphi_3 - \varphi_2) = \rho_2\cos\varphi_2\cdot\rho_3\sin\varphi_3 - \rho_2\sin\varphi_2\cdot\rho_3\cos\varphi_3 = x_2'y_3' - y_2'x_3'$$

$$= \begin{vmatrix} x_2' & y_2' \\ x_3' & y_3' \end{vmatrix} = \begin{vmatrix} x_2 - x_1 & y_2 - y_1 \\ x_3 - x_1 & y_3 - y_1 \end{vmatrix} = \begin{vmatrix} 1 & x_1 & y_1 \\ 0 & x_2 - x_1 & y_2 - y_1 \\ 0 & x_3 - x_1 & y_3 - y_1 \end{vmatrix} = \begin{vmatrix} 1 & x_1 & y_1 \\ 1 & x_2 & y_2 \\ 1 & x_3 & y_3 \end{vmatrix}.$$

13.3-4 Area of a triangle 


If this determinant is expanded in terms of the second column or if the two-rowed determinant is multiplied out and the terms are reordered, an expression is obtained in which the indices in each of the three summands can be interchanged by a cyclic permutation: $2A = x_1(y_2 - y_3) + x_2(y_3 - y_1) + x_3(y_1 - y_2)$. If $P_3$ lies on the line $P_1P_2$, the area of the triangle is zero; for $A = 0$ the equation gives

$$x_3(y_1 - y_2) - y_3(x_1 - x_2) + (x_1y_2 - x_2y_1) = 0$$

or

$$y_3 = x_3(y_1 - y_2)/(x_1 - x_2) + (x_1y_2 - x_2y_1)/(x_1 - x_2);$$

that is, it becomes the equation of the line through $P_1$ and $P_2$ in the Cartesian normal form, which agrees with geometrical intuition. The condition that three points should lie on a line is $A = 0$.


Example: The triangle P₁(2, 1), P₂(6, 3), P₃(4, 7) has the area

      | 1  2  1 |
A = ½ | 1  6  3 | = ½[2(3 − 7) + 6(7 − 1) + 4(1 − 3)] = ½(−8 + 36 − 8) = 10.
      | 1  4  7 |

The triangle P₄(−4, −5), P₅(5, −3), P₆(6, 2) has the area
A = ½[−4(−3 − 2) + 5(2 + 5) + 6(−5 + 3)] = ½(20 + 35 − 12) = 21.5.
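The cyclic form of the expanded determinant translates directly into a few lines of code. The following is a minimal sketch in Python; the function names are illustrative and do not come from the text.

```python
def signed_triangle_area(p1, p2, p3):
    """Oriented area of the triangle p1 -> p2 -> p3.

    Positive when the vertices are described in the mathematically
    positive (counter-clockwise) sense, negative otherwise.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Cyclic form of the determinant:
    # 2A = x1(y2 - y3) + x2(y3 - y1) + x3(y1 - y2)
    return 0.5 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))


def are_collinear(p1, p2, p3, tol=1e-12):
    """Three points lie on one line exactly when the area vanishes."""
    return abs(signed_triangle_area(p1, p2, p3)) <= tol


area = signed_triangle_area((2, 1), (6, 3), (4, 7))      # first example
reversed_area = signed_triangle_area((2, 1), (4, 7), (6, 3))  # opposite sense
```

Interchanging two vertices reverses the sense of description and therefore the sign of the result.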

Area of a polygon. In a convex polygon the line segment P₁P₂ joining two arbitrarily chosen interior
points P₁ and P₂ contains only interior points. All the diagonals of an n-gon through one vertex
lie wholly in the interior and divide the polygon into (n — 2) triangles. Any two adjacent triangles 
have a diagonal as common side and together they cover the whole area of the polygon, irrespective 
of which vertex is chosen (Fig.). If each of the triangles is described in the sense fixed as positive, 
then the polygon is also described in this sense. Each diagonal is described once in one sense and 
then in the neighbouring triangle in the opposite sense. The area of the polygon is the sum of the 
areas of the triangles. 

13.3-5 Dividing a convex n-gon into (n − 2) triangles

13.3-6 Dividing a convex n-gon into n triangles


If the area is divided into n triangles by lines joining an arbitrary point P inside the polygon to 
the vertices, then again the sense of description of the triangles corresponds to that of the polygon, 
and the interior sides are described twice in opposite senses (Fig.). A non-convex polygon can also 
be divided into triangles by either of the two methods. However, there arise diagonals and ‘interior’ 
sides which contain exterior points of the polygon (Fig.). If the area of a triangle described in the 
sense opposite to that of the polygon is counted as negative, for example, the triangle P₁P₂P₃ in
the pentagon P₁P₂P₃P₄P₅, then again the area of the polygon is the algebraic sum of those of the
triangles, as long as the polygon is not folded, that is, as long as its sides do not intersect. The
quadrangle P₁P₂P₃P₄ in Fig. 13.3-7 is folded. One must ascribe to it an oriented area which is made up
of one positive (red) area and one negative (blue); a folded 'parallelogram' therefore has area zero.
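The division into triangles by the diagonals through one vertex can be sketched as follows. This is an illustrative Python version, not taken from the text: it sums the oriented areas of the triangles P₁PₖPₖ₊₁, so that triangles described in the opposite sense are counted as negative automatically.

```python
def polygon_area(vertices):
    """Oriented area of a polygon given as a list of (x, y) vertices.

    The polygon is split into triangles by the diagonals through the
    first vertex; each triangle contributes its oriented area.
    """
    x1, y1 = vertices[0]
    total = 0.0
    for (x2, y2), (x3, y3) in zip(vertices[1:-1], vertices[2:]):
        # Oriented area of the triangle P1, Pk, Pk+1 (two-rowed determinant)
        total += 0.5 * ((x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1))
    return total


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
non_convex_pentagon = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]
folded_parallelogram = [(0, 0), (1, 0), (0, 1), (1, 1)]
```

For the folded quadrangle the positive and the negative part cancel, in agreement with the remark above.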


13.3-7 Dividing a non-convex 
pentagon and quadrangle into 3 
and 2 triangles, respectively 


Centre of gravity of a triangle. In a triangle with the vertices P₁(x₁, y₁), P₂(x₂, y₂), P₃(x₃, y₃)
the midpoints of the sides are M₁[½(x₂ + x₃), ½(y₂ + y₃)], M₂[½(x₃ + x₁), ½(y₃ + y₁)],
M₃[½(x₁ + x₂), ½(y₁ + y₂)]. On the medians s₁ = |P₁M₁|, s₂ = |P₂M₂|, s₃ = |P₃M₃| points
G₁, G₂, G₃ are fixed by means of the ratios λ₁, λ₂, λ₃ (Fig.); their coordinates are


13.3. Several lines 297 


G₁: x = [x₁ + λ₁(x₂ + x₃)/2]/(1 + λ₁), y = [y₁ + λ₁(y₂ + y₃)/2]/(1 + λ₁);
G₂: x = [x₂ + λ₂(x₃ + x₁)/2]/(1 + λ₂), y = [y₂ + λ₂(y₃ + y₁)/2]/(1 + λ₂);
G₃: x = [x₃ + λ₃(x₁ + x₂)/2]/(1 + λ₃), y = [y₃ + λ₃(y₁ + y₂)/2]/(1 + λ₃).


It turns out that under the apparently arbitrary choice
λ₁ = λ₂ = λ₃ = 2, the three pairs of coordinates become
equal, that is, they represent the same point G = G₁ = G₂
= G₃:


The three medians of a triangle meet at a point G, which
divides each of them in the ratio |P₁G| : |GM₁| = 2 : 1 and
is called the centre of gravity of the triangle.

The coordinates of the centre of gravity are the arithmetic
means of the coordinates of the vertices of the triangle.
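That the choice λ = 2 makes the three points coincide can be checked with exact rational arithmetic. The sketch below is illustrative only (the helper names are assumptions, and integer vertex coordinates are assumed for simplicity).

```python
from fractions import Fraction


def divide(p, q, lam):
    """Point G on the segment p -> q with |pG| : |Gq| = lam."""
    return tuple((a + lam * b) / (1 + lam) for a, b in zip(p, q))


def midpoint(p, q):
    return tuple(Fraction(a + b, 2) for a, b in zip(p, q))


def median_points(p1, p2, p3, lam=2):
    """Points G1, G2, G3 on the three medians for the ratio lam."""
    m1, m2, m3 = midpoint(p2, p3), midpoint(p3, p1), midpoint(p1, p2)
    return divide(p1, m1, lam), divide(p2, m2, lam), divide(p3, m3, lam)


def centroid(p1, p2, p3):
    """Arithmetic means of the vertex coordinates."""
    return tuple(Fraction(a + b + c, 3) for a, b, c in zip(p1, p2, p3))


g1, g2, g3 = median_points((2, 1), (6, 3), (4, 7))
```

For any other ratio, for instance λ = 1, the three points on the medians are in general distinct.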


13.3-8 Centre of gravity of a triangle 




Theorem of Apollonius. Before proving this theorem, a lemma on the angle bisectors of a triangle
is derived.

In a triangle the bisector of an interior
angle and the bisector of the corresponding
exterior angle divide the opposite side in the
ratio of the sides containing the angle.


In the figure the bisectors of the angles γ
and γ′ of the triangle ABC are denoted by
b_γ and b_γ′. They are perpendicular, since γ
and γ′ are supplementary angles. The lines
AD′ and AE′ parallel to them through A
cut the bisectors at right angles at D₁ and
E₁. From the congruence of the two pairs of
triangles △AD₁C ≅ △E′D₁C and △AE₁C
≅ △D′E₁C it follows that |CE′| = |CA| =
|CD′|. The lines BD′ and BE are cut by the
parallels AD′, DC and by the parallels CE,
AE′. Hence by the intercept theorem:


1. |AD| : |DB| = |D′C| : |CB| = |CA| : |CB|
and
2. |AE| : |EB| = |E′C| : |CB| = |CA| : |CB|.
If one considers the directions of segments on the side c, then AD, DB and EB have the same sign,
and AE the opposite sign. The ratios λ in which the points D and E of the bisectors divide the segment
AB have the same numerical value, but opposite sign: λᵢ = (ABD) = AD/DB = +(b : a) and
λₐ = (ABE) = AE/EB = −(b : a). Two points D and E that divide a segment AB internally and
externally in the same ratio are called harmonic points and the ratio of the two ratios λᵢ : λₐ is called
the cross-ratio (A, B; D, E). For harmonic points the cross-ratio therefore has the value

−1 = [+(b : a)] : [−(b : a)].


Theorem of Apollonius. The locus of the vertices C of all triangles ABC with a given side |AB|,
whose other sides are in a constant ratio |AC| : |BC| = λ, is the circle on the segment |DE| as
diameter, whose end-points D and E divide the side |AB| internally and externally in the ratio λ.


In the figure of the theorem proved above, if the segment AB and its points of division D and E 
are regarded as given, then, apart from the triangle ABC, there are other triangles ABC with the 
property that the bisectors of the angles y and y’ pass through D and E. To be angle bisectors, the 


13.3-9 Bisectors of an interior angle y and the 
corresponding exterior angle »’ 




lines CD and CE need only be perpendicular, that is, C must lie on the circle on the segment DE
as diameter. For all these points C the ratio (b : a) = λ of its distances from the two fixed points A
and B has the same value λ = |AD| : |DB|. The circle is therefore the locus of all the points C, as the
theorem of Apollonius states.
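The statement can be tested numerically. The following Python sketch (function names are illustrative; λ ≠ 1 and λ > 0 are assumed) constructs the points of division D and E, forms the circle on DE as diameter, and compares the distance ratio for points on that circle with λ.

```python
import math


def apollonius_circle(a, b, lam):
    """Centre and radius of the circle of points C with |CA| : |CB| = lam."""
    # Internal and external points of division of AB (ratios +lam and -lam)
    d = tuple((p + lam * q) / (1 + lam) for p, q in zip(a, b))
    e = tuple((p - lam * q) / (1 - lam) for p, q in zip(a, b))
    centre = tuple((p + q) / 2 for p, q in zip(d, e))
    radius = math.dist(d, e) / 2
    return centre, radius


def distance_ratio(c, a, b):
    return math.dist(c, a) / math.dist(c, b)


A, B, LAM = (0.0, 0.0), (3.0, 0.0), 2.0
centre, radius = apollonius_circle(A, B, LAM)
points = [(centre[0] + radius * math.cos(t), centre[1] + radius * math.sin(t))
          for t in (0.3, 1.0, 2.0, 3.0, 4.5)]
ratios = [distance_ratio(c, A, B) for c in points]
```

With A(0, 0), B(3, 0) and λ = 2 the points of division are D(2, 0) and E(6, 0), so the circle has centre (4, 0) and radius 2.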


The theorems of Ceva and Menelaus. These theorems are named after Giovanni CEVA (1648-1734)
and MENELAUS of Alexandria (about 98 A.D.). Their dual character (see Chapter 25.) can be seen
from the following arrangement of the questions (Fig.).


13.3-10 The theorem of Ceva
13.3-11 The theorem of Menelaus


Theorem of Ceva. Under what conditions do three lines, each of which passes through one
vertex of the triangle but does not coincide with a side, meet in a point?

Theorem of Menelaus. Under what conditions do three points, each of which lies on one side
of the triangle but does not coincide with a vertex, lie on a line?


Theorem of Ceva. Three lines, each of which passes through one vertex of a triangle, meet in a
point if and only if the product of the ratios in which they divide the opposite sides has the value 1.
If the vertices of the triangle are denoted by P₁, P₂, P₃ and the points of intersection of the three
lines with the opposite sides by Q₁, Q₂, Q₃, then they form the ratios λ₁ = P₂Q₁ : Q₁P₃,
λ₂ = P₃Q₂ : Q₂P₁, λ₃ = P₁Q₃ : Q₃P₂. A coordinate system is taken so that the line through P₁
and P₂ is the x-axis and the line through P₁ and P₃ is the y-axis; the point P₂ is taken as (1, 0) and
P₃ as (0, 1). Then the coordinates of Q₁, Q₂, Q₃ can be calculated as follows (Q₁″ and Q₁′ denote
the projections of Q₁ onto the x-axis and the y-axis, each taken parallel to the other axis):
P₁Q₂ : Q₂P₃ = 1 : λ₂ or P₁Q₂ : P₁P₃ = 1 : (1 + λ₂) = y₂; x₂ = 0;
P₁Q₃ : Q₃P₂ = λ₃ or P₁Q₃ : P₁P₂ = λ₃ : (1 + λ₃) = x₃; y₃ = 0;
P₁Q₁″ : P₁P₂ = P₃Q₁ : P₃P₂ = 1 : (1 + λ₁) = x₁;
P₁Q₁′ : P₁P₃ = P₂Q₁ : P₂P₃ = λ₁ : (1 + λ₁) = y₁.
If x and y are the coordinates of the point of intersection P, then the following three line equations
must hold simultaneously:
(1) line through Q₁, P and P₁: (y₁ − 0)/(x₁ − 0) = y/x or λ₁ = y/x;
(2) line through Q₂, P and P₂: (y₂ − 0)/(x₂ − 1) = y/(x − 1) or [1/(1 + λ₂)] : (−1) = y/(x − 1);
(3) line through Q₃, P and P₃: (y₃ − 1)/x₃ = (y − 1)/x or (−1) : [λ₃/(1 + λ₃)] = (y − 1)/x.
By eliminating y from (1), (2) and (3) one obtains
(2′) −1/(1 + λ₂) = xλ₁/(x − 1); 1 − x = x(λ₁ + λ₁λ₂), x = 1/(1 + λ₁ + λ₁λ₂);
(3′) −(1 + λ₃)/λ₃ = (xλ₁ − 1)/x, x + xλ₃ = λ₃ − xλ₁λ₃, x = λ₃/(1 + λ₃ + λ₁λ₃).
By eliminating x from (2′) and (3′) one obtains
λ₃ + λ₁λ₃ + λ₁λ₂λ₃ = 1 + λ₃ + λ₁λ₃,
λ₁λ₂λ₃ = 1, which was to be proved.
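The condition can be checked in the coordinate system of the proof, with P₁(0, 0), P₂(1, 0), P₃(0, 1). The sketch below is an illustration in Python with exact fractions; the function names are assumptions, and the test for concurrency uses the vanishing of the determinant of the three line equations, which would also vanish for three parallel lines (a case that does not occur for positive ratios).

```python
from fractions import Fraction


def division_points(l1, l2, l3):
    """Q1, Q2, Q3 on the sides opposite P1, P2, P3 for the given ratios."""
    l1, l2, l3 = Fraction(l1), Fraction(l2), Fraction(l3)
    q1 = (1 / (1 + l1), l1 / (1 + l1))   # on P2P3, P2Q1 : Q1P3 = l1
    q2 = (Fraction(0), 1 / (1 + l2))     # on P3P1, P3Q2 : Q2P1 = l2
    q3 = (l3 / (1 + l3), Fraction(0))    # on P1P2, P1Q3 : Q3P2 = l3
    return q1, q2, q3


def line_through(p, q):
    """Coefficients (a, b, c) of the line a*x + b*y + c = 0 through p and q."""
    (x1, y1), (x2, y2) = p, q
    return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)


def concurrent(l1, l2, l3):
    """True if the lines P1Q1, P2Q2 and P3Q3 meet in one point."""
    q1, q2, q3 = division_points(l1, l2, l3)
    a1, b1, c1 = line_through((0, 0), q1)
    a2, b2, c2 = line_through((1, 0), q2)
    a3, b3, c3 = line_through((0, 1), q3)
    det = (a1 * (b2 * c3 - b3 * c2)
           - b1 * (a2 * c3 - a3 * c2)
           + c1 * (a2 * b3 - a3 * b2))
    return det == 0
```

For λ₁ = λ₂ = λ₃ = 1 the three lines are the medians, which meet in the centre of gravity.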
Theorem of Menelaus. A transversal cuts the sides of a triangle in such a way that the product
of the ratios in which the points of intersection divide the three sides has the value −1.
Just as in Ceva's theorem, let
λ₁ = P₂Q₁ : Q₁P₃, λ₂ = P₃Q₂ : Q₂P₁, λ₃ = P₁Q₃ : Q₃P₂
be the ratios in which the points of intersection divide the sides; the vertices of the triangle can again
have the coordinates P₁(0, 0), P₂(1, 0), P₃(0, 1). The coordinates of the points of intersection have




the same values, except that one of the ratios (or all three), A in the figure, has a negative value. 
If the points Q₁, Q₂, Q₃ are to lie on a line, then

(y₂ − y₃) : (x₂ − x₃) = (y₁ − y₃) : (x₁ − x₃),
[1/(1 + λ₂)] : [−λ₃/(1 + λ₃)] = [λ₁/(1 + λ₁)] : [1/(1 + λ₁) − λ₃/(1 + λ₃)],
−(1 + λ₃)/[λ₃(1 + λ₂)] = λ₁(1 + λ₃)/[1 + λ₃ − λ₃(1 + λ₁)],
−[(1 + λ₃) − λ₃(1 + λ₁)] = λ₁λ₃(1 + λ₂),
−1 − λ₃ + λ₃ + λ₁λ₃ = λ₁λ₃ + λ₁λ₂λ₃,
−1 = λ₁λ₂λ₃, which was to be proved.
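The collinearity condition can be checked in the same coordinates, P₁(0, 0), P₂(1, 0), P₃(0, 1). The following Python sketch is illustrative (names are assumptions); a negative ratio corresponds to a point of division outside the side, and the value −1 itself is excluded because it would place the point at infinity.

```python
from fractions import Fraction


def division_points(l1, l2, l3):
    """Q1, Q2, Q3 on the lines P2P3, P3P1, P1P2 for the signed ratios."""
    l1, l2, l3 = Fraction(l1), Fraction(l2), Fraction(l3)
    q1 = (1 / (1 + l1), l1 / (1 + l1))
    q2 = (Fraction(0), 1 / (1 + l2))
    q3 = (l3 / (1 + l3), Fraction(0))
    return q1, q2, q3


def collinear(l1, l2, l3):
    """True if Q1, Q2 and Q3 lie on one line (oriented area zero)."""
    (x1, y1), (x2, y2), (x3, y3) = division_points(l1, l2, l3)
    return (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1) == 0
```

For λ₁ = 2, λ₂ = 3, λ₃ = −1/6 the points are Q₁(1/3, 2/3), Q₂(0, 1/4), Q₃(−1/5, 0), which lie on the line of gradient 5/4 through Q₂.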


13.4. The circle 


Equations of a circle 


Equations of a circle in rectangular coordinates. The circle is the locus of all points P(x, y) of the
plane that have a constant distance r from a fixed point C(c, d); C is called the centre and r the radius
of the circle. By the theorem of Pythagoras (Fig.) one obtains
the required equation (x − c)² + (y − d)² = r². If the centre is
at the origin, then c = d = 0, and the equation of the circle is
x² + y² = r².


Example 1: The circle with centre at C(4, 3) and radius 2 has the equation (x − 4)² + (y − 3)² = 4.

Example 2: The point P₀(1, 2) does not lie on the circle (x − 1/2)² + (y − 2)² = 5², because
its coordinates x₀ = 1, y₀ = 2 do not satisfy the equation of the circle: (1 − 1/2)² + (2 − 2)² = 1/4 ≠ 25.

Example 3: What are the ordinates of the points P₁ and P₂ on the circle (x − 1)² + (y − 2)² = 61
that have the abscissa 6? Required are the ordinates y₁ and y₂.
If one puts x = 6 into the equation of the circle and solves for y, one obtains y − 2
= ±√[61 − (6 − 1)²], that is, y₁ = 8, y₂ = −4.
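The two kinds of question in these examples, testing whether a point lies on a circle and finding the ordinates for a given abscissa, can be sketched in Python as follows. The function names are illustrative; the circle is passed by its centre and the square of its radius to avoid rounding in r².

```python
import math


def lies_on_circle(point, centre, r_squared, tol=1e-9):
    """Check whether the coordinates satisfy (x - c)^2 + (y - d)^2 = r^2."""
    (x, y), (c, d) = point, centre
    return abs((x - c) ** 2 + (y - d) ** 2 - r_squared) <= tol


def ordinates_at(x, centre, r_squared):
    """Ordinates of the points of the circle with abscissa x (may be empty)."""
    c, d = centre
    under_root = r_squared - (x - c) ** 2
    if under_root < 0:
        return []                 # no point of the circle has this abscissa
    root = math.sqrt(under_root)
    return [d + root, d - root]   # equal values if the line x = const touches


example_3 = ordinates_at(6, (1, 2), 61)
```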


The equation of a circle in polar coordinates. The centre C of a circle of radius r has the polar
coordinates C(ρ₀, φ₀). P is an arbitrary point on the circle with the coordinates (ρ, φ). In the triangle
OCP, |CP| = r and |OC| = ρ₀ have fixed values, and ρ varies with the angle φ between the values
ρ_min = |ρ₀ − r| and ρ_max = ρ₀ + r (Fig.). By the cosine theorem, ρ² + ρ₀² − 2ρρ₀ cos(φ − φ₀) = r².
If the centre of the circle lies on the polar axis and the circle passes through the origin (one speaks
of the vertex position), then, since the angle in a semi-circle is a right angle, the simplified equation
of the circle is ρ = 2r cos φ.


13.4-2 The equation of a circle in polar coordinates 




Example: If C has the coordinates (4, 30°) and r = 3, the equation of the circle in polar coordinates
is ρ² + 16 − 2ρ · 4 cos(φ − 30°) = 9, or ρ² − 8ρ cos(φ − 30°) + 7 = 0.
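For a fixed direction φ the polar equation is a quadratic equation in ρ. A minimal Python sketch (the function name is illustrative; angles are in radians) solves it and can be used to check the example.

```python
import math


def polar_radii(phi, rho0, phi0, r):
    """Solve rho^2 - 2*rho*rho0*cos(phi - phi0) + rho0^2 - r^2 = 0 for rho."""
    delta = phi - phi0
    disc = r ** 2 - (rho0 * math.sin(delta)) ** 2
    if disc < 0:
        return []                      # the ray in direction phi misses the circle
    root = math.sqrt(disc)
    base = rho0 * math.cos(delta)
    return [base - root, base + root]


phi0 = math.radians(30)
radii_towards_centre = polar_radii(phi0, 4, phi0, 3)   # rho_min and rho_max
radii_at_40_degrees = polar_radii(math.radians(40), 4, phi0, 3)
```

In the direction of the centre the two solutions are ρ_min = 4 − 3 = 1 and ρ_max = 4 + 3 = 7.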


Parametric representation of the circle. If the two coordinates x
and y are regarded as functions x = φ₁(t), y = φ₂(t) of one variable
t, then t is called a parameter, and one speaks of a parametric
representation (Fig.). In physical applications the time is often taken
as the parameter. For the circle the parametric representation is
x = c + r cos t, y = d + r sin t if the parameter t is taken to be
the angle between the positive direction of the x-axis and the radius
to the variable point P(x, y).


13.4-3 The parametric equations of a circle


Example: The circle with centre at C(3, 4) and radius 2 has
the parametric representation x = 3 + 2 cos t, y = 4 + 2 sin t.


Circle and line 


Suppose that a circle (x − c)² + (y − d)² = r² and a line
y = mx + b are given. The coordinates x₀, y₀ of a point of intersection
P₀ must satisfy the equation of the circle and the equation
of the line. One obtains the system of equations
(x₀ − c)² + (y₀ − d)² = r²,
y₀ = mx₀ + b.
By substituting for y₀, squaring and collecting terms together one obtains a quadratic equation of
the form x₀² + 2px₀ + q = 0 with general solution x₀ = −p ± √(p² − q) (see Chapter 4.).
According to the sign of its discriminant D = p² − q it has two real roots (D > 0), one real root
(D = 0), or two conjugate complex roots (D < 0). Geometrically this means that the line has two,
one, or no points in common with the circle, that is, it is a secant, a tangent or it misses the circle.


Examples: 1. For the points of intersection of the circle (x − 3)² + (y − 2)² = 40 with the
line y = −x + 9 one obtains

(x₀ − 3)² + (y₀ − 2)² = 40, y₀ = −x₀ + 9,
(x₀ − 3)² + (−x₀ + 7)² = 40,
2x₀² − 20x₀ = −18,
x₀² − 10x₀ + 9 = 0,
x₀₁ = 9, x₀₂ = 1; y₀₁ = 0, y₀₂ = 8.

The two points of intersection are P₁(9, 0) and P₂(1, 8).


2. To find the points of intersection of the line y = −x/2 + (5/2)√5 with the circle x² + y² = 25,
one solves the quadratic equation obtained by substitution; often, for simplicity, the index that
characterizes the points of intersection is omitted.

x² + x²/4 + 125/4 − (5/2)√5 x = 25,
(5/4)x² − (5/2)√5 x = −25/4,
x² − 2√5 x = −5,
(x − √5)² = 0,
x₁ = x₂ = √5.

The discriminant D = 5 − 5 has the value zero, and the line touches the circle at the
point x₀ = √5, y₀ = 2√5.

3. The line x = 6 has no point in common with the circle x² + y² = 25: substitution gives y² = −11, that
is, there is no real solution.
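The case distinction by the discriminant can be sketched in Python as follows. This is an illustration under the assumptions stated in the comments: the line is given in the form y = mx + b (lines parallel to the y-axis are not covered), and a small tolerance decides when the discriminant counts as zero.

```python
import math


def line_circle_intersections(c, d, r_squared, m, b, tol=1e-12):
    """Common points of (x - c)^2 + (y - d)^2 = r^2 and y = m*x + b."""
    # Substituting y = m*x + b and dividing by (1 + m^2) gives
    # x^2 + 2*p*x + q = 0 with the coefficients below.
    lead = 1 + m * m
    p = (m * (b - d) - c) / lead
    q = (c * c + (b - d) ** 2 - r_squared) / lead
    disc = p * p - q
    if disc < -tol:
        return []                              # the line misses the circle
    if abs(disc) <= tol:
        return [(-p, m * (-p) + b)]            # tangent: one point of contact
    root = math.sqrt(disc)
    return [(x, m * x + b) for x in (-p + root, -p - root)]   # secant


secant_points = line_circle_intersections(3, 2, 40, -1, 9)
```

For the first example this gives p = −5, q = 9 and D = 16, hence the abscissae 9 and 1.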


Normals to the circle. Geometrically it is well known that a tangent is perpendicular to the radius
through the point of contact. The line on which this radius lies is therefore the normal to the circle
at the point of contact. For the point P₁(x₁, y₁) on the circle (x − c)² + (y − d)² = r² the gradient
of the normal is (y₁ − d)/(x₁ − c) and its equation is (y − y₁)/(x − x₁) = (d − y₁)/(c − x₁) or
y − y₁ = [(y₁ − d)/(x₁ − c)] (x − x₁).




Example: The normal to the circle (x − 2)² + (y − 1)² = 25 through P₁(5, −3) has the
equation y + 3 = [(−3 − 1)/(5 − 2)] (x − 5) or in the Cartesian normal form y = −(4/3)x
+ 11/3.


Tangents to the circle. If the point of contact P₁(x₁, y₁) of a tangent to the circle (x − c)² + (y − d)²
= r² is given, then m_r = (y₁ − d)/(x₁ − c) is the gradient of the radius to the point of contact,
and m_t = −1/m_r = −(x₁ − c)/(y₁ − d) is the gradient of the tangent (Fig.).

In the point-direction form the equation of the tangent is y − y₁ = −(x₁ − c)(x − x₁)/(y₁ − d),
or, by multiplying up and collecting terms together, yy₁ − y₁² − yd + y₁d = −xx₁ + x₁² + cx − cx₁,
that is, xx₁ + yy₁ − cx − dy = x₁² + y₁² − cx₁ − dy₁. If one adds to both sides the expression
(c² + d² − cx₁ − dy₁) and bears in mind the fact that P₁ lies on the circle, so that (x₁ − c)²
+ (y₁ − d)² = r², one obtains the equation of the tangent in the form (x − c)x₁ − (x − c)c
+ (y − d)y₁ − (y − d)d = r² or (x − c)(x₁ − c) + (y − d)(y₁ − d) = r².


Example: The equation of the tangent to the circle (x − 2)² + (y − 1)² = 25 at the point
P₁(5, −3) is (x − 2)(5 − 2) + (y − 1)(−3 − 1) = 25 or 3(x − 2) − 4(y − 1) = 25, or in the
Cartesian normal form y = (3/4)x − 27/4.

By means of the differential calculus one can find the gradient of the tangent by differentiating
the equation of the circle. From (x − c)² + (y − d)² = r² it follows that 2(x − c) + 2(y − d)y′ = 0
or y′ = −(x − c)/(y − d). The gradient of the tangent at P₁(x₁, y₁) is therefore
y₁′ = −(x₁ − c)/(y₁ − d), which agrees with the value already found.
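The gradients of the normal and the tangent at a point of contact lead to both lines in the Cartesian normal form y = mx + n. The sketch below is an illustration in Python with exact fractions; the function name is an assumption, and the special positions x₁ = c and y₁ = d (tangent or normal parallel to an axis) are not handled.

```python
from fractions import Fraction


def tangent_and_normal(centre, point):
    """(gradient, intercept) of the tangent and of the normal at a point
    of the circle; assumes x1 != c and y1 != d."""
    c, d = map(Fraction, centre)
    x1, y1 = map(Fraction, point)
    m_normal = (y1 - d) / (x1 - c)     # gradient of the radius through P1
    m_tangent = -1 / m_normal          # perpendicular to the radius
    normal = (m_normal, y1 - m_normal * x1)
    tangent = (m_tangent, y1 - m_tangent * x1)
    return tangent, normal


tangent, normal = tangent_and_normal((2, 1), (5, -3))
```

For the circle (x − 2)² + (y − 1)² = 25 and P₁(5, −3) this reproduces the two examples: y = (3/4)x − 27/4 for the tangent and y = −(4/3)x + 11/3 for the normal.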


13.4-4 Tangent to a circle 


13.4-5 Tangents to a circle 
from a point outside the 
circle 


The tangents from a point to a circle. If C(c, d) is the centre of a circle of radius r and P₀(x₀, y₀)
is a point outside the circle, then there are two tangents from P₀ to the circle (Fig.). Their points
of contact are P₁(x₁, y₁) and P₂(x₂, y₂). The equations of the tangents
(x − c)(x₁ − c) + (y − d)(y₁ − d) = r² and (x − c)(x₂ − c) + (y − d)(y₂ − d) = r²
are satisfied for the coordinates of P₀(x₀, y₀):

(x₀ − c)(x₁ − c) + (y₀ − d)(y₁ − d) = r², (x₀ − c)(x₂ − c) + (y₀ − d)(y₂ − d) = r².

The equation (x₀ − c)(x − c) + (y₀ − d)(y − d) = r²

therefore holds for the coordinates of both points of contact P₁(x₁, y₁) and P₂(x₂, y₂). The equation
therefore represents a line that passes through both points of contact. This line is called the polar
p₀ of the pole P₀ and is determined by the coordinates (c, d) of the centre C and those of the pole
P₀(x₀, y₀), in addition to the radius r of the circle. Its points of intersection with the circle are the
points of contact of the two tangents from P₀. From the coordinates of the pole (x₀, y₀) and one point
of contact, one can always give the equation of the tangent, for example, in the two-point form.


Example: To find the tangents from P₀(3, 5) to the circle with the equation (x + 2)² + y² = 5.
The equation of the polar is (3 + 2)(x + 2) + (5 − 0)(y − 0) = 5 or y = −x − 1. For its
points of intersection P₁ and P₂ with the circle one has y = −x − 1 and (x + 2)² + y² = 5, and
so x² + 4x + 4 + x² + 2x + 1 = 5 or x² + 3x = 0; x₁ = 0, x₂ = −3. The points of contact
are therefore P₁(0, −1) and P₂(−3, 2). The tangent that touches at P₁ has the equation y = 2x − 1,
and the other is y = x/2 + 7/2.
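The points of contact can also be computed directly as the points in which the polar meets the circle. The following Python sketch is one possible arrangement of that calculation, not the method of the text: instead of solving the quadratic equation it uses the foot of the perpendicular from the centre to the polar and the half-length of the chord. The function name and the parameter r_squared are assumptions.

```python
import math


def points_of_contact(centre, r_squared, pole):
    """Points in which the polar of the pole meets the circle."""
    (c, d), (x0, y0) = centre, pole
    a, b = x0 - c, y0 - d              # the polar is a*(x - c) + b*(y - d) = r^2
    dist_sq = a * a + b * b
    if dist_sq <= r_squared:
        raise ValueError("the pole must lie outside the circle")
    # Foot of the perpendicular from the centre to the polar
    scale = r_squared / dist_sq
    foot = (c + a * scale, d + b * scale)
    # Half-length of the chord cut out of the polar by the circle
    half_chord = math.sqrt(r_squared - r_squared ** 2 / dist_sq)
    ux, uy = -b / math.sqrt(dist_sq), a / math.sqrt(dist_sq)   # unit vector along the polar
    return [(foot[0] + half_chord * ux, foot[1] + half_chord * uy),
            (foot[0] - half_chord * ux, foot[1] - half_chord * uy)]


contacts = points_of_contact((-2, 0), 5, (3, 5))
```

For the example the two points are (−3, 2) and (0, −1); at each of them the radius is perpendicular to the line joining the point to the pole.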




Two circles 


Points of intersection of two circles. Two circles, whose equations are (x − c₁)² + (y − d₁)² = r₁²
and (x − c₂)² + (y − d₂)² = r₂², can lie in such a position that they intersect in two points, or they
can touch at one point, or they can be separate, that is, have no common point.

To find a possible point of intersection P₀(x₀, y₀) one has to solve the system of equations

(x₀ − c₁)² + (y₀ − d₁)² = r₁²,
(x₀ − c₂)² + (y₀ − d₂)² = r₂².

If it has two real distinct solutions x₀₁, y₀₁ and x₀₂, y₀₂, then P₁(x₀₁, y₀₁) and P₂(x₀₂, y₀₂) are the
two points of intersection; in the case of a real double solution the circles touch. If the system of
equations has no real solution, the circles are separate.


Example: By subtracting the equations (x + 4)² + (y + 5)² = 194 and (x − 3)² + (y − 2)² = 40
of two circles one obtains the equation x + y = 9 of the line which is a common chord of the two
circles. Its points of intersection with one of the circles are also the points of intersection of the
two circles; one finds that these are P₁(9, 0) and P₂(1, 8).
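The subtraction of the two equations and the intersection of the resulting line with a circle can be sketched in Python. The function names are illustrative; the second function uses the standard construction along the line of centres rather than solving a quadratic equation in x, which is an alternative arrangement of the same calculation.

```python
import math


def common_chord(c1, r1_sq, c2, r2_sq):
    """Coefficients (a, b, k) of the line a*x + b*y = k obtained by
    subtracting the equation of the second circle from that of the first."""
    (x1, y1), (x2, y2) = c1, c2
    a, b = 2 * (x2 - x1), 2 * (y2 - y1)
    k = r1_sq - r2_sq + (x2 * x2 + y2 * y2) - (x1 * x1 + y1 * y1)
    return a, b, k


def circle_intersections(c1, r1_sq, c2, r2_sq):
    """Points of intersection of two non-concentric circles (may be empty)."""
    (x1, y1), (x2, y2) = c1, c2
    dx, dy = x2 - x1, y2 - y1
    dist_sq = dx * dx + dy * dy
    if dist_sq == 0:
        return []
    # Position of the common chord along the line of centres, measured from c1
    t = (r1_sq - r2_sq + dist_sq) / (2 * dist_sq)
    h_sq = r1_sq - t * t * dist_sq     # squared half-length of the chord
    if h_sq < 0:
        return []                      # the circles are separate
    s = math.sqrt(h_sq / dist_sq)
    bx, by = x1 + t * dx, y1 + t * dy
    return [(bx - s * dy, by + s * dx), (bx + s * dy, by - s * dx)]


chord = common_chord((-4, -5), 194, (3, 2), 40)
points = circle_intersections((-4, -5), 194, (3, 2), 40)
```

For the example the chord is 14x + 14y = 126, that is, x + y = 9.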


The angle of intersection of two circles. The angle of intersection of two circles is defined as the 
angle between the tangents at each point of intersection; this has the same value for both points of 
intersection (Fig.). 

Example: The circles (x + 4)² + (y + 5)² = 194
and (x − 3)² + (y − 2)² = 40 intersect at the point
P₁(9, 0). The tangents there are
(9 + 4)(x + 4) + (0 + 5)(y + 5) = 194
and (9 − 3)(x − 3) + (0 − 2)(y − 2) = 40
or, in the Cartesian normal form,
y = −(13/5)x + 117/5 and y = 3x − 27. For
the angle of intersection ψ of these two lines one
finds from m₁ = −13/5 and m₂ = 3 that
tan ψ = (m₂ − m₁)/(1 + m₁m₂) = −14/17, so that
ψ₁ = −39.47° and ψ₂ = 140.53°; alternatively, from m₁ = tan α₁
= −13/5 one obtains α₁ = −68.96° and from
m₂ = tan α₂ = +3 one obtains α₂ = +71.57°
and so ψ = α₂ − α₁ = 140.53°.
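The calculation of the example, the angles of inclination of the two tangents followed by their difference, can be written as a short Python sketch. The function names are illustrative, and the point of intersection is assumed not to lie at the same height as either centre, so that both tangent gradients exist.

```python
import math


def tangent_gradient(centre, point):
    """Gradient of the tangent at a point of the circle (requires y1 != d)."""
    (c, d), (x1, y1) = centre, point
    return -(x1 - c) / (y1 - d)


def intersection_angle_deg(centre1, centre2, point):
    """Difference alpha2 - alpha1 of the inclinations of the two tangents."""
    alpha1 = math.atan(tangent_gradient(centre1, point))
    alpha2 = math.atan(tangent_gradient(centre2, point))
    return math.degrees(alpha2 - alpha1)


angle_at_p1 = intersection_angle_deg((-4, -5), (3, 2), (9, 0))
angle_at_p2 = intersection_angle_deg((-4, -5), (3, 2), (1, 8))
```

At the second point of intersection P₂(1, 8) the same procedure gives about 39.47°, the supplement of 140.53°; the two tangents enclose the same pair of angles at both points.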


13.4-6 Points of intersection and angle of intersection of two circles


13.5. The conics 


Conics as intersections of a circular cone with planes 


In antiquity conics were defined as intersections of a plane E with a circular cone. The inter- 
sections are called circle, ellipse, hyperbola and parabola. If the intersecting plane E contains the 
vertex Z of the (double) cone, the intersection is either a point, the vertex Z, or a generating line, 
if E touches the cone, or two generating lines intersecting at Z, if the plane contains interior points 
of the cone. These intersections are known as degenerate conics. 

If E does not contain Z, but is perpendicular to the axis of a right cone, then the intersection is a
circle; if E is parallel to a tangent plane, the intersection is a parabola; if E is neither parallel to 
a tangent plane nor perpendicular to the axis, then the intersection is an ellipse if it intersects all 
the generators on the same side of Z, and a hyperbola otherwise. 

All non-degenerate conics can be regarded as perspective images of one another; Z is the 
centre of perspective. On each generator there lies one and only one point of each conic, apart 
from three exceptions, which can be eliminated by means of the concept of the improper point or 
point at infinity of a line (just as in the ratio of division). For the parabola, the generator g₀ parallel
to E has a point (at infinity) in common with the line through the vertex V of the parabola parallel 
to it, and this is a point of the parabola; the parallel line VF through the vertex of the parabola 
is called the axis of the parabola. In the case of the hyperbola there are two generators in which the 
plane E’ parallel to E through Z cuts the cone. In projective geometry the two planes E and E’ 




have a line (at infinity) l in common, and each of the two generators cuts l in a point (at infinity),
which is a point of each of the lines parallel to the generator and is a point of the hyperbola; in 
particular, the two lines parallel to these generators through the centre of the hyperbola are called 
its asymptotes. 


The Dandelin spheres. Pierre DANDELIN (1794-1847) was the first to use the spheres that touch 
the cone and the intersecting plane E to derive properties of the conics. 

The parabola. If E is parallel to a tangent plane of the cone that touches it along a generator g₀,
then there is only one Dandelin sphere that touches the cone and E; its diameter is the distance of
g₀ from E (Fig.). The sphere touches E at a point F and the cone along a circle c, which meets g₀
in a point D. The plane E₁ through c cuts E in a line l, which is called the directrix and is perpendicular
to the plane Σ through g₀ and the axis of the cone. The plane Σ cuts E in the axis of the parabola;
the (finite) point of the parabola on the axis is the vertex V of the parabola. By rotating Σ about
g₀ one obtains lines of intersection with E, for example, BP, parallel to the axis of the parabola,
that is, perpendicular to the directrix. The plane Σ′ obtained by rotation about g₀ cuts the cone
in a second generator through the points Z, A and P; A lies on c and P lies on the line of
intersection; P is therefore a point of the parabola. The segments PF and PA are of equal length as
tangents from P to the sphere. The segments
PA and PB are of equal length, since the lines
ZP and DB intersect at A and are cut by
the parallels BP and DZ, so |BP| : |PA| =
|DZ| : |ZA| = 1, that is |BP| = |PA|.


The parabola is the locus of all points P of
the plane that have the same distance from a
fixed point F and a fixed line l. The ratio
|PF| : |PB| = ε has the value 1 and is called
the numerical eccentricity.


13.5-1 Parabola as section of a cone

13.5-2 Ellipse as section of a cone


The ellipse and hyperbola. If E is not parallel to a tangent plane of the cone, there are two Dandelin
spheres that touch E at points F₁ and F₂ and touch the cone along the circles c₁ and c₂, respectively.
The planes E₁ and E₂ of these circles cut E in a pair of parallel directrices l₁ and l₂. The plane
perpendicular to the directrices through the axis of the cone cuts E in the axis of the ellipse or
hyperbola (Fig.).

A line g₀ in this plane Σ parallel to the axis of the conic through the vertex Z of the cone cuts
the planes E₁ in D₁ and E₂ in D₂. If Σ is rotated about g₀, then its line of intersection with E, say
B₁B₂, remains parallel to the axis of the conic, that is, perpendicular to the directrices l₁ and l₂.
The plane Σ₁ contains a generator ZA₁A₂; the point of intersection P of B₁B₂ with A₁A₂ is a point
of the conic. The segments PF₁ and PA₁ are of equal length as tangents from P to the sphere S₁;
similarly, |PF₂| = |PA₂|. Because of the position of S₁ and S₂ on opposite sides or on the same side
of E, one has, for the ellipse |PF₁| + |PF₂| = |A₁A₂|, and for the hyperbola |PF₁| − |PF₂| = |A₁A₂|.




The ellipse is the locus of all points P of the plane for which the sum of the distances from two
fixed points F₁ and F₂ (the foci) is constant; by symmetry this constant (2a) is equal to the distance
between the points V₁ and V₂ of the ellipse that lie in Σ, which are called vertices.

The hyperbola is the locus of all points P of the plane for which the difference of the distances
from two fixed points F₁ and F₂ (the foci) is constant; for the vertices V₁ and V₂ in Σ one again
has |V₁V₂| = 2a = |A₁A₂|.

In the plane Σ₁ that arises by rotation about g₀ the lines ZP and B₁D₁ intersect at A₁, and B₁P
is parallel to ZD₁. Hence |PA₁| : |PB₁| = |ZA₁| : |ZD₁| = |PF₁| : |PB₁| = ε.


The ratio of the distance |PF₁| of a point P on a conic from a focus F₁ to its distance |PB₁| from
the corresponding directrix l₁ is a constant ε, the numerical eccentricity; for the ellipse
0 < ε < 1 (|ZA₁| < |ZD₁|), and for the hyperbola ε > 1 (|ZA₁| > |ZD₁|).


The equations of the conics. To arrive at an analytic expression for the conics a suitable coordinate 
system must be chosen. From its definition, a conic is symmetrical about its axis. In addition, the 
ellipse and hyperbola, from the considerations above, must also be symmetrical about the perpendicular
bisector of F₁F₂; the point of intersection of this perpendicular bisector with the axis is the
centre C of the conic. Hence the best coordinate system for an ellipse or hyperbola is a Cartesian 
system with the x-axis as the axis of the conic and the y-axis as the perpendicular line through the 
centre C. One then says that the conic is in its central position. One speaks of the vertex position if 
the x-axis is the same, but the y-axis is the tangent at a vertex. Also polar coordinates, where the 
axis of the conic gives the zero direction and a focus is the pole, are suitable for all three types of 
conic, and they then have a common equation. For the hyperbola, a natural oblique coordinate 
system is formed by the two asymptotes, which intersect at the centre. 


13.5-3 Hyperbola as section of a cone 


13.5-4 Derivation of the equation of a parabola 


Equations of the parabola 


The vertex equation. The Cartesian coordinate
system is such that the x-axis is the axis
of the parabola and the y-axis is the tangent at
the vertex (Fig.). From the definition of the
parabola, each of its points P has the same
distance from the focus F and the directrix l.
The vertex V must therefore bisect the perpendicular
FL₀ from F to l. There are two points on the parabola whose ordinates are equal
to the distance from the directrix. The absolute value p of this ordinate is called the
semiparameter of the parabola; |L₀F| = p. The focus F therefore has the coordinates (p/2, 0). An




arbitrary point P(x, y) of the parabola has the distance |FP| = √[y² + (x − p/2)²] from the focus
F and the distance |PL| = p/2 + x from the directrix. By the definition of the parabola,

y² + (x − p/2)² = (x + p/2)² or y² = 2px.


The equation shows that the x-axis is an axis of symmetry; the vertex is at the origin; for each
abscissa x > 0 there are two points of the parabola whose ordinates are equal and opposite. The
semiparameter determines the form of the parabola. The smaller the value of p is, the nearer the focus
and directrix come to the y-axis and the more slowly y increases. In the limit, as p → 0, the parabola
degenerates to the positive x-axis counted twice. On the other hand, if p takes a very large value,
then the focus and directrix are at a large distance apart, and as p → ∞, the parabola degenerates
to the y-axis, since x → 0.

The equations x² = 2py, y² = −2px and x² = −2py with p > 0 also represent parabolas, as
can be seen from the diagram (Fig.), in which the parabola η² = 2pξ goes over into one of the given
equations by a suitable rotation of the ξ, η-coordinate system through an angle φ. The equations
of transformation are ξ = x cos φ − y sin φ and η = x sin φ + y cos φ.


13.5-5 Positions of the parabola η² = 2pξ under rotation of the ξ, η-system

Parabola | Equations of transformation | Transformed equation

If, after a parallel displacement of the coordinate system, the vertex has the coordinates (c, d),
then the equation of the parabola takes one of the following forms (p > 0):

(y − d)² = 2p(x − c); (x − c)² = 2p(y − d);
(y − d)² = −2p(x − c); (x − c)² = −2p(y − d).


Example 1: To find the equation of the parabola, in the vertex position with x-axis as the axis
of the parabola, that passes through the point P₀(2, 4). The equation of the parabola must be
satisfied by the coordinates of P₀: 4² = 2p · 2; hence p = 4. The equation of the parabola is
therefore y² = 8x.

Example 2: The parabola that has its vertex at V(2, 3), is concave downwards, and passes through
the point P₀(4, 1), must have an equation that is a transformation by parallel displacement of axes
of the equation x² = −2py; its equation is (x − 2)² = −2p(y − 3). Since it passes through
the point P₀(4, 1), one has (4 − 2)² = −2p(1 − 3), or p = +4/4 = +1. The equation of the
parabola is therefore (x − 2)² = −2(y − 3). The focus is at a distance ½p = 1/2 from the vertex
along the axis, that is, its coordinates are F(2, 2.5).
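The second example can be retraced in Python. The sketch below is illustrative (the function name is an assumption) and covers only the downward-opening form (x − c)² = −2p(y − d); it returns the semiparameter, the focus and the ordinate of the directrix, so that the defining property of equal distances can be checked.

```python
import math


def downward_parabola(vertex, point):
    """Semiparameter p, focus and directrix ordinate of the parabola
    (x - c)^2 = -2p(y - d) with the given vertex through the given point."""
    (c, d), (x, y) = vertex, point
    if y >= d or x == c:
        raise ValueError("the point must lie below the vertex and off the axis")
    p = (x - c) ** 2 / (-2 * (y - d))
    focus = (c, d - p / 2)          # at distance p/2 below the vertex
    directrix_y = d + p / 2         # at distance p/2 above the vertex
    return p, focus, directrix_y


p, focus, directrix_y = downward_parabola((2, 3), (4, 1))
distance_to_focus = math.dist((4, 1), focus)
distance_to_directrix = directrix_y - 1
```

For P₀(4, 1) both distances equal 2.5, as the definition of the parabola requires.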


Equations of the ellipse 


The central equation of the ellipse. The x-axis coincides with the axis of the ellipse, and the y-axis
with the perpendicular bisector of the segment V₁V₂ between the vertices (Fig.). The y-axis cuts
the ellipse in two points N₁ and N₂, the secondary vertices. The length |V₁V₂| = 2a is called the
major axis, the length |N₁N₂| = 2b the minor axis and |F₁F₂|/2 = e the linear eccentricity. Since
|N₁F₁| + |N₁F₂| = 2a, the segments a, b and e form a right-angled triangle, and so e² + b² = a².




The foci therefore have coordinates F₁(+e, 0) and F₂(−e, 0). An arbitrary point of the ellipse
P(x, y) has the distances |PF₁| = r₁ = √[y² + (e − x)²] and |PF₂| = r₂ = √[y² + (e + x)²] from
the foci. From the definition of the ellipse, r₁ + r₂ = 2a
or r₁ = 2a − r₂. If one substitutes for r₁ and r₂ and
squares both sides, one square root remains:

y² + (e − x)² = 4a² − 4a√[y² + (e + x)²] + y² + (e + x)²

or a√[y² + (e + x)²] = a² + ex.
By squaring both sides one obtains
a²y² + a²e² + 2a²ex + a²x² = a⁴ + 2a²ex + e²x².
Since e² = a² − b² this equation can be simplified:
a²y² + a⁴ − a²b² + a²x² = a⁴ + a²x² − b²x²
or x²/a² + y²/b² = 1.

13.5-6 The equation of an ellipse in cen- 
tral position 


The equation x²/b² + y²/a² = 1, where a > b, also represents an ellipse, as is seen by rotating
the ξ, η-coordinate system through an angle φ = −π/2. Under the transformation (see Equations
of the parabola) ξ = y, η = −x, the ellipse ξ²/a² + η²/b² = 1 goes into x²/b² + y²/a² = 1.

If after a parallel displacement of the coordinate system the centre of the ellipse has the coordinates
(c, d), then the central equation of the ellipse, for a > b, takes one of the following forms:
(x − c)²/a² + (y − d)²/b² = 1 or (x − c)²/b² + (y − d)²/a² = 1.



13.5-7 Rotation of the ξ, η-system
and the ellipse x²/a² + y²/b² = 1


13.5-8 Numerical eccentricity of an ellipse 


The numerical eccentricity. A line parallel to the y-axis of an ellipse at a distance |CL₁| = a²/e
from it is called a directrix l₁. The vertex V₁ is at a distance d′ from it, where d′ = |V₁L₁| = (a²/e) − a
= (a/e)(a − e), and an arbitrary point P of the ellipse is at a distance d from it, where d = |PQ|
= a²/e − x. For the distances r₁ and r₂ of the point P from the foci F₁ and F₂ one has, by the theorem
of Pythagoras, r₂² = r₁² + (2e)² − 2 · 2e(e − x) = r₁² + 4e² − 4e² + 4ex or r₂² − r₁² = 4ex. Since
r₂ + r₁ = 2a, it follows by division that r₂ − r₁ = 2ex/a, and so r₂ = a + ex/a and
r₁ = a − ex/a. If one then substitutes, from the last equation, x = (a² − r₁a)/e in the expression
for d, one obtains d = r₁ · (a/e), that is, the ratio d : r₁ = a : e is independent of the chosen point P.
Its reciprocal ε = e/a is called the numerical eccentricity. In the figure it is shown how to construct ε


13.5. The conics 307 


as the ratio of two segments: R; is determined on F,P by |PR,| = r,, and R2 on F,R,; by CR; || F,R:; 
then |CF,|:|CR2| = e:a=e. 


The ellipse is the locus of all points $P$ of the plane for which the ratio $r_1 : d$ of the distance $r_1$ of $P$
from a focus $F_1$ to its distance $d$ from the corresponding directrix $l_1$ has the constant value $\varepsilon = e : a$.


Parametric representation of the ellipse. The ellipse can
be regarded as the affine image of the circle $\xi^2 + \eta^2 = a^2$
by reducing all the ordinates in the ratio $y : \eta = b : a$; in
this case, the transformation $\xi = x$, $\eta = ya/b$ takes the
circle into the ellipse $x^2/a^2 + y^2/b^2 = 1$. The construction
is carried out by taking the points of intersection $A$ and $B$
of a ray through the origin with the circles $c_1$ and $c_2$ of
radii $a$ and $b$, drawing lines through $A$ and $B$ parallel to
the axes and taking the point $P$ of the ellipse as the point
of intersection. One sees that the condition $y : \eta = b : a$ is
satisfied. If $t$ is the angle that an arbitrary ray makes with
the x-axis, one obtains the parametric representation
$x = a\cos t$, $y = b\sin t$, which satisfies the equation of the
ellipse (Fig.).
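The parametric representation can be checked numerically. The following is a minimal Python sketch, not part of the original text; the function names and the sample semi-axes $a = 5$, $b = 3$ are illustrative assumptions. It generates points $x = a\cos t$, $y = b\sin t$ and evaluates $x^2/a^2 + y^2/b^2 - 1$, which should vanish up to rounding.

```python
import math


def ellipse_point(a, b, t):
    """Point (a cos t, b sin t) of the ellipse for the parameter t in radians."""
    return a * math.cos(t), b * math.sin(t)


def ellipse_residual(a, b, x, y):
    """Value of x^2/a^2 + y^2/b^2 - 1; it is zero when (x, y) lies on the ellipse."""
    return x * x / (a * a) + y * y / (b * b) - 1.0


if __name__ == "__main__":
    a, b = 5.0, 3.0  # illustrative semi-axes (assumption, not from the text)
    for k in range(8):
        t = k * math.pi / 4
        x, y = ellipse_point(a, b, t)
        print(f"t = {t:5.3f}  P = ({x:7.4f}, {y:7.4f})  "
              f"residual = {ellipse_residual(a, b, x, y):.1e}")
```

Note that $t$ is the angle of the ray through $A$ and $B$, not the polar angle of the point $P$ itself.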


13.5-9 Parametric representation of an ellipse

Equations of the hyperbola


Like the ellipse, the hyperbola is symmetrical about the axis through the vertices $V_1$ and $V_2$
and about the line perpendicular to this axis through the centre $C$, $|CV_1| = |CV_2|$. The segment $|CV_1|$
is denoted by $a$, and the segments $|CF_1| = |CF_2|$ by $e$. The hyperbola does not have a minor semi-axis;
since $e > a$, there is, however, a segment $b$ given by $b^2 = e^2 - a^2$.

Central equation of the hyperbola. Because of the symmetry of the hyperbola, there is a particularly
suitable Cartesian coordinate system in which the x-axis coincides with the axis of the hyperbola
and the y-axis is the line perpendicular to it through $C$. The foci have the coordinates $F_1(+e, 0)$
and $F_2(-e, 0)$ (Fig.). The distances of an arbitrary point $P(x, y)$ of the hyperbola from the foci are

$|PF_1| = r_1 = \sqrt{y^2 + (x - e)^2}$ and $|PF_2| = r_2 = \sqrt{y^2 + (x + e)^2}$.

From the definition of the hyperbola, $r_2 - r_1 = 2a$ or $r_2 = 2a + r_1$. If one substitutes the expressions
for $r_1$ and $r_2$ and squares both sides, one square root remains:

$y^2 + (x + e)^2 = 4a^2 + 4a\sqrt{y^2 + (x - e)^2} + y^2 + (x - e)^2$

or $ex - a^2 = a\sqrt{y^2 + (x - e)^2}$.
Squaring both sides again, one obtains
$e^2x^2 + a^4 - 2a^2ex = a^2y^2 + a^2x^2 - 2a^2ex + a^2e^2$.
Since $e^2 = a^2 + b^2$, this equation can be simplified:
$a^2x^2 + b^2x^2 + a^4 = a^2y^2 + a^2x^2 + a^4 + a^2b^2$
or $x^2/a^2 - y^2/b^2 = 1$.

13.5-10 The equation of a hyperbola in central position


Central equation of the hyperbola: $x^2/a^2 - y^2/b^2 = 1$

The significance of the number $b$ can be seen from the following form of the equation:

$a^2y^2 = b^2(x^2 - a^2)$, $y/x = \pm(b/a)\sqrt{1 - a^2/x^2}$.

The limiting value of this expression, as $x \to \infty$, is $\lim_{x \to \infty} y/x = \pm b/a$. The lines $\eta = \pm(b/a)\xi$ having
these limiting values as gradients are the asymptotes of the hyperbola.


By symmetry it is sufficient to consider the behaviour of the hyperbola $x^2/a^2 - y^2/b^2 = 1$ and
the line $\eta = (b/a)\xi$ in the first quadrant. If the perpendicular is dropped from a point $(\xi, \eta)$ of this line with
$\xi > a$ onto the x-axis, it cuts the hyperbola in a point $P(x, y)$. Then $\xi = x$, and from $\eta = (b/a)x$
and $y = (b/a)x\sqrt{1 - a^2/x^2}$ it follows that $y < \eta$, because the factor $\sqrt{1 - a^2/x^2}$ is less than 1.
The larger $x$, the smaller the difference $\eta - y$, since it follows from the equation of the hyperbola
that $[x/a - y/b] = [x/a + y/b]^{-1}$ and since $[x/a + y/b] \to \infty$ as $x \to \infty$, $\lim_{x \to \infty}[(x/a) - (y/b)] = 0$
or $[(b/a)x - y] = (\eta - y) \to 0$ as $x \to \infty$. For large values of $x$ the hyperbola therefore comes
arbitrarily close to the line $\eta = (b/a)\xi$. The line is an asymptote of the hyperbola. Its angle of
inclination to the x-axis is obtained from a right-angled triangle with the sides $a$ and $b$ and hypotenuse $e$.

The equation $y^2/a^2 - x^2/b^2 = 1$, where $a$ is the principal semi-axis and lies along the y-axis,
represents a hyperbola (Fig.), as one sees by rotating the $\xi, \eta$-coordinate system through an angle
$\varphi = -\pi/2$. Under the transformation $\xi = y$, $\eta = -x$ (see Equations of the parabola) the hyperbola
$\xi^2/a^2 - \eta^2/b^2 = 1$ goes into $y^2/a^2 - x^2/b^2 = 1$.

If after a parallel displacement of the coordinate system the centre of the hyperbola has the 
coordinates (c, d), then the central equation of the hyperbola takes one of the following forms: 


$(x - c)^2/a^2 - (y - d)^2/b^2 = 1$ or $(y - d)^2/a^2 - (x - c)^2/b^2 = 1$.


Example: The hyperbola $x^2/25 - y^2/4 = 1$ has the vertices $V_1(5, 0)$ and $V_2(-5, 0)$, foci
$F_1(\sqrt{25 + 4}, 0) = F_1(\sqrt{29}, 0)$ and $F_2(-\sqrt{29}, 0)$ and asymptotes $y = \pm(2/5)x$.
The hyperbola $y^2/25 - x^2/4 = 1$, on the other hand, has the vertices $V_3(0, 5)$ and $V_4(0, -5)$,
foci $F_3(0, \sqrt{29})$ and $F_4(0, -\sqrt{29})$ and asymptotes $y = \pm(5/2)x$.
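The data of this example follow directly from $a$, $b$ and $e^2 = a^2 + b^2$. The Python sketch below is an illustration only; the function name `hyperbola_data` and the `axis` argument are hypothetical helpers introduced here, not notation from the text.

```python
import math


def hyperbola_data(a, b, axis="x"):
    """Vertices, foci and asymptote gradient of a hyperbola in central position.

    axis="x": x^2/a^2 - y^2/b^2 = 1, asymptotes y = +-(b/a) x
    axis="y": y^2/a^2 - x^2/b^2 = 1, asymptotes y = +-(a/b) x
    """
    e = math.hypot(a, b)  # linear eccentricity, e^2 = a^2 + b^2
    if axis == "x":
        return {
            "vertices": [(a, 0.0), (-a, 0.0)],
            "foci": [(e, 0.0), (-e, 0.0)],
            "asymptote_gradient": b / a,
        }
    if axis == "y":
        return {
            "vertices": [(0.0, a), (0.0, -a)],
            "foci": [(0.0, e), (0.0, -e)],
            "asymptote_gradient": a / b,
        }
    raise ValueError("axis must be 'x' or 'y'")


if __name__ == "__main__":
    print(hyperbola_data(5.0, 2.0, axis="x"))
    print(hyperbola_data(5.0, 2.0, axis="y"))
```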
13.5-11 Rotation of the $\xi, \eta$-system and the hyperbola $x^2/a^2 - y^2/b^2 = 1$

13.5-12 The asymptotic equation of a hyperbola


The asymptotic equation of the hyperbola. The equation of the hyperbola turns out particularly
simple if its asymptotes are taken as the axes of a coordinate system (Fig.). Let $\xi$ and $\eta$ denote the
coordinates in the original system of rectangular coordinates, and $x$ and $y$ the coordinates referred
to the asymptotes as oblique axes. Then the equation of the hyperbola in the central position is
$\xi^2/a^2 - \eta^2/b^2 = 1$, and $\eta = \pm(b/a)\xi$ are the equations of its asymptotes. If $\tan\alpha = b/a$, the following
relations hold between the coordinates:

$\eta = y\sin\alpha - x\sin\alpha = (y - x)\sin\alpha$,
$\xi = x\cos\alpha + y\cos\alpha = (y + x)\cos\alpha$.

Substituting in the central equation, one obtains

$[(y + x)^2\cos^2\alpha]/a^2 - [(y - x)^2\sin^2\alpha]/b^2 = 1$

or $(y + x)^2 b^2\cos^2\alpha - (y - x)^2 a^2\sin^2\alpha = a^2b^2$.

Since $b^2\cos^2\alpha = a^2\sin^2\alpha$, this gives

$2xy(b^2\cos^2\alpha + a^2\sin^2\alpha) = a^2b^2$ and so

$4xy\cos^2\alpha = a^2$ and $4xy\sin^2\alpha = b^2$. By addition
one obtains $4xy = a^2 + b^2$ and since $\sin 2\alpha = 2ab/(a^2 + b^2)$ one has finally $xy\sin 2\alpha = ab/2$. This
equation states that the parallelogram $OAPB$ always has the same area.


The equation of a hyperbola referred to its
asymptotes has the form $xy = \text{const}$. Conversely,
any function of this type represents a hyperbola.


13.5-13 Numerical eccentricity of a hyperbola


The numerical eccentricity. A line parallel to the y-axis of a hyperbola at a distance $|CL_1| = a^2/e$
is called a directrix $l_1$ (Fig.). The vertex $V_1$ has the distance $d'$ from it, where $d' = |V_1L_1| = a - a^2/e
= a(e - a)/e$ and an arbitrary point $P$ of the hyperbola has the distance $d$ from it, where $d = x - a^2/e$.
In the triangle $F_1F_2P$, by the theorem of Pythagoras, $r_2^2 = r_1^2 + (2e)^2 - 2 \cdot 2e(e - x)$ or $r_2^2 - r_1^2
= 4ex$. Since $r_2 - r_1 = 2a$, one has $r_2 + r_1 = 2ex/a$ and so $r_2 = a + ex/a$ and $r_1 = ex/a - a$.
By substituting $x = (r_1a + a^2)/e$ in $d = x - a^2/e$ one obtains $d = r_1a/e$, that is, $r_1 : d = e : a = \varepsilon$.
This constant $\varepsilon$ is called the numerical eccentricity.


The hyperbola is the locus of all points $P$ of the plane for which the ratio $r_1 : d$ of the distance $r_1$
of $P$ from one focus $F_1$ to its distance $d$ from the corresponding directrix $l_1$ has the constant value
$\varepsilon = e : a$.

If $R_1$ is determined on $F_2P$ by $|PR_1| = r_1$ and $R_2$ on $R_1F_1$ by $CR_2 \parallel F_2P$, then $|CF_1| : |CR_2| = e : a$.


Conic and line 


Points of intersection of conic and line. In the derivation of the conics by means of the Dandelin
spheres as the plane section of a circular cone it is clear that to each point $P$ of the circle of contact $c$
of a Dandelin sphere there corresponds a point $P'$ of the conic. To a line $l$ in the plane $E_1$ or $E_2$ of
the circle there corresponds a line $l'$ in the plane of intersection $E$, which arises as the line of intersection
of this plane $E$ and a plane determined by the line $l$ and the vertex $Z$ of the cone. As in any
projective mapping, points of intersection in $E_1$ or $E_2$ go into points of intersection in the image
plane $E$. According as the line $l$ cuts, touches, or misses the circle $c_1$ or $c_2$, the image line $l'$ is a secant
or a tangent of the conic or has no point in common with it; there can also be points of the circle
whose images are improper points of the conic, which must therefore be treated separately.


1. In the case of the parabola, this is the point $D$ of $c$; it has the greatest distance from the directrix $l$
(see Fig. 13.5-1). Each secant through $D$ of the circle $c$ has a line parallel to the axis of the parabola
as image (for example, $DA$ has the image $BP$), and therefore cuts the parabola in only one finite
point $P$. The tangent at $D$ to the circle $c$ is parallel to the directrix; its image is the line at infinity of $E$.

2. In the case of a hyperbola, a plane through the vertex $Z$ of the cone parallel to the plane $E$
cuts the circle in two points $P_2$ and $P_3$ whose images are the points at infinity on the asymptotes.
The line through these two points has the line at infinity of $E$ as image. A secant of the circle $c_1$ or
$c_2$ through one of these points, say $P_2$, has a line parallel to the asymptote as its image, which cuts the
hyperbola in only one finite point. The tangents to the circle at $P_2$ and $P_3$ go into the asymptotes.
Their point of intersection is therefore the inverse image of the centre $C$ of the hyperbola.

In finding the points of intersection of a conic with a line it is easy to see from the Cartesian
normal form of the line equation whether the special case of a secant with only one point of intersection
occurs (gradient $m = 0$ for a parabola, $m = \pm b/a$ for a hyperbola). In all other cases, by
substituting for $y$ from the line equation in the equation of the conic a quadratic equation is obtained,
whose discriminant gives information about the number of points of intersection.


Example 1: For the coordinates of the points of intersection of the line $y = -x/2 + 2$ with the
parabola $x^2 = 4y$ one has to solve the system of these two equations. By substitution one obtains
$x^2 = -2x + 8$ or $x^2 + 2x + 1 = 9$,
$x_1 = 2$, $x_2 = -4$ and $y_1 = 1$, $y_2 = 4$.
The points of intersection are therefore $P_1(2, 1)$ and $P_2(-4, 4)$.
Example 2: The conic $16x^2 + 25y^2 + 32x - 100y - 284 = 0$ has its axes parallel to the coordinate axes, since
there is no mixed term in its equation. The equation can be written as
$16x^2 + 32x + 16 + 25y^2 - 100y + 100 - 16 - 100 - 284 = 0$
or $16(x + 1)^2 + 25(y - 2)^2 = 400$ or $(x + 1)^2/25 + (y - 2)^2/16 = 1$.
The centre $C$ of the conic has the coordinates $(-1, 2)$. The line $5y = 28x - 62$ cuts the ellipse
in the points $P_2(2, -6/5)$ and $P_3(3, 22/5)$, since by substituting the equation of the line into the
equation of the ellipse one obtains the quadratic equation $x^2 - 5x + 6 = 0$ with the roots
$x_2 = 2$, $x_3 = 3$.
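The substitution method of the two examples can be sketched in Python. This is an illustration under stated assumptions: the conic is written as $Ax^2 + Cy^2 + Dx + Ey + F = 0$ (no mixed term), the line as $y = mx + n$, and the function name and the tolerance `tol` are choices made here, not part of the text.

```python
import math


def line_conic_intersections(A, C, D, E, F, m, n, tol=1e-12):
    """Real intersections of y = m*x + n with A x^2 + C y^2 + D x + E y + F = 0.

    Substituting y gives qa*x^2 + qb*x + qc = 0; the sign of the discriminant
    decides between two, one or no points of intersection.
    """
    qa = A + C * m * m
    qb = 2.0 * C * m * n + D + E * m
    qc = C * n * n + E * n + F
    if abs(qa) < tol:
        # Special case: the quadratic term vanishes (secant with one finite point).
        if abs(qb) < tol:
            return []  # no isolated point of intersection
        x = -qc / qb
        return [(x, m * x + n)]
    disc = qb * qb - 4.0 * qa * qc
    if disc < -tol:
        return []  # the line misses the conic
    if abs(disc) <= tol:
        x = -qb / (2.0 * qa)  # the line touches the conic
        return [(x, m * x + n)]
    root = math.sqrt(disc)
    xs = sorted([(-qb - root) / (2.0 * qa), (-qb + root) / (2.0 * qa)])
    return [(x, m * x + n) for x in xs]


if __name__ == "__main__":
    # Example 1: parabola x^2 - 4y = 0 and line y = -x/2 + 2
    print(line_conic_intersections(1, 0, 0, -4, 0, -0.5, 2.0))
    # Example 2: 16x^2 + 25y^2 + 32x - 100y - 284 = 0 and line 5y = 28x - 62
    print(line_conic_intersections(16, 25, 32, -100, -284, 28 / 5, -62 / 5))
```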


Gradients of tangents. Equations of tangents to a conic. The equations of tangents to a parabola,
ellipse and hyperbola can be obtained by the methods of analytic geometry, just as for a circle.
However, it is much more advantageous to use the method of the differential calculus. The derivative
of the equation of a conic at $x_1$ gives the gradient of the tangent to the conic at the point $P_1(x_1, y_1)$,
where $y_1$ is the value of the function, that is, the ordinate of the conic at $x_1$. Taking into account
the equation of the conic, the point-direction form of the line equation gives the equation of the tangent:
for example, for the parabola $y^2 = 2px$ one obtains by differentiation $2yy' = 2p$ or $y' = p/y$
and so the gradient $y_1'$ of the tangent (at the point $P_1$) is $y_1' = p/y_1$ and the equation of the tangent is
$y_1' = (y - y_1)/(x - x_1)$ or $(p/y_1)(x - x_1) = y - y_1$,
$px - px_1 = yy_1 - y_1^2 = yy_1 - 2px_1$, that is, $p(x + x_1) = yy_1$.
For the ellipse and hyperbola, if $P_1(x_1, y_1)$ is the point of contact,
$x_1^2/a^2 \pm y_1^2/b^2 = 1$; $2x_1/a^2 \pm 2y_1y_1'/b^2 = 0$; $y_1' = \mp(b^2x_1)/(a^2y_1)$
is the gradient (upper signs for the ellipse, lower signs for the hyperbola), and the equation of the tangent is
$y_1'(x - x_1) = \mp(xx_1/a^2)(b^2/y_1) \pm (x_1^2/a^2)(b^2/y_1) = y - y_1$,
$(xx_1/a^2) \pm (yy_1/b^2) = x_1^2/a^2 \pm y_1^2/b^2$ or $xx_1/a^2 \pm yy_1/b^2 = 1$.
Corresponding derivations can be made for the parabolas $x^2 = 2py$, $y^2 = -2px$, and $x^2 = -2py$,
the ellipse $x^2/b^2 + y^2/a^2 = 1$ and the hyperbola $y^2/a^2 - x^2/b^2 = 1$. The following table contains
the results for the most important cases.


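Tangent at the point $P_1(x_1, y_1)$

Conic; gradient; equation of the tangent

Parabola $(y - d)^2 = 2p(x - c)$; $p/(y_1 - d)$; $(y - d)(y_1 - d) = p(x + x_1 - 2c)$

Ellipse $(x - c)^2/a^2 + (y - d)^2/b^2 = 1$; $-b^2(x_1 - c)/[a^2(y_1 - d)]$; $(x - c)(x_1 - c)/a^2 + (y - d)(y_1 - d)/b^2 = 1$

Hyperbola $(x - c)^2/a^2 - (y - d)^2/b^2 = 1$; $b^2(x_1 - c)/[a^2(y_1 - d)]$; $(x - c)(x_1 - c)/a^2 - (y - d)(y_1 - d)/b^2 = 1$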

Example: The equation of the tangent to the ellipse
$(x + 1)^2/25 + (y - 2)^2/16 = 1$
at the point $P_1(2, -6/5)$ is
$(x + 1)(x_1 + 1)/a^2 + (y - 2)(y_1 - 2)/b^2 = 1$
or
$16(x + 1)(2 + 1) + 25(y - 2)(-6/5 - 2) = 400$,
$48x + 48 - 80y + 160 = 400$,
$y = (3/5)x - 12/5$.


The angle of intersection of a conic and a line.
This angle of intersection is defined as the angle
between the line and the tangent to the conic at
the appropriate point of intersection; it can be
calculated as the angle between two lines. The line
$y = -x/2 + 2$ and the parabola $x^2 = 4y$ intersect at
the point $P_1(2, 1)$. The tangent $t_1$ to the parabola
at $P_1$ has the equation $y = x - 1$; its angle of inclination $\alpha_2$ to the x-axis has the value $\alpha_2 = 45^\circ$, and
that of the line is $\alpha_1 = -26.56^\circ$; the angle $\gamma$ between them is therefore $\gamma = \alpha_2 - \alpha_1 = 71.56^\circ$ (Fig.).

13.5-14 Point of intersection and angle of intersection of a line and a parabola
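The angle in this example can be recomputed from the two gradients. The Python sketch below is illustrative; the function names are assumptions made here. It takes the gradient of the tangent to $x^2 = 4y$ at $x_1 = 2$ from the derivative $y' = x/2$ and the gradient $-1/2$ of the line, and forms the difference of the two angles of inclination.

```python
import math


def inclination_deg(m):
    """Angle of inclination to the x-axis, in degrees, of a line of gradient m."""
    return math.degrees(math.atan(m))


def intersection_angle_deg(m_first, m_second):
    """Difference of the inclinations (second minus first), in degrees."""
    return inclination_deg(m_second) - inclination_deg(m_first)


if __name__ == "__main__":
    x1 = 2.0
    m_tangent = x1 / 2.0  # derivative of y = x^2/4 at x1
    m_line = -0.5         # gradient of y = -x/2 + 2
    print(inclination_deg(m_tangent))
    print(inclination_deg(m_line))
    print(intersection_angle_deg(m_line, m_tangent))
```

The values quoted in the text agree with this computation to within rounding in the second decimal place.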

Tangents to a conic having a prescribed gradient. The gradient with the given value $m_t$ must be
equal to the gradient $y_1'$ of the conic. In the case of the parabola the direction given by $m_t$ must
not be that of the axis, since no tangent of the parabola has this direction. From $y_1' = m_t = p/y_1$
one obtains one coordinate $y_1 = p/m_t$; the other can be calculated from the equation of the parabola,
since the coordinates of the point of contact must satisfy it. In the case of an ellipse or hyperbola
($m_t = \mp b^2x_1/(a^2y_1)$) one obtains the ratio of the coordinates $x_1$, $y_1$ of the point of contact and by
substituting in the equation of the conic one obtains $x_1^2 = a^4m_t^2/(a^2m_t^2 \pm b^2)$, a pure quadratic
equation which, for all ellipses and for those hyperbolas for which $|m_t| > b/a$, has two roots
of equal absolute value and opposite sign. The result corresponds to the geometrical property that for any direction there
are two parallel tangents to an ellipse, whose points of contact are symmetrical about the centre,
but for a hyperbola this case only happens when the line through the origin in the given direction
lies outside the region between the asymptotes. If the required tangent is parallel to a line $y = mx + c$,
then $m_t$ is determined by $m_t = m$; if the tangent is perpendicular to the line $y = mx + c$, then
$m_t = -1/m$; finally, if the two lines enclose an angle $\gamma$, then from $\tan\gamma = (m_t - m)/(1 + mm_t)$
one obtains the value $m_t = (m + \tan\gamma)/(1 - m\tan\gamma)$.


Example: If the ellipse $36x^2 + 100y^2 = 9$ and the line $y = -(4/5)x$ are given, then $m = -4/5$.
Since $x^2/(1/4) + y^2/(9/100) = 1$, the semi-axes have the lengths $a = 1/2$ and $b = 3/10$.
For a tangent parallel to the line, $-4/5 = -(9/25)(x_1/y_1)$, that is, $y_1 = (9/20)x_1$. Further, the
points of contact of the tangents lie on the ellipse, that is, $36x_1^2 + 100y_1^2 = 9$. From these two
equations one obtains by substitution $36x_1^2 + (100 \cdot 81/400)x_1^2 = 9$ or $x_1^2 = 36/225$ and so
$x_{1,2} = \pm 2/5$, $y_{1,2} = \pm 9/50$. The points of contact are therefore $B_1(2/5, 9/50)$ and $B_2(-2/5, -9/50)$;
the equations of the tangents are $y = -(4/5)x + 1/2$ and $y = -(4/5)x - 1/2$.
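For the ellipse the result can be put in closed form. The Python sketch below is an illustration, not the text's own procedure: it uses the equivalent expressions $x_1 = \mp a^2m/s$, $y_1 = \pm b^2/s$ with $s = \sqrt{a^2m^2 + b^2}$, which follow from the relations above and give the tangents $y = mx \pm s$. The function name is an assumption.

```python
import math


def ellipse_tangents_with_gradient(a, b, m):
    """Tangents of gradient m to x^2/a^2 + y^2/b^2 = 1.

    Returns two tuples (x1, y1, n): the point of contact (x1, y1) and the
    intercept n of the tangent y = m*x + n.
    """
    s = math.sqrt(a * a * m * m + b * b)
    result = []
    for sign in (1.0, -1.0):
        x1 = -sign * a * a * m / s
        y1 = sign * b * b / s
        n = y1 - m * x1  # equals sign * s
        result.append((x1, y1, n))
    return result


if __name__ == "__main__":
    # Ellipse 36x^2 + 100y^2 = 9, that is a = 1/2, b = 3/10, and gradient -4/5
    for x1, y1, n in ellipse_tangents_with_gradient(0.5, 0.3, -0.8):
        print(f"contact ({x1:.4f}, {y1:.4f}), tangent y = -0.8 x + ({n:.4f})")
```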


Any tangent to a hyperbola forms with the asymptotes a triangle $(P_2MP_3)$ of constant area.
The tangent $xx_1/a^2 - yy_1/b^2 = 1$ at the point $P_1(x_1, y_1)$ of the hyperbola $x^2/a^2 - y^2/b^2 = 1$
cuts the asymptotes $y = \pm(b/a)x$ at the points $P_2$ and $P_3$ (Fig.). For their coordinates one finds
by substitution:
$x[x_1/a^2 \mp y_1/(ab)] = 1$ or $x_{2,3} = a^2b/(bx_1 \mp ay_1)$
and $y[\pm x_1/(ab) - y_1/b^2] = 1$ or
$y_{2,3} = ab^2/(\pm bx_1 - ay_1)$.
For the area $A$ of the triangle $MP_3P_2$ one obtains:

$$A = \frac{1}{2}\begin{vmatrix} 1 & 0 & 0 \\ 1 & x_3 & y_3 \\ 1 & x_2 & y_2 \end{vmatrix}
= \frac{1}{2}\begin{vmatrix} 1 & 0 & 0 \\ 1 & a^2b/(bx_1 + ay_1) & ab^2/(-bx_1 - ay_1) \\ 1 & a^2b/(bx_1 - ay_1) & ab^2/(bx_1 - ay_1) \end{vmatrix}$$

$= \frac{1}{2}[a^3b^3/(b^2x_1^2 - a^2y_1^2) + a^3b^3/(b^2x_1^2 - a^2y_1^2)]$
$= a^3b^3/(b^2x_1^2 - a^2y_1^2) = ab/[x_1^2/a^2 - y_1^2/b^2] = ab$.
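The constancy of the area can be checked numerically. The Python sketch below is illustrative only; it assumes the parametrization $x_1 = a\cosh t$, $y_1 = b\sinh t$ of the right-hand branch, which is not used in the text, and a hypothetical function name.

```python
import math


def tangent_triangle_area(a, b, x1, y1):
    """Area of the triangle cut off from the asymptotes by the tangent at (x1, y1)."""
    # Intersections of the tangent x*x1/a^2 - y*y1/b^2 = 1 with y = +(b/a)x and y = -(b/a)x
    x2 = a * a * b / (b * x1 - a * y1)
    y2 = (b / a) * x2
    x3 = a * a * b / (b * x1 + a * y1)
    y3 = -(b / a) * x3
    # Half the absolute value of the determinant with the centre M at the origin
    return 0.5 * abs(x3 * y2 - x2 * y3)


if __name__ == "__main__":
    a, b = 5.0, 2.0  # illustrative semi-axes (assumption)
    for t in (-1.0, 0.0, 0.5, 2.0):
        x1, y1 = a * math.cosh(t), b * math.sinh(t)
        print(t, tangent_triangle_area(a, b, x1, y1), a * b)
```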


13.5-15 Triangles formed by tangents to a hyperbola and the asymptotes

Normal and Polar of a conic

Gradients of normals. Equations of normals. A normal is the line perpendicular to the tangent at its
point of contact $P_1(x_1, y_1)$. One can therefore use the table of gradients of tangents to the
conic to obtain the gradient of a normal and so, by means of the point-direction form, obtain the
equation of a normal. In the most important cases one has:



Example: The normal to the hyperbola $x^2/16 - y^2/9 = 1$ at the point $P_1(5, -9/4)$ has the equation
$y + 9/4 = -\frac{16(-9/4)}{9 \cdot 5}(x - 5)$ or $y = (4/5)x - 25/4$ (Fig.).

13.5-16 Normal to a hyperbola


Important theorems on normals to a conic. 


The line joining a point $P_1$ of a parabola to the focus and the line through $P_1$ parallel to the axis
make the same angle with the normal at $P_1$, since they make the same angle with the tangent at $P_1$.

The normal at a point $P_1$ of an ellipse bisects the angle between the lines joining $P_1$ to the foci,
since these lines make the same angle with the tangent at $P_1$.

The tangent and normal to an ellipse at $P_1$ and the lines joining $P_1$ to the foci form a harmonic
pencil, since in any triangle $F_1F_2P_1$ the internal and external bisectors of one angle (at $P_1$) are
harmonically conjugate to the two sides that form the angle.

The tangent and normal to a hyperbola at $P_1$ and the lines joining $P_1$ to the foci form a harmonic
pencil, since the tangent and normal at $P_1$ are the internal and external bisectors of the angle at $P_1$
of the triangle $F_1F_2P_1$. The normal bisects the angle supplementary to the angle between the lines
joining $P_1$ to the foci.

The equation of the polar. Just as for a circle, one can
find the tangents to a conic from a point $P_0(x_0, y_0)$
outside by means of the polar $p_0$ of the pole $P_0$. The
polar $p_0$ is defined as the line joining the points of contact
$P_1(x_1, y_1)$ and $P_2(x_2, y_2)$ of the tangents $t_1$ and $t_2$ (Fig.).
From the equations of the two tangents, for example, of
an ellipse in the central position:
$(t_1)$ $x_1x/a^2 + y_1y/b^2 = 1$ and $(t_2)$ $x_2x/a^2 + y_2y/b^2 = 1$,
both of which are satisfied by the coordinates of $P_0$,
one obtains the equation $(p_0)$ $xx_0/a^2 + yy_0/b^2 = 1$,
which is the equation of the polar, because it is a line
equation that is satisfied by the coordinates of the points
$P_1(x_1, y_1)$ and $P_2(x_2, y_2)$.

13.5-17 Tangents from a point $P_0$ outside an ellipse; polar $p_0$

The equation is formally the same as that of a tangent,
but the constants $x_0$, $y_0$ are the coordinates of the pole,
not those of the point of contact. If the equations of
the polars of other conics are derived in the same way,
the following table is obtained.


A line $g$ that has no point in common with a conic, for example, a hyperbola, can also be interpreted
as the polar of a point $Q$ in the interior of the conic. If two distinct points $Q_1(\xi_1, \eta_1)$ and
$Q_2(\xi_2, \eta_2)$ of $g$ are taken as poles, then their polars $q_1$ and $q_2$ have the equations $\xi_1x/a^2 - \eta_1y/b^2 = 1$
and $\xi_2x/a^2 - \eta_2y/b^2 = 1$. From these equations one can calculate the point of intersection $Q(x_0, y_0)$
of the polars $q_1$ and $q_2$. This gives $xx_0/a^2 - yy_0/b^2 = 1$, which is the equation of a line that passes
through $Q_1$ and $Q_2$ and is therefore the polar $g$ of $Q$.




Example: The equations $(x - 2)^2/16 - (y + 3)^2/100 = 1$ and … determine a hyperbola and a line $g$ (Fig.).
The polar $q_1$ of a point $Q_1(\xi_1, \eta_1)$ of $g$ has the equation $(x - 2)(\xi_1 - 2)/16 - (y + 3)(\eta_1 + 3)/100 = 1$,
and since … the polar takes the form $y = …$. Similarly, the polar $q_2$ of $Q_2(\xi_2, \eta_2)$ has the
equation

$100(x - 2)(\xi_2 - 2) - 16(y + 3)(\eta_2 + 3) = 1600$

or $y = …$.

The coordinates of the point of intersection of $q_1$ and $q_2$ are given by …; one finds $x_0 = …$, $y_0 = …$, the coordinates
of the pole $Q$ of the line $g$. In the triangle $QQ_1Q_2$, each vertex is the pole
of the opposite side.

13.5-18 The hyperbola given by $(x - 2)^2/16 - (y + 3)^2/100 = 1$


Tangents from a point to a conic. If the tangents from a point $P$ outside a conic are to be drawn,
it is convenient to take the points of intersection of the polar $p$ of $P$ with the conic. These points
$B_1$ and $B_2$ are the points of contact of the tangents.


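Example: From the point $P(14, 1)$ the tangents to the ellipse $x^2/100 + y^2/25 = 1$ are to be drawn.
The polar of $P$ has the equation $14x/100 + y/25 = 1$ or $y = -(7/2)x + 25$. Substituting in the
equation of the ellipse, $x^2 + 4y^2 = 100$, one obtains

$x^2 + 4(49x^2/4 - 175x + 625) = 100$, $x^2 - 14x = -48$, $x_1 = 6$, $x_2 = 8$; $y_1 = 4$, $y_2 = -3$.

The polar therefore cuts the ellipse at $B_1(8, -3)$ and $B_2(6, 4)$.
Hence the equations of the tangents
are $y = (2/3)x - 25/3$ and
$y = -(3/8)x + 25/4$.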
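The polar method can be sketched in Python. This is an illustration under stated assumptions: the ellipse $x^2 + 4y^2 = 100$ ($a = 10$, $b = 5$) and the pole $P(14, 1)$ are the values consistent with the two tangents quoted above; the function name is hypothetical, and the sketch assumes $y_0 \neq 0$ and that neither tangent is vertical.

```python
import math


def tangents_from_point(a, b, x0, y0):
    """Tangents from (x0, y0) to x^2/a^2 + y^2/b^2 = 1 via the polar.

    Returns a list of ((x1, y1), gradient, intercept) for the two tangents.
    """
    if y0 == 0:
        raise ValueError("this sketch assumes y0 != 0")
    # Polar x*x0/a^2 + y*y0/b^2 = 1 written as y = m*x + n
    m = -(b * b * x0) / (a * a * y0)
    n = b * b / y0
    # Substitute the polar into the ellipse: qa*x^2 + qb*x + qc = 0
    qa = 1.0 / (a * a) + m * m / (b * b)
    qb = 2.0 * m * n / (b * b)
    qc = n * n / (b * b) - 1.0
    disc = qb * qb - 4.0 * qa * qc
    if disc <= 0:
        raise ValueError("the point is not outside the ellipse")
    result = []
    for sign in (-1.0, 1.0):
        x1 = (-qb + sign * math.sqrt(disc)) / (2.0 * qa)
        y1 = m * x1 + n
        if y1 == 0:
            raise ValueError("vertical tangent is not handled in this sketch")
        # Tangent x*x1/a^2 + y*y1/b^2 = 1 in the form y = gradient*x + intercept
        gradient = -(b * b * x1) / (a * a * y1)
        intercept = b * b / y1
        result.append(((x1, y1), gradient, intercept))
    return result


if __name__ == "__main__":
    for contact, gradient, intercept in tangents_from_point(10.0, 5.0, 14.0, 1.0):
        print(contact, gradient, intercept)
```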


13.5-19 Tangents from a point P to an ellipse 


Two conics 


The points of intersection of two conics. To determine the points of intersection of two conics the 
corresponding system of equations must be solved. The real solutions give the coordinates of the 
points of intersection. 

Example 1: The parabola $y^2 = 12x$ and the circle $(x + 3)^2 + y^2 = 72$ intersect in the points
$S_1(3, 6)$ and $S_2(3, -6)$, since their coordinates satisfy both equations. They are obtained from
the system of equations

$(x_0 + 3)^2 + y_0^2 = 72$,
$y_0^2 = 12x_0$,
which has the solutions $x_1 = x_2 = 3$, $y_1 = 6$, $y_2 = -6$.




Example 2: To determine the points of intersection of the ellipse
$(x + 6)^2/80 + (y - 2)^2/20 = 1$ with the parabola $(x + 6)^2 = 4(y - 2)$ one has
to solve the system of equations (Fig.)

$(x_0 + 6)^2/80 + (y_0 - 2)^2/20 = 1$
$(x_0 + 6)^2 = 4(y_0 - 2)$

Under the transformation $x_0 + 6 = \xi$, $y_0 - 2 = \eta$ the equations go into
$20\xi^2 + 80\eta^2 = 1600$ and $\xi^2 = 4\eta$. By eliminating $\xi$ one obtains
$80\eta + 80\eta^2 = 1600$, $\eta^2 + \eta + 1/4 = 81/4$, $\eta_{1,2} = -1/2 \pm 9/2$,
$\eta_1 = 4$, $\eta_2 = -5$ and $\xi_{1,2} = \pm 4$, $\xi_{3,4} = \pm 2\sqrt{5}\,i$. Hence
$x_1 = \xi_1 - 6 = -2$, $x_2 = \xi_2 - 6 = -10$, $y_1 = \eta_1 + 2 = 6$, $y_2 = \eta_1 + 2 = 6$. The
conics therefore have two points of intersection $S_1(-2, 6)$ and $S_2(-10, 6)$.

13.5-20 Point of intersection and angle of intersection of an ellipse and a parabola
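The elimination in Example 2 can be followed in Python. The sketch below is illustrative and covers only the configuration of the example, an ellipse $\xi^2/A + \eta^2/B = 1$ and a parabola $\xi^2 = k\eta$ with common shift $(c, d)$; the function name and parameter names are assumptions made here.

```python
import math


def ellipse_parabola_intersections(A, B, k, c=0.0, d=0.0):
    """Real intersections of (x-c)^2/A + (y-d)^2/B = 1 and (x-c)^2 = k*(y-d).

    With xi = x - c, eta = y - d, eliminating xi^2 = k*eta gives
    eta^2/B + (k/A)*eta - 1 = 0; only roots with k*eta >= 0 give real xi.
    """
    qa, qb, qc = 1.0 / B, k / A, -1.0
    disc = qb * qb - 4.0 * qa * qc  # positive, since qa > 0 and qc < 0
    points = []
    for sign in (1.0, -1.0):
        eta = (-qb + sign * math.sqrt(disc)) / (2.0 * qa)
        xi_squared = k * eta
        if xi_squared < 0:
            continue  # complex xi: no real point of intersection
        xi = math.sqrt(xi_squared)
        for value in ((xi, -xi) if xi > 0 else (0.0,)):
            points.append((value + c, eta + d))
    return points


if __name__ == "__main__":
    print(ellipse_parabola_intersections(80.0, 20.0, 4.0, c=-6.0, d=2.0))
```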


Two conics need not have any (real) points of intersection. They can also touch at one or two 
points. It can be proved that two non-degenerate conics intersect in at most four points. If one section 
of a (double) cone consists of two generators intersecting at the vertex and another section consists 
of one generator, then, for a suitable position of the sections, the two degenerate conics can have 
infinitely many points in common. 


The angle of intersection of two conics. The angle of intersection of two conics is defined as the 
angle between their tangents at the point of intersection. One must therefore find the equations of 
the tangents and calculate the angle of intersection. 


Example 1: The parabola $y^2 = 12x$ and the circle $(x + 3)^2 + y^2 = 72$ intersect at the points
$S_1(3, 6)$ and $S_2(3, -6)$. The tangent to the circle at $S_1$ has the equation $y = -x + 9$ and the
tangent to the parabola is $y = x + 3$. Since the gradients are negative reciprocals of one another,
the parabola and the circle cut at right angles at $S_1$, and similarly at $S_2$.
Example 2: The ellipse $(x + 6)^2/80 + (y - 2)^2/20 = 1$ and the parabola $(x + 6)^2 = 4(y - 2)$
intersect at the points $S_1(-2, 6)$ and $S_2(-10, 6)$. The equation of the tangent to the ellipse at $S_1$ is
$(x + 6)(x_1 + 6)/80 + (y - 2)(y_1 - 2)/20 = 1$ or $y = -(1/4)x + 11/2$, and the equation of
the tangent to the parabola is $(x + 6)(x_1 + 6) = 2(y - 2 + y_1 - 2)$ or $y = 2x + 10$. Since
$\tan\alpha_1 = -1/4$, $\alpha_1 = -14.04^\circ$ and $\tan\alpha_2 = +2$, $\alpha_2 = +63.43^\circ$, the angle of intersection is
$\gamma = \alpha_2 - \alpha_1 = 77.47^\circ$.


Common vertex equation of the conics 


The parameter of a conic. The parameter 2p of a parabola y? = 2px in the vertex position is 
defined as the length of the chord of the parabola perpendicular to the axis through the focus; it 
measures the width, so to speak, of the parabola at the focus. This definition can be carried over 
to the other conics. 


The parameter of a conic is defined as the length of the chord perpendicular to the principal axis 
through a focus. 


The parameter of a conic whose principal axis lies along the x-axis can be calculated by working
out twice the positive ordinate $y_F$ at a focus, that is, by substituting the abscissa $x_F$ of the focus
into the equation of the conic and solving this equation for $y_F$:
parabola: $y^2 = 2px$, $x_F = p/2$, so $y_F = p$,
ellipse: $x^2/a^2 + y^2/b^2 = 1$, $x_F = e$,
$e^2/a^2 + y_F^2/b^2 = 1$, so $y_F = (b/a)\sqrt{a^2 - e^2}$,
or, since $a^2 - e^2 = b^2$, $y_F = b^2/a$,
hyperbola: $x^2/a^2 - y^2/b^2 = 1$,
$x_F = e$, $e^2/a^2 - y_F^2/b^2 = 1$, so $y_F = (b/a)\sqrt{e^2 - a^2}$,
or, since $e^2 - a^2 = b^2$, $y_F = b^2/a$.
Vertex equations of the conics. The inner relationship between the conics is clear from their
vertex equations. For the parabola this is $y^2 = 2px$, and for the ellipse and hyperbola it can be
obtained from the central equation by a parallel displacement of the coordinate system.




Ellipse: From the central equation $\xi^2/a^2 + \eta^2/b^2 = 1$ in the $\xi, \eta$-system one obtains by transforming
the origin to the vertex $V_2(-a, 0)$, that is, by the transformation $x = \xi + a$, $y = \eta$,
the equation $(x - a)^2/a^2 + y^2/b^2 = 1$ in the new system; this equation can be rearranged
into $y^2 = 2b^2x/a - b^2x^2/a^2$, or, by using the semiparameter $p = b^2/a$ of the ellipse, into
$y^2 = 2px - (p/a)x^2$ (Fig.). The relation to the vertex equation of the parabola is obvious: from
the term $2px$ for the parabola the term $(p/a)x^2$ is subtracted to obtain the ellipse. This explains the
name ellipse: it refers to a deficiency (Greek: elleipsis) compared with the parabola.


13.5-21 Transformation of an ellipse into the vertex position

13.5-22 Transformation of a hyperbola into the vertex position


Hyperbola: From the central equation $\xi^2/a^2 - \eta^2/b^2 = 1$ in the $\xi, \eta$-system one obtains, by
transforming the origin to the vertex $V_1(a, 0)$, that is, by the transformation $x = \xi - a$, $y = \eta$,
the equation $(x + a)^2/a^2 - y^2/b^2 = 1$ in the new system; this equation can be rearranged
into $y^2 = 2b^2x/a + b^2x^2/a^2$, or, by using the semiparameter $p = b^2/a$ of the hyperbola, into
$y^2 = 2px + (p/a)x^2$ (Fig.). Compared with the parabola $y^2 = 2px$, there is a term $(p/a)x^2$ in
excess of the term $2px$. This explains the name hyperbola (Greek: hyperbole, the excess).


Common vertex equation of the conics. By introducing the numerical eccentricity $\varepsilon = e/a$ for the
ellipse ($0 < \varepsilon < 1$) and the hyperbola ($\varepsilon > 1$) and $\varepsilon = 1$ for the parabola, all three conics can be
given a common vertex equation. For the ellipse, $p/a = b^2/a^2 = (a^2 - e^2)/a^2 = 1 - \varepsilon^2 > 0$ since
$0 < \varepsilon < 1$; for the hyperbola, on the other hand, $p/a = b^2/a^2 = (e^2 - a^2)/a^2 = \varepsilon^2 - 1$, and so
$1 - \varepsilon^2$ is always negative. For $\varepsilon = 1$ the term $(1 - \varepsilon^2)x^2$ obviously has the value zero; the equation
$y^2 = 2px - (1 - \varepsilon^2)x^2$ therefore describes each of the three conics, depending on the value of $\varepsilon$.




The vertex equation of the circle is also included in this equation. If one puts $p = r$ and $\varepsilon = 0$,
one obtains $y^2 = 2rx - x^2$ or $y^2 = x(2r - x)$;
this relation is satisfied, by virtue of the altitude
theorem for right-angled triangles.

In the common vertex equation a conic is determined by the parameter $2p$ and the numerical
eccentricity $\varepsilon$. The quantities used up to now to
characterize a conic, the semiaxes $a$ and $b$ and the
linear eccentricity $e$, can be expressed in terms of $p$
and $\varepsilon$ if one considers that $y_P = 0$ gives $x_P = 2a$
and that $p = b^2/a$ for the ellipse and hyperbola
and $p = r$ for the circle. One finds for the ellipse
$a = p/(1 - \varepsilon^2)$, $b = p/\sqrt{1 - \varepsilon^2}$, $e = p\varepsilon/(1 - \varepsilon^2)$
and for the hyperbola $a = p/(\varepsilon^2 - 1)$, $b = p/\sqrt{\varepsilon^2 - 1}$, $e = p\varepsilon/(\varepsilon^2 - 1)$; if one chooses $p = 1$,
then, for $\varepsilon = 0.8$, for example, the rounded-off
values are $a = 2.78$, $b = 1.67$, $e = 2.22$, while for
$\varepsilon = 1.5$, $a = 0.8$, $b = 0.89$, $e = 1.2$ (Fig.).

13.5-23 Dependence of a conic on the numerical eccentricity
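The conversion from $p$ and $\varepsilon$ to $a$, $b$ and $e$ can be sketched in Python. The function name is an assumption made for this illustration; the formulas are those stated above, combined for ellipse and hyperbola by using $|1 - \varepsilon^2|$.

```python
import math


def semi_axes_from_p_eps(p, eps):
    """Return (a, b, e) of an ellipse (eps < 1) or hyperbola (eps > 1).

    a = p/|1 - eps^2|, b = p/sqrt(|1 - eps^2|), e = p*eps/|1 - eps^2|.
    For eps = 0 this gives the circle a = b = p, e = 0.
    """
    if eps < 0:
        raise ValueError("the numerical eccentricity must not be negative")
    if eps == 1:
        raise ValueError("a parabola has no finite semi-axes")
    k = abs(1.0 - eps * eps)
    return p / k, p / math.sqrt(k), p * eps / k


if __name__ == "__main__":
    for eps in (0.0, 0.8, 1.5):
        a, b, e = semi_axes_from_p_eps(1.0, eps)
        print(f"eps = {eps}: a = {a:.2f}, b = {b:.2f}, e = {e:.2f}")
```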




Polar equations of the conics 


To describe the conic in polar coordinates it is natural to take its axis as the zero direction; for 
the ellipse and hyperbola one could choose the centre as pole, but it is more usual to take a focus 
as pole. 


Polar equations of conics referred to the centre as pole. The central equation of the ellipse $x^2/a^2
+ y^2/b^2 = 1$ is transformed to polar coordinates by putting $x = r\cos\varphi$, $y = r\sin\varphi$, where the
pole is the centre of the ellipse, so the polar equation is:

$(r^2/a^2)\cos^2\varphi + (r^2/b^2)\sin^2\varphi = 1$
or $1 = (r^2/b^2)(b^2\cos^2\varphi + a^2\sin^2\varphi)/a^2 = (r^2/b^2)(b^2\cos^2\varphi + a^2 - a^2\cos^2\varphi)/a^2$
$= (r^2/b^2)[a^2 - (a^2 - b^2)\cos^2\varphi]/a^2 = (r^2/b^2)[1 - (e^2/a^2)\cos^2\varphi]$
$= (r^2/b^2)(1 - \varepsilon^2\cos^2\varphi)$,
that is, $r^2 = b^2/(1 - \varepsilon^2\cos^2\varphi)$.

The polar equation of the hyperbola can be obtained similarly.


Polar equations of the conics referred to a focus as pole. These conic equations find many applications
in astronomy, particularly because of Kepler's first law, which states that the planets move in ellipses
having the sun as one focus. One naturally uses as coordinates of planetary motion the distance
from the sun and the angle in the orbit, and so one uses a polar coordinate system whose pole is
one focus of the ellipse. At the same time, the numerical eccentricity $\varepsilon$ is used in astronomy as a
measure of the deviation of the elliptic orbit from a circular path. The word eccentricity is a happy
choice: in a circle, the centre coincides with the centre of gravitation; the longer the ellipse is stretched,
the further is the centre from the centre of gravitation, and so the more eccentric is the path. KEPLER
discovered the fact that the planets actually move in ellipses, not circles, by considering Mars which,
of the planets then known, has the greatest eccentricity apart from Mercury, $\varepsilon = 0.0933$. The eccentricity of the orbit
of the Earth is only $\varepsilon = 0.0168$. Meteors, comets and artificial satellites, if they have periodic
motion inside the solar system, also move in elliptical orbits. If they are not periodic, that is, their kinetic
energy is sufficient to take them outside the solar system, then they move in parabolas or hyperbolas,
provided that one neglects the disturbance caused by the force of attraction of the planets.


Polar equation of the ellipse. In Fig. 13.5-8 the focus $F_1$ is taken as the pole of a polar coordinate
system, whose zero direction is that of the x-axis from $F_1$ to $V_1$. In the triangle $F_1PF_2$,
since $r_2 = 2a - r_1$ and $|F_2F_1| = 2e$, the cosine law gives:

$(2a - r_1)^2 = (2e)^2 + r_1^2 + 2 \cdot 2er_1\cos\varphi$ or $4a^2 - 4ar_1 + r_1^2 = 4e^2 + r_1^2 + 4er_1\cos\varphi$,
$r_1 = (a^2 - e^2)/(a + e\cos\varphi) = a(1 - \varepsilon^2)/(1 + \varepsilon\cos\varphi) = b^2/[a(1 + \varepsilon\cos\varphi)] = p/(1 + \varepsilon\cos\varphi)$,
on putting $\varepsilon = e/a$ and $b^2/a = p$.

Polar equation of the hyperbola. In Fig. 13.5-13 the focus $F_1$ is taken as the pole of a polar
coordinate system, whose zero direction is that of the negative x-axis from $F_1$ to $V_1$. In the triangle $F_1PF_2$,
since $r_2 = 2a + r_1$ and $|F_1F_2| = 2e$, the cosine law gives:

$(2a + r_1)^2 = (2e)^2 + r_1^2 - 2 \cdot 2er_1\cos\varphi$ or $4a^2 + 4ar_1 + r_1^2 = 4e^2 + r_1^2 - 4er_1\cos\varphi$,
$r_1 = (e^2 - a^2)/(a + e\cos\varphi) = b^2/[a(1 + \varepsilon\cos\varphi)] = p/(1 + \varepsilon\cos\varphi)$.
Polar equation of the parabola. In Fig. 13.5-4 the focus $F$ is taken as the pole of a polar coordinate
system whose zero direction is that of the negative x-axis from $F$ to $V$. Since $|L_0F| = p$, the definition
of the parabola gives

$p - r\cos\varphi = r$ or $r = p/(1 + \cos\varphi)$.


All the conics therefore have equations of the same form r = p/(1 + ε cos φ) in a polar coordinate
system whose zero direction goes from the pole to the nearest vertex; they differ in the values of the
numerical eccentricity, which for an ellipse is positive but less than 1, for a hyperbola is greater
than 1 and for a parabola is equal to 1. Also, the circle can be included by taking ε = 0, so that
the radius vector has the constant value r = p.




For the parabola (ε = 1), r is not defined when φ = π. If ε = 0 or 0 < ε < 1, that is, for the circle
or the ellipse, to any value of the angle there corresponds a unique value of r. Finally, if ε > 1,
r is not defined for any value φ₀ for which 1 + (e/a) cos φ₀ = 0, that is, cos φ₀ = −a/e, when the
free side of the angle φ₀, or −φ₀, is parallel to an asymptote.


Example: The perihelion is defined as the point of a planetary orbit nearest to the sun, and the
aphelion as the furthest point. What is the distance of the aphelion of Mars from the sun? From
astronomical observations it is known that the major semi-axis a of the orbit of Mars is, in round
figures, 1.52 radii of the Earth’s orbit (1 radius of the Earth’s orbit is about 92.6 million miles)
and its eccentricity is ε = 0.0933. At the aphelion φ = π. Since p = b²/a = a(b²/a²) = a(a² − e²)/a²
= a(1 − ε²), r = a(1 − ε²)/(1 − ε) = a(1 + ε) = 1.52 × 1.0933 = 1.66, measured in radii of
the Earth’s orbit. This means that the distance of Mars from the sun at aphelion is about
154 · 10⁶ miles.
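
The arithmetic of this example can be checked numerically. The following is a minimal Python sketch and not part of the original text; the function name radius and the constant names are chosen here purely for illustration, and the orbital data are the rounded values quoted in the example.

```python
import math

def radius(a, eps, phi):
    """Distance r from the focus for r = p / (1 + eps*cos(phi)), with p = a*(1 - eps**2)."""
    p = a * (1.0 - eps ** 2)
    return p / (1.0 + eps * math.cos(phi))

# Rounded data from the example (illustrative constant names).
A_MARS = 1.52                 # major semi-axis, in radii of the Earth's orbit
EPS_MARS = 0.0933             # numerical eccentricity of the orbit of Mars
EARTH_ORBIT_MILES = 92.6e6    # one radius of the Earth's orbit, in miles

perihelion = radius(A_MARS, EPS_MARS, 0.0)       # phi = 0: nearest point
aphelion = radius(A_MARS, EPS_MARS, math.pi)     # phi = pi: furthest point
aphelion_miles = aphelion * EARTH_ORBIT_MILES

print(round(perihelion, 2), round(aphelion, 2), round(aphelion_miles / 1e6))
```

With these inputs the aphelion comes out as a(1 + ε) and the perihelion as a(1 − ε), in agreement with the closed forms used in the example.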
The eccentric anomaly. In astronomy and in the calculation of the elliptic paths of artificial satellites,
the eccentric anomaly E, introduced by KEPLER, is used. This is the angle E measured from
the zero direction to CP″, where C is the centre of the ellipse and P″ is the point of the auxiliary
circle that corresponds to a point P of the ellipse (Fig.).
In plane geometry the construction of an ellipse is carried out from the auxiliary circle with
radius a (half the major axis) and the concentric circle with radius b = √(a² − e²) (half the minor
axis). As was shown in the parametric representation of the ellipse, all its chords perpendicular to
the major axis V₁V₂ are in the ratio b : a to the corresponding chords of the auxiliary circle. If P is
a point of the ellipse and P′ is the foot of the perpendicular from P to the major axis, the segments
|P′P| and |P′P″| are half these chords, so |P′P| : |P′P″| = b : a. In the right-angled triangles shown in
the figure, |P′P| = r sin φ, |P′P″| = a sin E, and so ba sin E = ar sin φ or r sin φ = b sin E. On the
major axis, because |CF₁| = e and |CP′| = a cos E, one obtains r cos φ = a cos E − e. By using the
equations 1 = sin² φ + cos² φ and e² = a² − b², r can be expressed as a function of E:

r² = b² sin² E + (a² cos² E − 2ae cos E + e²)
   = b² sin² E + a² cos² E − 2ae cos E + a² − b²
   = e² cos² E − 2ae cos E + a² = (a − e cos E)²,
and since a > e and r > 0, r = a − e cos E.

[Fig. 13.5-24 Eccentric anomaly]


This equation contains Kepler’s first law, according to which the planets move round the sun in 
elliptic orbits with the sun as one focus. 
The relation between the anomaly φ and the eccentric anomaly E is given by the two equations
cos φ = (1/r) (a cos E − e) = (a cos E − e)/(a − e cos E),
sin φ = (1/r) b sin E = √(a² − e²) sin E/(a − e cos E),
which can also be expressed in the form tan (φ/2) = √[(a + e)/(a − e)] tan (E/2). To obtain the
time t as a function of E, one of these equations, the second, for example, is differentiated with
respect to t, where, as usual, differentiation with respect to t is denoted by a dot:

cos φ · φ̇ = [b cos E · Ė (a − e cos E) − e sin E · Ė b sin E]/(a − e cos E)²
          = b Ė (a cos E − e cos² E − e sin² E)/(a − e cos E)² = b Ė (a cos E − e)/(a − e cos E)².

It follows that
φ̇ = dφ/dt = b Ė · [(a cos E − e)/(a − e cos E)²] · [(a − e cos E)/(a cos E − e)] = b Ė/(a − e cos E) = b Ė/r.


By Kepler’s second law, the area covered by the radius vector in a given time is constant: r² φ̇ = C.
Introducing C into the last relation gives

Ė = dE/dt = r φ̇/b = r² φ̇/(br) = C/(br) = C/[b(a − e cos E)],

and so the required function t = t(E) is obtained by integration:

t = (b/C) ∫ (a − e cos E) dE, integrated from 0 to E,

t = (b/C) (Ea − e sin E) = [√(a² − e²)/C] (Ea − e sin E).




As E increases from 0 to 2π, the orbital time T is obtained:

T = (b/C) · 2πa = 2πa √(a² − e²)/C.

By Kepler’s third law, for each planet there exists a constant μ/(4π²) for which a³/T² = μ/(4π²),
or, by substituting the above value for T, μ = aC²/(a² − e²). Therefore three of the four constants
are sufficient for all the relations; it is usual to choose ε = e/a, C and μ and to derive from
r = p/(1 + ε cos φ) = b²/[a(1 + ε cos φ)]:

r = C²/[μ(1 + ε cos φ)] = C²(1 − ε cos E)/[μ(1 − ε²)],
cos φ = (−ε + cos E)/(1 − ε cos E), sin φ = √(1 − ε²) sin E/(1 − ε cos E),
t = C³(E − ε sin E)/[μ²(1 − ε²)^(3/2)].
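
The relations between r, φ, E and t derived above can be cross-checked numerically. The following Python sketch is an illustration only: the function names and the sample values a = 5, e = 3, C = 2 are assumptions made here, not data from the text. It evaluates r = a − e cos E, recovers φ from the two equations for cos φ and sin φ, and compares the result with the polar equation, the half-angle formula and the orbital time T.

```python
import math

def radius_from_E(a, e, E):
    """r = a - e*cos(E)."""
    return a - e * math.cos(E)

def anomaly_from_E(a, e, E):
    """Angle phi obtained from cos(phi) and sin(phi) expressed through E."""
    b = math.sqrt(a * a - e * e)
    r = radius_from_E(a, e, E)
    return math.atan2(b * math.sin(E) / r, (a * math.cos(E) - e) / r)

def time_from_E(a, e, C, E):
    """t = (b/C) * (E*a - e*sin(E)), with t = 0 at E = 0."""
    b = math.sqrt(a * a - e * e)
    return (b / C) * (E * a - e * math.sin(E))

# Illustrative sample values (assumed here): linear eccentricity e < a, so b = 4.
a, e, C = 5.0, 3.0, 2.0
E = 1.0

phi = anomaly_from_E(a, e, E)
r_from_E = radius_from_E(a, e, E)
r_polar = (a * a - e * e) / (a + e * math.cos(phi))       # polar equation of the ellipse
half_angle_lhs = math.tan(phi / 2)
half_angle_rhs = math.sqrt((a + e) / (a - e)) * math.tan(E / 2)
period = time_from_E(a, e, C, 2 * math.pi)                # orbital time T
period_closed_form = 2 * math.pi * a * math.sqrt(a * a - e * e) / C
```

Both expressions for r, both sides of the half-angle relation, and both expressions for T agree to rounding error for these sample values.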


Discussion of the general equation of the second degree 


The general equation of the second degree in two variables x and y has the form 
ax? + 2bxy + cy? + 2dx + 2ey + f=0, 
where a, b, c, d, e, f are arbitrary real coefficients. It is actually of the second degree only when a, b, c
are not all zero. This equation defines a curve in the x, y-coordinate system. Henceforth a rec- 
tangular coordinate system will always be assumed. The type of curve depends on the values of the 
coefficients. The discussion, by which one means the characterization of the curve depending on 
the coefficients, shows the validity of the following theorem. 


The general equation of the second degree always represents a conic. 


Elimination of the mixed term. By a rotation of the coordinate system, that is, by a transformation
x = ξ cos α − η sin α, y = ξ sin α + η cos α
with a suitable angle α, one can always arrange that the mixed term with ξη vanishes. If the coefficients
a and c of the squared terms are equal (a = c), one chooses α = 45°; if they are different,
then α is chosen so that tan 2α = 2b/(a − c), as one can see by substituting the equations of transformation
in the original equation. In this way a transformed equation in ξ and η is obtained. It
is convenient to write these variables as x and y again; the equation is then of the form
Ax² + Cy² + 2Dx + 2Ey + F = 0.
This means that the axes of the conic are now parallel to the coordinate axes.
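
The choice of the angle α can be illustrated with a short Python sketch. It is an illustration under stated assumptions: the function names and the sample form 5x² + 4xy + 2y² are chosen here and do not occur in the text. The second function substitutes the rotation into ax² + 2bxy + cy² and returns the new coefficients, so that one can see the mixed coefficient vanish.

```python
import math

def rotation_angle(a, b, c):
    """Angle alpha (radians) removing the mixed term of a*x^2 + 2*b*x*y + c*y^2."""
    if a == c:
        return math.pi / 4                      # the choice alpha = 45 degrees
    return 0.5 * math.atan(2.0 * b / (a - c))   # tan(2*alpha) = 2b/(a - c)

def rotate_quadratic_part(a, b, c, alpha):
    """Coefficients (A, B, C) of A*xi^2 + 2*B*xi*eta + C*eta^2 after the rotation
    x = xi*cos(alpha) - eta*sin(alpha), y = xi*sin(alpha) + eta*cos(alpha)."""
    cs, sn = math.cos(alpha), math.sin(alpha)
    A = a * cs * cs + 2 * b * cs * sn + c * sn * sn
    B = b * (cs * cs - sn * sn) - (a - c) * cs * sn
    C = a * sn * sn - 2 * b * cs * sn + c * cs * cs
    return A, B, C

# Illustrative sample: 5x^2 + 4xy + 2y^2, that is a = 5, b = 2, c = 2.
alpha = rotation_angle(5.0, 2.0, 2.0)
A, B, C = rotate_quadratic_part(5.0, 2.0, 2.0, alpha)

# Sample with equal squared coefficients: 2xy, that is a = c = 0, b = 1.
A2, B2, C2 = rotate_quadratic_part(0.0, 1.0, 0.0, rotation_angle(0.0, 1.0, 0.0))
```

For the first sample the rotated form is 6ξ² + η² up to rounding; for the second it is ξ² − η², an equilateral hyperbola once a non-zero constant term is present.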


Elimination of the linear terms. The central equations of the ellipse and hyperbola have no linear
terms. One therefore looks for a parallel displacement x = ξ + c, y = η + d of the coordinate
system, where c and d are constants such that the linear terms vanish. By carrying out the transformation
one obtains:

Aξ² + Cη² + 2(Ac + D)ξ + 2(Cd + E)η + Ac² + Cd² + 2Dc + 2Ed + F = 0.
Discussion:
(1) If A ≠ 0 and C ≠ 0, both linear terms can be removed by choosing c = −D/A and
d = −E/C. The equation then takes the form Aξ² + Cη² = N, where N = D²/A + E²/C − F.
Three cases are possible for N: N > 0, N = 0, N < 0.

N > 0: Case 1: A and C both positive. The curve is an ellipse with the central equation
ξ²/(N/A) + η²/(N/C) = 1, and with the semi-axes √(N/A) and √(N/C).
Case 2: A and C both negative. There is no real curve.
Case 3: A and C of opposite signs. The curve is a hyperbola.

N = 0: Case 1: If A and C have the same sign, the equation is satisfied only for ξ = η = 0.
The curve is a single point.
Case 2: A and C of opposite signs. The left-hand side of the equation then factorizes,
and the curve is a pair of intersecting lines.

N < 0: The same conics are obtained as for N > 0 (except that Cases 1 and 2 are interchanged).

(2) If AC = 0, there are three possibilities.

A = 0, C ≠ 0: Case 1: D ≠ 0. Then c and d can be chosen so that Cd + E = 0, Cd² + 2Dc
+ 2Ed + F = 0. The equation becomes η² = −2(D/C)ξ, and the curve is a
parabola.




Case 2: D = 0. The equation is a quadratic in η, and it therefore represents a
pair of parallel lines. They coincide, and therefore represent a double line, if
E² − FC = 0.

A ≠ 0, C = 0: Case 1: E ≠ 0. The curve is a parabola.
Case 2: E = 0. The curve is a pair of parallel lines or a double line.

A = 0, C = 0: Case 1: D and E not both zero. The curve is a single line.
Case 2: D = E = 0. Then F must also be zero.
Note: The case of a pair of parallel lines can be regarded as a special case of a section of a cone 
by a plane parallel to the axis, where the vertex of the cone is at infinity, and the cone is therefore 
a cylinder. 


Example 1: In the equation 3x² − 30x + 8y + 65 = 0, the values of the coefficients are
A = 3 ≠ 0, C = 0, E = 4 ≠ 0; the curve is therefore a parabola. To find the vertex, focus and parameter,
one divides by 3 and completes the square; from the equation x² − 10x + 25 = −(8/3) y
− 65/3 + 75/3 or (x − 5)² = −(8/3) (y − 5/4) one sees that the parabola is concave downwards,
that the vertex is at V(5, 5/4) and that the parameter is p = 4/3.

Example 2: The equation 25x² + 49y² + 150x − 196y − 804 = 0 describes an ellipse, whose
principal axis is parallel to the x-axis, since A = 25 ≠ 0, C = 49 ≠ 0, N > 0. The equation can
be brought to the central form by twice completing the square: (x + 3)²/49 + (y − 2)²/25 = 1.
The centre of the ellipse is at C(−3, 2). The major semi-axis is a = 7, and the minor semi-axis
is b = 5.

Example 3: The equation 64x² − 25y² + 256x + 300y − 2244 = 0 represents a hyperbola,
as one sees immediately. From the equation it follows that 64(x² + 4x) − 25(y² − 12y) = 2244.
By completing the square twice, one finds that 64(x² + 4x + 4) − 25(y² − 12y + 36) = 2244
+ 256 − 900 or 64(x + 2)² − 25(y − 6)² = 1600. The central equation is therefore
(x + 2)²/25 − (y − 6)²/64 = 1. The major axis is parallel to the x-axis, the centre is at C(−2, 6)
and the semi-axes are 5 and 8.

Example 4: For the conic 9x² − 4y² = 0, AC ≠ 0 but N = 0. Since AC < 0, the conic is a
pair of intersecting lines. In fact, 9x² − 4y² = (3x − 2y) (3x + 2y) = 0. Each factor gives a line,
and the equations of the lines are y = (3/2) x and y = −(3/2) x. The two lines intersect at the origin.
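
The case analysis of the discussion, applied to Examples 1 to 4, can be sketched in Python as follows. This is an illustration, not the text's own procedure: the function name classify and the returned labels are chosen here, exact rational arithmetic is used to avoid rounding, and the branch returning "no real curve" for a quadratic in one variable without real roots is an addition not listed in the discussion.

```python
from fractions import Fraction

def classify(A, C, D, E, F):
    """Classify A*x^2 + C*y^2 + 2*D*x + 2*E*y + F = 0 (integer or rational coefficients)."""
    if A != 0 and C != 0:
        N = Fraction(D) ** 2 / A + Fraction(E) ** 2 / C - F
        if N < 0:
            # Multiplying the equation by -1 reduces this case to N > 0.
            A, C, N = -A, -C, -N
        if N > 0:
            if A > 0 and C > 0:
                return "ellipse"
            if A < 0 and C < 0:
                return "no real curve"
            return "hyperbola"
        return "single point" if A * C > 0 else "pair of intersecting lines"
    if A == 0 and C == 0:
        return "single line" if (D != 0 or E != 0) else "no curve unless F = 0"
    # Exactly one squared term is present: q is its coefficient, lin_same the
    # linear coefficient of the same variable, lin_other that of the other one.
    q, lin_other, lin_same = (C, D, E) if A == 0 else (A, E, D)
    if lin_other != 0:
        return "parabola"
    disc = lin_same * lin_same - F * q
    if disc > 0:
        return "pair of parallel lines"
    if disc == 0:
        return "double line"
    return "no real curve"  # assumption: case not listed in the discussion above

example_1 = classify(3, 0, -15, 4, 65)          # 3x^2 - 30x + 8y + 65 = 0
example_2 = classify(25, 49, 75, -98, -804)     # 25x^2 + 49y^2 + 150x - 196y - 804 = 0
example_3 = classify(64, -25, 128, 150, -2244)  # 64x^2 - 25y^2 + 256x + 300y - 2244 = 0
example_4 = classify(9, -4, 0, 0, 0)            # 9x^2 - 4y^2 = 0
```

Note that the arguments are the coefficients A, C, D, E, F of the normalized form, so the linear coefficients of the given equations are halved before being passed in.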


II. Steps towards higher mathematics 


14. Set theory 


14.1. The concept of a set
14.2. Operations on sets
14.3. Relations
14.4. Mappings
14.5. Infinite sets and cardinal numbers
14.6. Well-ordered sets and ordinal numbers


Set theory is the foundation stone of the edifice of modern mathematics. The precise definitions 
of all mathematical concepts are based on set theory. Furthermore, the methods of mathematical 
deduction are characterized by a combination of logical and set-theoretical arguments. To put it 
briefly, the language of set theory is the common idiom spoken and understood by mathematicians 
the world over. From all this it follows that if one is to make any progress in higher mathematics 
itself or in its practical applications, one has to become familiar with the basic concepts and results 
of set theory and with the language in which they are expressed. 

The definition of a set quoted below gives the impression that the naive set concept is easy to 
grasp because of its apparent perspicuity. In actual fact it leads to great difficulties, which have 
only been overcome by the development of axiomatic systems for set theory. 

When Georg CANTOR (1845-1918), who founded set theory, published his daring new concepts 
and arguments, their importance was recognized by only a few mathematicians. But in its further 
development the theory was to penetrate almost all branches of mathematics, having a profound 
influence on their development, and changing the appearance even of established theories. Indeed, 
the development of some disciplines, such as topology, was essentially dependent on the means of 
set theory. What is more, set theory proved a unifying force, giving all branches of mathematics a 
common basis, and their concepts a new clarity and precision. 

The following sections emphasize those parts of set theory that have particularly important 
applications in the development of the various branches of mathematics. 


14.1. The concept of a set 


In colloquial usage the term ‘set’ is taken, as a rule, to mean a collection of things that in some
sense or another belong together or are akin. This latter aspect is difficult to make precise and is
therefore omitted from the mathematical definition.


In spite of the lack of precision of this definition, which actually leads to contradictions (see
Example 5), it is sufficient to introduce several important definitions and concepts.


A subset T of a set S is any set whose elements all belong to S; this is denoted by T ⊆ S.
The subsets T of S that are distinct from S itself are called proper subsets of S; in this case one
writes T ⊂ S. The empty set is a set that has no elements at all. The introduction of this set has
proved convenient to round off statements and arguments of set theory, just as the number 0
(historically a late invention) rounds off the statements and calculations of arithmetic. The usual
symbol for the empty set is ∅.




Sets whose elements are themselves sets are called families or systems, for instance, a nation is 
a set of people and an element of the ‘family’ of nations. A very important system is the set of all 
subsets of a given set S; this is called the power set of S and is denoted by P(S). 


Example 1: The set S of all people in a certain building B at a certain time t. This set is well-defined
even if nobody is in the building at the chosen time; in that case S is the empty set. The set
W of all women in B at the time t is a subset of S, W ⊆ S; W is not necessarily a proper subset
of S.

Example 2: The set of all prime numbers. This set is infinite, as was already proved by EUCLID,
whereas the sets of Example 1 are always finite.

Example 3: The set of all regular polygons inscribed in the unit circle is used in the calculation
of π.

Example 4: The set of all subsets of the natural numbers. This is also infinite, in fact, as will 
be shown later, ‘more’ infinite than the natural numbers themselves. 

Example 5: The set of all sets that do not contain themselves as elements. This set, which is
perfectly admissible under Cantor’s definition, leads to the celebrated paradox of Bertrand RUSSELL
(1872-1970). If the set is denoted by R and if one supposes that R is an element of itself (R ∈ R),
then R is, like every other element of R, a set that does not contain itself as an element (R ∉ R);
that is, the assumption leads to a contradiction. If, on the other hand, R is not an element of itself
(R ∉ R), then, since R contains every set that does not contain itself, R cannot be one of those
sets. Therefore R contains itself (R ∈ R), which is again a contradiction. Since one of the two
assumptions must be true, the whole situation stands in contradiction to the laws of logic.

Example 5 shows that the construction of new sets must not be extended without bounds, if
contradictions are to be avoided.

The examples make it clear how sets are to be constructed. A set is determined by the description
of a property. To be a little more precise, the set consists of all objects ξ for which a statement
A(x) with the object variable x becomes true if x is replaced by ξ.

In Examples 1 to 4 these statements are (in the same order): 


x is a human being and is in the building B at the time ¢, 
x is a prime number, 

x is a regular polygon inscribed in the unit circle, 

x is a subset of the set of positive natural numbers. 


If x is replaced by an arbitrary object, a statement results that is either true or false. In axiomatic
systems of set theory it is of central importance to delineate precisely what logical form a statement
must have in order to define a set.

The axiomatic systems of set theory developed in the first half of this century all have four basic 
principles in common: the principles of extensionality, of set construction, the existence of infinite 
sets, and the axiom of choice. 

The principle of extensionality says that two sets S and T having the same elements (that is, being 
of the same extent) are identical (S = T). The word identical is taken here in Leibniz’ sense, that 
is, in any statement S can be replaced by T and vice versa, without changing the truth or falsity of 
the statement. 

The principle of construction asserts that certain restricted types of statements do define sets;
a usual restriction is that the statement contains only object symbols, logical symbols, and the
symbol ∈.

The existence of infinite sets states just that. The meaning of infinite must, of course, be made 
precise. This principle is difficult to motivate by a direct reference to reality. But without it major 
parts of mathematics and theoretical science, such as the differential and integral calculus and 
classical mechanics, would become meaningless. One could not even give a set-theoretical foundation 
to the theory of natural numbers. 

Finally, there is the axiom of choice, which is basic for many mathematical arguments. Nevertheless 
many authors regard this axiom with a doubt similar to that which Euclid’s parallel postulate met 
in an earlier era. 




14.2. Operations on sets 


Operations on sets are used to construct new sets from given ones. The most important are
the intersection, union, and difference of sets S and T.


Example 1: {a, b, c} ∩ {a, c, d} = {a, c}, {a, b, c} ∪ {a, c, d} = {a, b, c, d},
{a, b, c} \ {a, c, d} = {b}.
Example 2: The intersection of the set of all rectangles and the set of all rhombi is the set of
all squares.
Example 3: The union of the set of all rectangles and the set of all parallelograms is the set
of all parallelograms, because every rectangle is a parallelogram, and so nothing is added to the
set of all parallelograms.

It is important to distinguish between the union S ∪ T and the set of all elements belonging
either to S or to T (but not to both). This latter set is called the symmetric difference of S and T
and is only used occasionally for special purposes.

Sets S and T whose intersection is empty are called disjoint. If S is any subset of U, then U \ S
is called the complement of S in U.
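
The operations of this section correspond directly to the operators of Python's built-in set type. The following sketch repeats Example 1 and adds the symmetric difference and a complement; the universe U = {a, b, c, d, e} is an assumption made here for illustration.

```python
S = {"a", "b", "c"}
T = {"a", "c", "d"}

intersection = S & T            # elements of S that also belong to T
union = S | T                   # elements belonging to S or to T (or to both)
difference = S - T              # elements of S that do not belong to T
symmetric_difference = S ^ T    # elements belonging to S or to T, but not to both

U = {"a", "b", "c", "d", "e"}   # assumed universe containing S and T
complement_of_S = U - S         # the complement of S in U

S_and_complement_disjoint = S.isdisjoint(complement_of_S)
```

The symmetric difference equals (S \ T) ∪ (T \ S), which for these sets is {b, d}, whereas the union also contains the common elements a and c.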

The basic properties of the operations on sets in the following table can be illustrated by representing
the sets by bounded areas of the plane.


Intersection: S ∩ T =def {x | x ∈ S and x ∈ T}
Union:        S ∪ T =def {x | x ∈ S or x ∈ T}
Difference:   S \ T =def {x | x ∈ S and x ∉ T}


Commutativity:  S ∩ T = T ∩ S;  S ∪ T = T ∪ S
Associativity:  S ∩ (T ∩ R) = (S ∩ T) ∩ R;  S ∪ (T ∪ R) = (S ∪ T) ∪ R
Distributivity: S ∩ (T ∪ R) = (S ∩ T) ∪ (S ∩ R);  S ∪ (T ∩ R) = (S ∪ T) ∩ (S ∪ R)
Idempotence:    S ∩ S = S;  S ∪ S = S


[Fig. 14.2-1 Distributivity of ∩ and ∪]

If S and T are subsets of U and their complements in U are written S′ and T′ for short, then
De Morgan’s rules hold: (S ∩ T)′ = S′ ∪ T′; (S ∪ T)′ = S′ ∩ T′.


As an example, here is a proof of the first statement (Fig.). To show that (S ∩ T)′ = S′ ∪ T′ one
proves the two statements (i) (S ∩ T)′ ⊆ S′ ∪ T′ and (ii) (S ∩ T)′ ⊇ S′ ∪ T′.

To prove (i), let x ∈ (S ∩ T)′, that is, x ∈ U, but x ∉ S ∩ T. Now either x ∈ S or x ∉ S. If the
latter, then x ∈ S′ and therefore x ∈ S′ ∪ T′. If the former, then x ∉ T, for otherwise x would be an
element of S ∩ T. Therefore x ∈ T′ and again x ∈ S′ ∪ T′. This completes the proof of (i). To prove
(ii), let x ∈ S′ ∪ T′, that is, x ∈ S′ or x ∈ T′ (it is, of course, possible that both hold). In the first
case x ∉ S and therefore x ∉ S ∩ T, and in the second case x ∉ T and again x ∉ S ∩ T. In both
cases x ∈ (S ∩ T)′, which proves (ii).

[Fig. 14.2-2 De Morgan’s rule (S ∩ T)′ = S′ ∪ T′; S′ is blue, T′ is red, S ∩ T is left white]
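
De Morgan's rules can also be confirmed by brute force on a small universe. The sketch below is only a finite check and no substitute for the proof; the four-element universe and the helper names subsets and complement are assumptions made here. The list all_subsets is at the same time the power set P(U) of Section 14.1.

```python
from itertools import combinations

def subsets(universe):
    """All subsets of a finite set, as frozensets (its power set)."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

U = frozenset(range(4))          # assumed small universe {0, 1, 2, 3}
all_subsets = subsets(U)

def complement(X):
    """Complement of X in U."""
    return U - X

de_morgan_holds = all(
    complement(S & T) == complement(S) | complement(T)
    and complement(S | T) == complement(S) & complement(T)
    for S in all_subsets
    for T in all_subsets
)
```

A set with 4 elements has 2⁴ = 16 subsets, so the check runs over 256 ordered pairs (S, T).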

Generalized operations on sets. The operations of intersection and union are initially defined as
operations on two arguments. They can, however, be generalized not only to 3, 4, ... sets, but to
arbitrary systems of sets. But first a few explanations.

Systems of sets are denoted in what follows by upper case bold face letters. The members S, T, ...
of a system 𝐒 are sometimes labelled by subscripts or are made to depend on parameters. For
example, a finite system 𝐒 may be written as {S₁, ..., Sₖ} or 𝐒 = {Sᵢ | i = 1, ..., k}. Frequently
the notation 𝐒 = {Sᵢ}_{i∈I} is used. Thus, if Aₙ = {x ∈ N | x < n}, then {Aₙ}_{n∈N} is the family of
all initial segments of the sequence of natural numbers.

In general, {Sᵢ}_{i∈I} is called an indexed family of sets if a set I, the index set, is given and if to
every i ∈ I a set Sᵢ of the family is assigned. Every set of the family must occur at least once, but it
is not required that distinct indices give distinct sets. In the terminology of mappings (see 14.4.),
an indexed family is a surjective mapping of I onto 𝐒; here 𝐒 itself is the range of the mapping.
Every family 𝐒 can be indexed by taking 𝐒 itself as the index set.


Definition of the intersection and union of an arbitrary system 𝐒:
⋂𝐒 =def {x | x ∈ S for all S ∈ 𝐒}; ⋃𝐒 =def {x | x ∈ S for some S ∈ 𝐒}.


If 𝐒 is indexed, 𝐒 = {Sᵢ}_{i∈I}, then one writes ⋂𝐒 = ⋂_{i∈I} Sᵢ and ⋃𝐒 = ⋃_{i∈I} Sᵢ.




These definitions also include the original case of two sets, when the system has only two members. 


Generalizations of the distributive laws: S ∩ ⋃_{i∈I} Sᵢ = ⋃_{i∈I} (S ∩ Sᵢ); S ∪ ⋂_{i∈I} Sᵢ = ⋂_{i∈I} (S ∪ Sᵢ).

If all the sets are subsets of a set U, the following generalizations of De Morgan’s rules hold:

(⋂_{i∈I} Sᵢ)′ = ⋃_{i∈I} Sᵢ′; (⋃_{i∈I} Sᵢ)′ = ⋂_{i∈I} Sᵢ′.
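
For finite systems the generalized operations and the laws just stated can be tried out directly. The Python sketch below is illustrative only: the function names big_union and big_intersection, the index set {1, ..., 5}, the universe U and the set S are assumptions made here. The indexed family is a finite part of the family of initial segments Aₙ = {x ∈ N | x < n}.

```python
def big_union(system):
    """Union of a finite system of sets: all x lying in some member."""
    result = set()
    for member in system:
        result |= member
    return result

def big_intersection(system):
    """Intersection of a non-empty finite system of sets: all x lying in every member."""
    members = list(system)
    if not members:
        raise ValueError("the intersection of an empty system is not defined here")
    result = set(members[0])
    for member in members[1:]:
        result &= member
    return result

# Indexed family A_n = {x in N | x < n} for the assumed index set I = {1, ..., 5}.
family = {n: set(range(n)) for n in range(1, 6)}
U = set(range(10))    # assumed universe containing every A_n
S = {0, 3, 7}         # assumed further subset of U

union_all = big_union(family.values())
intersection_all = big_intersection(family.values())

distributive_ok = (
    S & union_all == big_union(S & A for A in family.values())
    and S | intersection_all == big_intersection(S | A for A in family.values())
)
de_morgan_ok = (
    U - intersection_all == big_union(U - A for A in family.values())
    and U - union_all == big_intersection(U - A for A in family.values())
)
```

Since the initial segments are nested, the union is the largest segment A₅ and the intersection is the smallest segment A₁ = {0}.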


14.3. Relations 


It is well known that if a and b are distinct real numbers, then a < b or b < a. If a < b, one
could also say that the relation ‘less than’ holds for the pair (a, b). This relation can be denoted
by R_< and is completely characterized by the set of all ordered pairs of real numbers for which it
holds. An extension of this train of thought leads to the following basic definition.


In the definition above the term ‘ordered pair’ is used in a naive, intuitive sense as an aggregation
of the objects a and b in such a way that a is distinguished as the first element of the ordered pair
(a, b), and b as the second. In the further course of this section a rigorous set-theoretic definition of
the notion of ordered pair will be given.


Example 1: In the set S of all humans living at the present moment a relation C can be defined 
by ‘A is a parent of B’ or ‘B is a child of A’. 

Example 2: The relation R_| on the set S = {1, 2, 3, 4, 6, 12}, defined by
the statement ‘x divides y’ consists of the pairs (1, 1), (1, 2), ..., (2, 2),
(2, 4) and so on. In the Figure 14.3-1 the relation is represented by an
arrow diagram in which the numbers, represented by dots, are connected
by arrows if the relation holds between them. Since every number
divides itself, every point is connected to itself by a circular arrow (loop).


[Fig. 14.3-1 Arrow diagram for the divisibility relation on {1, 2, 3, 4, 6, 12}]


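The set {x ∈ S | (x, y) ∈ R for at least one y in S} is called the support of R, and the set
{x ∈ S | (y, x) ∈ R for at least one y ∈ S} is called the range of R; they are denoted
in what follows by Supp R and Ran R. The union of the support and the range
is called the domain of R, Dom R. Obviously Dom R ⊆ S.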


The support of C in Example 1, for instance, consists of all humans with at least one child, the
range consists of all humans one of whose parents is still alive. In Example 2 all elements belong
to both the support and the range.

In mathematics certain properties of relations play a particular role; some of the most important
are listed in the following table (where R denotes a relation on S).


R is reflexive     =def xRx holds for all x ∈ S.
R is symmetric     =def for all x, y ∈ S: if xRy, then yRx.
R is asymmetric    =def there are no elements x, y ∈ S with xRy and yRx.
R is antisymmetric =def for all x, y ∈ S: if xRy and yRx, then x = y.
R is transitive    =def for all x, y, z ∈ S: if xRy and yRz, then xRz.
R is connected     =def for all x, y ∈ S: if x ≠ y, then xRy or yRx.
R is left unique   =def for all x, y, z ∈ S: if xRz and yRz, then x = y.
R is right unique  =def for all x, y, z ∈ S: if xRy and xRz, then y = z.
R is biunique      =def R is left unique and right unique.
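
For a finite relation, given as a set of ordered pairs, the properties of the table can be tested mechanically. The following Python sketch does this for the divisibility relation of Example 2; the function names are chosen here for illustration and are not notation from the text.

```python
S = {1, 2, 3, 4, 6, 12}
divides = {(x, y) for x in S for y in S if y % x == 0}   # 'x divides y' on S

def is_reflexive(R, S):
    return all((x, x) in R for x in S)

def is_symmetric(R):
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(R):
    return all(x == y for (x, y) in R if (y, x) in R)

def is_transitive(R):
    return all((x, w) in R for (x, y) in R for (z, w) in R if y == z)

def is_connected(R, S):
    return all((x, y) in R or (y, x) in R for x in S for y in S if x != y)

def is_right_unique(R):
    return all(y == z for (x, y) in R for (w, z) in R if x == w)

support = {x for (x, y) in divides}      # first members of the pairs
range_of_R = {y for (x, y) in divides}   # second members of the pairs
```

The divisibility relation on S is reflexive, antisymmetric and transitive, but it is not connected, since neither 4 divides 6 nor 6 divides 4.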


Restrictions of relations. If R is a relation on S and T a subset of S, then {(x, y) ∈ R | x, y ∈ T} is
a relation on T. It is called the restriction of R to T and is frequently denoted by R|T. For instance,
the relation ‘less than’ on the natural numbers N is the restriction to N of the relation ‘less than’
on the real numbers R.


Equivalence relations. An equivalence relation on a set S is a relation that is reflexive, symmetric, 
transitive, and has support S. Equivalence relations are found not only in every corner of mathematics, 
but in almost all the sciences. 




Example 3: A line l is parallel to a line l′: l ∥ l′.
Example 4: A number a is congruent to a number b modulo m: a ≡ b (mod m).


Example 5: A triangle ABC is similar to a triangle A′B′C′: △ABC ∼ △A′B′C′ or a figure F
is homeomorphic to a figure F′ (see Chapter 34.).
Example 6: x is identical with y. The identity relation on S, id_S, is the set {(x, x) | x ∈ S}.


An equivalence relation R on S induces a partition of S into classes, which consist of those elements 
between which the relation holds. 


A partition of a set S is a family P of non-empty subsets of S, called the classes of the partition,
with the following two properties: 1. Any two distinct classes are disjoint, 2. every element of S
lies in one class (Fig.).

[Fig. 14.3-2 Partition of a set S into three classes]

If P is a partition, then every element a of S lies in exactly one class C ∈ P,
which is denoted by C_a. Obviously C_a = C_b if and only if b lies in C_a.
The following simple theorem is of fundamental importance. It is the
basis of the principle of identification by abstraction.


Main theorem on equivalence relations. If R is an equivalence relation on a set S, then there exists
a partition P of S such that elements a, b ∈ S lie in the same class of P if and only if aRb holds.


Conversely, if P is a partition of S, then the relation {(a, b) | there is a class C ∈ P with a, b ∈ C}
is an equivalence relation.


Proof. Let R be given. Define C_a =def {x ∈ S | aRx}, and call it the equivalence class of a. Let P
be the family of equivalence classes of elements of S. Since aRa for all elements of S, a ∈ C_a. Thus,
every element of S lies in a class of P. It remains to show that distinct classes of P are disjoint.
Suppose that C_a and C_b are not disjoint, say c ∈ C_a ∩ C_b, then aRc and bRc. Since R is symmetric,
this implies cRb, and aRb by the transitivity of R. Now if e ∈ C_b, then bRe and again by transitivity
aRe. So e ∈ C_a and C_b ⊆ C_a. In the same way one shows that C_a ⊆ C_b, and therefore C_a = C_b.
Thus, non-disjoint classes are identical, and P is the required partition.

On the other hand, let P be a partition of S and let R be defined as in the statement of the theorem.
R is obviously reflexive and symmetric. Suppose that aRb and bRc, then by the definition there
exist classes C, C′ of P with a, b ∈ C and b, c ∈ C′. These classes are not disjoint, because b ∈ C ∩ C′,
hence they are identical. But now a, c ∈ C and so aRc by the definition of R. Therefore R is transitive.
This completes the proof.

The equivalence classes of Example 3 are the directions in the plane or space. Those of Example 4 
are the residue classes mod m and those of Example 6 are the singletons in S. 
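
Both directions of the main theorem can be illustrated on a finite set. The Python sketch below is an illustration under assumptions made here: the function names, the set {0, ..., 9} and the modulus m = 3 are not from the text. The first function builds the equivalence classes C_a for congruence modulo m (Example 4); the second recovers the relation from the partition.

```python
def equivalence_classes(S, related):
    """Partition S into classes; `related` must be an equivalence relation on S."""
    classes = []
    for a in S:
        for cls in classes:
            # By symmetry and transitivity it suffices to compare a with one representative.
            if related(a, next(iter(cls))):
                cls.add(a)
                break
        else:
            classes.append({a})
    return classes

def relation_from_partition(partition):
    """The relation {(a, b) | there is a class containing both a and b}."""
    return {(a, b) for cls in partition for a in cls for b in cls}

m = 3                                             # assumed modulus
S = range(10)                                     # assumed finite set {0, ..., 9}
congruent = lambda a, b: (a - b) % m == 0         # a is congruent to b modulo m

residue_classes = equivalence_classes(S, congruent)
recovered = relation_from_partition(residue_classes)
```

The three classes obtained are the parts of the residue classes mod 3 that lie in {0, ..., 9}, and the recovered relation coincides with congruence modulo 3 on that set.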


Order relations. A relation R on a set S is called a partial ordering on S if R is reflexive, transitive
and antisymmetric. If R is also connected, it is called a total or linear ordering.


Example 7: The divisibility relation R_| is a partial ordering of the natural numbers.

Example 8: The relation ‘S is a subset of T’ is a partial ordering of the subsets of a set U.

Example 9: The relation a ≤ b, ‘a is less than or equal to b’, is a partial ordering, in fact, a
total ordering of the set of real numbers.


An ordered set is defined as a pair (S, R), where R is a partial ordering of the set S. It is usual
to let the symbol S also stand for the ordered set (S, R). The restriction R|T of R to a subset T
of S is again an ordering, in other words, subsets of ordered sets are also ordered. If S is an ordered
set with a partial ordering R_≤, then an element u ∈ S is called an upper bound for a subset T
of S, if x R_≤ u holds for all x ∈ T. An element m ∈ S is called maximal in S if there is no x ≠ m
in S with m R_≤ x.


One of the most frequently used theorems in the whole of mathematics is the following, which is
equivalent to the axiom of choice.


Kuratowski-Zorn lemma. If every totally ordered subset of an ordered set (S, R) has an upper
bound in S, then S has a maximal element.


An important example of the use of this lemma occurs in the section on cardinal numbers. 


Set-theoretical definition of an ordered pair. Since a is distinguished as the left-hand member of
the ordered pair (a, b), and b as the right-hand one, the pair cannot simply be defined as {a, b}.
This difficulty is overcome by the following subtle definition.


Definition of an ordered pair: (a, b) =def {{a}, {a, b}}




If a ≠ b, one can distinguish the left-hand element of the ordered pair (a, b) as the element of
the singleton of the set, while the right-hand element is that which is not in the singleton.
From this definition one can derive the following fundamental property of ordered pairs:


The statement (a₁, a₂) = (b₁, b₂) holds if and only if a₁ = b₁ and a₂ = b₂.
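
The definition (a, b) =def {{a}, {a, b}} can be modelled with Python's immutable frozenset type. The sketch below is an illustration only; the function names pair, first and second are chosen here. It checks the fundamental property on a small range of integers, including the case a = b, in which the pair collapses to {{a}}.

```python
def pair(a, b):
    """The ordered pair (a, b) modelled as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def first(p):
    """Left-hand element: the only object belonging to every member of the pair."""
    return next(iter(frozenset.intersection(*p)))

def second(p):
    """Right-hand element: the object outside the singleton, or the left-hand one if a = b."""
    rest = frozenset.union(*p) - frozenset.intersection(*p)
    return next(iter(rest)) if rest else first(p)

values = range(4)
fundamental_property_holds = all(
    (pair(a1, a2) == pair(b1, b2)) == (a1 == b1 and a2 == b2)
    for a1 in values for a2 in values for b1 in values for b2 in values
)
```

In contrast to the unordered set {a, b}, for which {1, 2} = {2, 1}, the pairs pair(1, 2) and pair(2, 1) are different sets.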


The product S × T of two sets S and T is the set of all ordered pairs (a, b) with a ∈ S and b ∈ T.
The elements of Sⁿ are called n-tuples of elements of S; triples are abbreviated
(a, b, c) and so forth.


Example 11: The set C of complex numbers can be regarded as the product R × R = R²
of the set of all real numbers with itself.


A subset of Sⁿ is called an n-argument relation on S.


Example 12: The relation ‘the point X lies between the points Y and Z’ is a three-argument
relation on the points of the plane.

Example 13: ‘z is the sum of x and y’ is a three-argument relation on the set of natural numbers,
and also on other sets of numbers.

Example 14: ‘The quadruple of points [O, P, Q, R] forms a parallelogram in the plane’ is a
four-argument relation on the set of points in the plane.


14.4. Mappings 


A function on a set S with values in a set T is a right unique relation with support in S and
range in T. If the support is the whole of S, it is called a mapping from S into T. If the range of a
mapping is the whole of T, it is called a mapping of S onto T, or surjective.


Remark. Obviously functions on S with values in T are particular subsets of S × T. In some
branches of mathematics (for example, complex analysis) functions are not necessarily assumed to
be right unique. The definition given here corresponds to common usage in mathematics.

Functions and mappings pervade the whole of mathematics. Most frequently they are mappings
from one set S into another T, which are also written F: S → T. The set of all mappings of S into T
is denoted T^S.

Since one frequently has to deal with several different types of mappings simultaneously, for
instance, with mappings whose arguments are functions or mappings (see Example 3), there are
a number of synonyms for mappings, or for particular types of mappings. The most frequent are
operation (principally for mapping of S² into S), operator, functional (mainly for real-valued functions
of functions), functor, and morphism (mainly for mappings that are in some sense structure
preserving).


Image and inverse image. If F is a function on S with values in T, and if (x, y) ∈ F, then y is
called the image of x under F or the value of F at x. This can be denoted in several ways: y = x^F,
y = xF, y = F(x), or y = Fx. If y = F(x), then x is called an inverse image of y under F. The set
F⁻¹(y) =def {x ∈ S | F(x) = y} is the complete inverse image of y.


Particular functions. Functions on the set R of real numbers with values also in R are called
real functions, or functions of a real variable. Functions of n real variables are functions on Rⁿ
with values in R. Mappings of the set N of natural numbers into itself are called arithmetical, or
number-theoretical functions.


Example 1: y = x² is a real function; the notation, though common, can easily give rise to
misunderstandings; a better notation would be F: x ↦ x². For the moment the function will be
called Sq. The support of Sq is obviously the whole of R, its range is R^{≥0}, the set of non-negative
real numbers.

Mappings of the set {0, ..., n − 1} into a set S are called n-term sequences of elements of S. If
F(i) = a_i (i = 0, ..., n − 1), then F is written as F = (a_0, ..., a_{n−1}). Mappings of N, the set of
natural numbers, into S are simply called sequences of elements of S. The sequence F with F(i) = a_i
is written as (a_1, a_2, ...) or (a_i)_{i∈N}.




Restrictions of mappings. If F is a mapping of S into T, and if U is a subset of S, then {(x, y) ∈ F | x ∈ U}
is a mapping of U into T. It is called the restriction of F to U and denoted F|_U. For instance, the
operation of addition of natural numbers is the restriction of the operation with the same name on
real numbers. As the example shows, restrictions of mappings are frequently denoted by the same
symbol as the unrestricted mapping.


Spee a Pr isa = baleetive mapping of S onto itself it is ‘cian to its inverse. 


Example 3: S^{{0,1}} is the set of all mappings of {0, 1} into S, that is, the set of all two-term
sequences of elements of S: {(a_0, a_1) | a_0, a_1 ∈ S}. Let F: S^{{0,1}} → S^2 be the mapping that
associates with the sequence (a_0, a_1) the ordered pair (a_0, a_1). It is clear that F is bijective. Because
of this there is no essential difference between n-term sequences of elements of S and n-tuples of
elements of S.

Example 4: Let 𝔖 be a family of sets and A a set having exactly one element in common with
each member of the family. Associate with each member S of the family the unique element of
S ∩ A. The mapping φ from 𝔖 onto A thus defined is called a choice function for 𝔖. Choice
functions are invertible only in special cases.

Combination of mappings. If F and G are functions, then the set 

H =_{Def} {(x, z) | there exists a y with (x, y) ∈ F and (y, z) ∈ G}
is again a function, whose support is contained in the support of F and whose range is contained
in the range of G. This H is called the combination, composition, or product of F and G. If the
images under the functions are written as F(x) etc., then it is denoted by G · F, because z = H(x)
means that z = G(F(x)). If the notation y = x^F and z = y^G is used, then the product z = (x^F)^G
= x^{FG} is written in the inverse order. Care must be taken to ensure that it is clear which notation
is being used.

If F is a mapping from S to U, and G a mapping from U to T, then H = G · F (where the order
is taken as in the preceding paragraph) is a mapping from S to T. Here F · G need not even contain
a single ordered pair. The product of mappings is associative, that is, F · (G · H) = (F · G) · H for
any three mappings F, G and H.
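The composition of functions regarded as sets of pairs can be sketched in a few lines. The helper `compose` and the sample functions `F`, `G`, `K` below are hypothetical; `compose(F, G)` corresponds to the product written G · F in the functional notation.

```python
def compose(F, G):
    """Return {(x, z) | there is a y with (x, y) in F and (y, z) in G}."""
    return {(x, z) for (x, y) in F for (y2, z) in G if y == y2}

# Hypothetical sample mappings: F from S to U, G from U to T, K from T to {True, False}.
F = {(1, "a"), (2, "b"), (3, "a")}
G = {("a", 10), ("b", 20)}
K = {(10, True), (20, False)}

H = compose(F, G)          # first F, then G
print(sorted(H))           # pairs (1, 10), (2, 20), (3, 10)

# Associativity of the product on this sample.
print(compose(compose(F, G), K) == compose(F, compose(G, K)))

# Taken in the other order the product is empty here, because no value of G
# lies in the support of F.
print(compose(G, F))
```

The empty result in the last line illustrates the remark that the product in the reverse order need not contain a single ordered pair.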


Example 5: Parallel shifts are particular (bijective) mappings of the set of points of the plane 
onto itself. In this case the composition of two parallel shifts p and q is written as a sum p + q.
The operation + is commutative, but in general, the composition of functions is not commutative.


14.5. Infinite sets and cardinal numbers 


Definitions of finiteness. In the naive sense a set S is finite if there is a natural number n such
that the elements of S can be counted out by the numbers before n; more precisely, if there is a
bijective mapping from the set of natural numbers less than n onto S.

This definition has the disadvantage that it takes the natural numbers as already given. On the 
other hand, it turns out that to define the natural numbers one needs the concept of a finite set. 
This difficulty was first clearly recognized by DEDEKIND. He overcame it by giving a definition of 
finiteness that avoids the use of the natural numbers and uses mappings instead. 


Dedekind’s definition of finiteness. A set S is finite if every injective mapping of S into itself
is bijective.


It follows from this definition that S is infinite if and only if there exists an injective mapping of 
S into itself that is not surjective, in other words, if there exists a bijective mapping of S onto a 
proper subset of S. 
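For a small finite set, Dedekind's condition can be checked by brute force. The sketch below enumerates all self-maps of a hypothetical four-element set and confirms that each injective one is also surjective; it then samples the successor map on the natural numbers, which is injective but never takes the value 0. The function names are illustrative, and the finite sample of N is of course only an illustration, not a proof.

```python
from itertools import product

def self_maps(S):
    """Yield every mapping of the finite set S into itself as a dict."""
    elems = sorted(S)
    for values in product(elems, repeat=len(elems)):
        yield dict(zip(elems, values))

def is_injective(f):
    """Distinct arguments have distinct values."""
    return len(set(f.values())) == len(f)

def is_surjective(f, S):
    """Every element of S occurs as a value."""
    return set(f.values()) == set(S)

S = {0, 1, 2, 3}
injective_maps = [f for f in self_maps(S) if is_injective(f)]
print(len(injective_maps))                               # number of injective self-maps
print(all(is_surjective(f, S) for f in injective_maps))  # each of them is bijective

def successor(n):
    """The map n -> n + 1 on the natural numbers."""
    return n + 1

images = {successor(n) for n in range(1000)}
print(len(images) == 1000, 0 in images)  # injective on the sample, 0 is never a value
```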


A more convenient definition, perhaps the most frequently used, is due to RUSSELL.

Russell’s definition of finiteness. A set S is finite if it belongs to every system 𝔖 with the following
properties:

1. ∅ ∈ 𝔖; 2. if U ∈ 𝔖, then U ∪ {a} ∈ 𝔖 for all a ∈ S.


It is easy to show that a set that is finite in the sense of Russell’s definition is also finite in
Dedekind’s sense. The converse can be shown using the axiom of choice.




Example 1: The set N of natural numbers is infinite, because there is an 
injective map of N onto a proper subset of N, for instance onto the set of 
even numbers (Fig.). An equally suitable mapping is F: n ↦ n + 1.


14.5-1 Bijective mapping of the set of natural numbers N onto a proper subset

on any suitable family of sets. Hence it leads to a partition of the family into (pairwise disjoint)
classes of equipotent sets.




One cannot operate with the family of all sets, or even the family of all sets equipotent to a given 
set, because that would lead to Russell’s paradox. To avoid this one usually restricts the definition 
above to a family F that is as large as possible and as necessary. In that case the cardinal numbers 
are themselves sets, namely families of sets. It may, however, become necessary to enlarge the family F 
later. 


Comparison of cardinal numbers. The power or cardinal number of a set S is denoted by card S. 
Cardinal numbers are denoted by lower case bold type letters, s, t, etc.




This definition is independent of the choice of S. 


Bernstein's theorem. If there are injective maps from S into T, and from T into S, then S and T
are equipotent. 


This theorem implies that the relation ≤ on cardinal numbers is anti-symmetric.


Theorem. The relation ≤ on cardinal numbers is an ordering.


At the end of this section it will be shown that any two cardinal numbers can be compared, that
is, that ≤ is a total ordering. In 14.6. it will even be shown that any non-empty set of cardinal
numbers contains a smallest element.


The existence of arbitrarily large cardinal numbers. The following theorem of CANTOR is basic 
for the theory of transfinite cardinals. 


Cantor’s theorem. To any set there exists a set of higher power, indeed, card P(S) > card S. 


The proof of this theorem is surprisingly short and elegant. On the one hand, it is clear that
there is an injective mapping of S into P(S), namely the one taking the elements a ∈ S to the
singletons {a} in P(S). It is now necessary to show that no injective map from S to P(S) is surjective,
in other words, that for every injective map φ from S into P(S) there exist elements of P(S) that have
no inverse image. This is done by showing that the set U =_{Def} {x ∈ S | x ∉ φ(x)} is never an image
under φ. Suppose the contrary: that U = φ(u), say, for some u ∈ S. Now either u ∈ U or u ∉ U.
If u ∈ U, then u ∈ φ(u), since U = φ(u); but, by definition, U contains only those elements
of S that are not elements of their images under φ. So this assumption leads to a contradiction.
But the other assumption also leads to a contradiction, for u ∉ U means that u ∉ φ(u), and since U
contains all elements of S that are not elements of their images, this implies that u ∈ U. Hence the
original assumption is untenable. (Compare this proof with Russell’s paradox. Here an assumption
was made, which is shown to be untenable because it leads to a contradiction; in Russell’s
paradox the argument, applied to the set of all sets, is the same, but there is no previous
assumption. Hence it leads to an unresolved paradox.)
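The diagonal construction in this proof can be verified exhaustively for a small set. The sketch below, with the hypothetical helpers `power_set` and `diagonal_set`, runs through every mapping of a three-element set S into P(S), injective or not, and checks that the set U of elements not contained in their own image is never a value of the mapping.

```python
from itertools import combinations, product

def power_set(S):
    """All subsets of the finite set S, as frozensets."""
    elems = sorted(S)
    return [frozenset(c) for r in range(len(elems) + 1)
            for c in combinations(elems, r)]

def diagonal_set(phi):
    """U = {x in S | x not in phi(x)} for a mapping phi given as a dict."""
    return frozenset(x for x in phi if x not in phi[x])

S = {0, 1, 2}
P = power_set(S)
elems = sorted(S)

checked = 0
for images in product(P, repeat=len(elems)):
    phi = dict(zip(elems, images))      # one mapping of S into P(S)
    U = diagonal_set(phi)
    assert U not in phi.values()        # U is never an image under phi
    checked += 1

print(len(P), checked)   # 8 subsets, 8**3 = 512 mappings checked
```

The loop does not restrict itself to injective maps, so it confirms slightly more than the proof requires: no mapping of S into P(S) at all is surjective.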




The smallest transfinite cardinal is ℵ_0.


Proof. Example 1 shows that ℵ_0 is transfinite. It remains to show that ℵ_0 ≤ a for all transfinite
cardinals a, that is, that every transfinite set contains a countable subset. So let S be infinite and let
φ be an injective map of S into a proper subset T of S. Choose a ∈ S \ T and put a_0 = a, and define
a_{n+1} = φ(a_n) inductively. The set {a_i | i ∈ N} is a countable subset of S.

In the usual naive proof of this theorem one chooses an element a_0 of S, and then an element
a_1 of S_1 = S \ {a_0} and continues in this manner. This process does not break off because the fact
that S is infinite implies that the sets S_i are not empty. This argument is a tacit application of the
axiom of choice, but it can be made completely precise.

The union of countably many countable sets is countable.


Let M_i = {a_{i0}, a_{i1}, a_{i2}, ...} and suppose that the elements
of all the sets M_i are arrayed in an infinite matrix (Fig.). The
counting can begin in the top left-hand corner and continue
diagonally in the manner shown by the arrows. Elements that
have been counted once are omitted on repetitions; for instance,
if a_{ij} coincides with an element a_{kl} that has already been
counted, then a_{ij} is omitted. It is clear that in
this manner ⋃_{i∈N} M_i is completely counted out.

This proof was used by CANTOR to show that the set of rational
numbers is countable. One can imagine them arrayed in the
same manner as the elements a_{ij} in an infinite matrix.
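The diagonal counting can be sketched as a generator. The code below is an illustration under two assumptions: the matrix entry in row i and column j is taken to be the fraction (i + 1)/(j + 1), and each anti-diagonal is traversed in the same direction, whereas the figure may use a zig-zag path. Both choices still reach every entry after finitely many steps. The names `diagonal_pairs` and `positive_rationals` are hypothetical.

```python
from fractions import Fraction
from itertools import count, islice

def diagonal_pairs():
    """Yield all index pairs (i, j) of an infinite matrix, one anti-diagonal at a time."""
    for d in count(0):
        for i in range(d + 1):
            yield i, d - i

def positive_rationals():
    """Count out the positive rationals, omitting values that were already counted."""
    seen = set()
    for i, j in diagonal_pairs():
        q = Fraction(i + 1, j + 1)
        if q not in seen:          # repetitions such as 2/2 = 1 are skipped
            seen.add(q)
            yield q

first = list(islice(positive_rationals(), 10))
print([str(q) for q in first])
```

Every positive rational p/q sits in row p − 1 and column q − 1 of this matrix, so it is produced by the generator after finitely many steps.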


Sums and products of cardinal numbers. If S and T are disjoint
representatives of the cardinals s and t, then the sum of
s and t is defined as s + t =_{Def} card (S ∪ T), and the product
as s · t =_{Def} card (S × T). These definitions can be extended
to arbitrarily many cardinal numbers.


14.5-2 Matrix array to prove that the union of countably many
countable sets M_i = {a_{i0}, a_{i1}, ...} is countable


To define the product, and finally powers, of cardinal numbers one needs a generalization of the
Cartesian product to arbitrary systems of sets {S_i | i ∈ I}. This generalization is also useful in other
parts of mathematics.


It can be shown that for transfinite cardinals m + n = m · n = max (m, n).
In particular, m + ℵ_0 = m · ℵ_0 = m for any transfinite cardinal m.
This means that the ordinary arithmetical operations become trivial
when extended to transfinite cardinals. But this is not true for powers;
for instance, the following theorem shows that m < 2^m.

If a set M has the cardinal m, then P(M) has the cardinal 2^m.


To prove this one associates with each subset T of M its characteristic
function χ = χ_T, which is defined by

χ(a) = 1 if a ∈ T,
χ(a) = 0 if a ∈ M \ T.

This gives a bijective mapping of P(M) onto the set {0, 1}^M of all
mappings from M to {0, 1}. But the set of these mappings has the
cardinal 2^m, by definition.
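For a finite set M the correspondence between subsets and characteristic functions can be listed completely. The sketch below uses hypothetical helper names and a three-element sample set; it builds every mapping from M to {0, 1}, converts each back to a subset, and confirms that all 2^3 subsets arise exactly once.

```python
from itertools import product

def characteristic(T, M):
    """Characteristic function of the subset T of M, as a dict a -> 0 or 1."""
    return {a: (1 if a in T else 0) for a in M}

def subset_of(chi):
    """The subset whose characteristic function is chi."""
    return frozenset(a for a, bit in chi.items() if bit == 1)

M = ["a", "b", "c"]

# Every mapping from M to {0, 1}.
all_functions = [dict(zip(M, bits)) for bits in product((0, 1), repeat=len(M))]
subsets = {subset_of(chi) for chi in all_functions}

print(len(all_functions), len(subsets), 2 ** len(M))

# The two directions are inverse to each other.
T = {"a", "c"}
print(subset_of(characteristic(T, M)) == frozenset(T))
```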

The continuum. The cardinal of the set of real numbers is called the
cardinal of the continuum and denoted by ℵ or c. The cardinal of the
set of real numbers in the open interval (0, 1) is also ℵ, because this
interval is mapped bijectively to the set of all real numbers, for example
by the function y = (x − 1/2)/[x(1 − x)] (Fig.).
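A quick numerical check of this mapping is sketched below. It compares the formula y = (x − 1/2)/[x(1 − x)] with the equivalent form 1/(1 − x) − 1/[2x(1 − x)] on sample points and confirms that the values increase strictly, which is consistent with the mapping being injective and order preserving. The function names and the sample grid are illustrative assumptions, and floating-point sampling is no substitute for a proof.

```python
import math

def to_real_line(x):
    """y = (x - 1/2) / (x (1 - x)) for 0 < x < 1."""
    return (x - 0.5) / (x * (1.0 - x))

def alt_form(x):
    """The same function written as 1/(1 - x) - 1/(2 x (1 - x))."""
    return 1.0 / (1.0 - x) - 1.0 / (2.0 * x * (1.0 - x))

samples = [k / 100 for k in range(1, 100)]
values = [to_real_line(x) for x in samples]

# Both forms agree on the samples.
print(all(math.isclose(to_real_line(x), alt_form(x), rel_tol=1e-9, abs_tol=1e-9)
          for x in samples))

# The values increase strictly and change sign at x = 1/2.
print(all(a < b for a, b in zip(values, values[1:])))
print(to_real_line(0.5))
```

Near the end points the values grow without bound in both directions, for example `to_real_line(1e-9)` is a large negative number and `to_real_line(1 - 1e-9)` a large positive one.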


14.5-3 Mapping of the interval (0, 1) onto the whole real line by the function
y = (x − 1/2)/[x(1 − x)] = 1/(1 − x) − 1/[2x(1 − x)]




The cardinals ℵ_0 and ℵ are connected by the formula ℵ = 2^{ℵ_0}.

This is proved by defining two mappings. The first is an injective mapping from P(N) into R,
the set of real numbers: if M ∈ P(N), then M is mapped to the decimal 0.a_0a_1a_2..., where a_i = 1 if
i ∈ M and a_i = 0 otherwise. This proves that 2^{ℵ_0} ≤ ℵ. The second maps the set of real numbers
in (0, 1) injectively into P(N): let r = 0.a_1a_2... (0 ≤ a_i ≤ 9), and exclude periods of the digit 9.
Then map r to the set {1a_1, 1a_1a_2, ...} of natural numbers; for example, r = 0.1406... is mapped to
the set {11, 114, 1140, 11406, ...}. This proves that ℵ ≤ 2^{ℵ_0} and hence, by Bernstein's theorem,
that ℵ = 2^{ℵ_0}.
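Both injections can be illustrated on finite truncations. In the sketch below the function names are hypothetical, the digit positions of the first mapping are assumed to be indexed from 0, and a real number in (0, 1) is represented only by a string of its first decimal digits. The leading digit 1 in the second mapping preserves leading zeros of the expansion, so different digit strings give different sets.

```python
def subset_to_digits(M, length):
    """First `length` decimal digits of the number assigned to the subset M of N."""
    return "".join("1" if i in M else "0" for i in range(length))

def prefix_set(digits, length):
    """First `length` elements of the set {1a_1, 1a_1a_2, ...} for r = 0.a_1a_2..."""
    return [int("1" + digits[:k]) for k in range(1, length + 1)]

# The subset {0, 2, 3} is mapped to the decimal 0.101100...
print("0." + subset_to_digits({0, 2, 3}, 6))

# The example from the text: r = 0.1406...
print(prefix_set("1406", 4))

# Leading zeros are kept apart by the prefixed 1.
print(prefix_set("0500", 4), prefix_set("5000", 4))
```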

The continuum hypothesis states that there is no cardinal number between ℵ_0 and ℵ, in other
words, that an infinite set of real numbers is either countable or has the cardinal ℵ. In 1964 COHEN
proved that it is impossible to prove the continuum hypothesis by means of the standard
set-theoretical axioms; previously, in 1938, GÖDEL had shown that the continuum hypothesis does not
contradict these axioms. Together the two results show that the continuum hypothesis is independent
of the other set-theoretical axioms.


Comparison of cardinals. 


For any two cardinal numbers m ≠ n one of the relations m < n or n < m holds.

It is sufficient to prove that for any pair of sets M and N there is an injective function φ on M
with values in N such that the support of φ is M or the range of φ is N. For in the first case
card M ≤ card N and in the second card N ≤ card M. The proof given here exhibits a use of the
Zorn-Kuratowski lemma, which is typical for modern mathematics.

Let Φ be the set of injective functions on M with values in N. This set is not empty, because the
empty function φ_0 with Supp φ_0 = Ran φ_0 = ∅ is contained in Φ. For elements φ, ψ of Φ one defines
φ ≤ ψ if φ is a restriction of ψ, or equivalently, if φ ⊆ ψ, when these mappings are regarded as
sets of ordered pairs. It is clear that ≤ is an ordering of Φ. Now if Ω is a chain (totally ordered
subset) of Φ, then ⋃Ω (with the functions again regarded as sets of pairs) is an injective function
on M with values in N, and thus an upper bound of Ω in Φ. Now by Zorn’s lemma Φ contains a
maximal element φ*. Suppose that Supp φ* ⊂ M and Ran φ* ⊂ N, then let a ∈ M \ Supp φ* and
b ∈ N \ Ran φ*, and define φ' = φ* ∪ {(a, b)}. Since φ' is still injective, it is in Φ, but this contradicts
the maximality of φ*. Therefore Supp φ* = M or Ran φ* = N, q.e.d.


14.6. Well-ordered sets and ordinal numbers 


Order types. 


Similarity is an equivalence relation on ordered sets. The equivalence classes are called similarity 
types of ordered sets. 

The same difficulties apply to this relation as to the relation of equipotence for arbitrary sets. 
And they are avoided in the same way by restricting all arguments to a suitably large family of
ordered sets. 

Order types are the similarity classes of totally ordered sets. 


Example 2: The set of all real numbers has the same order type as the set of real numbers in 
the open interval (0, 1), because the bijective mapping given in 14.5. preserves the ordering in 
both directions. This is called the order type of the linear continuum.

Example 3: The ordered set of rational numbers has the following properties: 1. it is countable, 
2. it is dense, that is, between any two distinct elements there is a further element, and 3. it has 
no initial or final element. CANTOR proved that all totally ordered sets with the properties 1., 2., 
and 3. have the same order type η. Thus, this is the order type of every open rational interval, but
also of the set of all algebraic real numbers in their natural ordering, because this set is the union 
of countably many countable sets and thus countable. 

Example 4: Two finite totally ordered sets have the same order type if and only if they have the 
same cardinal number. The similarity is constructed by first mapping the minimal element of one 
to that of the other, then the minimal element above that, and so on. The finite order types are 
thus in one-to-one correspondence with the finite cardinal numbers. The natural numbers can be 
regarded either as cardinals or as order types. 

Example 5: The order type of the even numbers is the same as that of the natural numbers. Indeed, 
any countable set S can be given an ordering of the same type as the natural numbers by using a 
bijective mapping φ: N → S to define the ordering on S: m^φ < n^φ if and only if m < n.






For natural numbers, regarded as order types, the sum and product are the same as those already 
defined. Addition and multiplication of order types are associative and distributive, but not, in 
general, commutative. 


Well-ordered sets. The ordered set of the natural numbers has the following remarkable property:
every non-empty subset has a smallest element. This property is used in counting the natural numbers
by always taking the smallest ‘unused’ number, and is the basis of the principle of mathematical
induction. CANTOR recognized the central importance of this property and used it to define
well-ordered sets.


By definition every well-ordered set has a smallest
element. The set of natural numbers is well-ordered
in its natural ordering. Its (transfinite) ordinal number
is denoted by ω (Fig.). A segment of a well-ordered
set S is a proper subset T of S which contains every
element of S that is smaller than some element of T.
If A is a segment of S, then there is always an element
a ∈ S, such that A = {x ∈ S | x < a}. With the
help of this concept it is now possible to define a
comparison of ordinal numbers.

14.6-1 Schematic representation of some ordinal numbers


Every set of ordinal numbers is totally ordered by <. 

This theorem cannot be reformulated as ‘The set of all ordinal numbers is totally ordered’, because 
the concept ‘set of all ordinal numbers’ leads to a contradiction just as the ‘set of all cardinals’. 

It is immediate that < is transitive. The statement that < is irreflexive is equivalent to the statement
that no well-ordered set is similar to one of its segments. The opposite assumption leads to a
contradiction. For suppose that there is a similarity φ of S onto its segment A. Then there must be elements
x ∈ S such that x^φ < x. Let a be the smallest such element and b = a^φ. Since b < a, it then follows
that b^φ < a^φ = b, thus b^φ < b, contradicting the minimality of a. The proof that any two ordinals
can be compared is somewhat more complicated.


Every set of ordinal numbers is well-ordered, in other words, every set of ordinal numbers has a 
minimal element. 


To prove this, let W(α) denote the set of all ordinal numbers less than a given ordinal number α.
If A is a well-ordered set of type α, then A and W(α) are similar; for every ordinal number β < α
corresponds to a segment S of A and this in its turn corresponds to an element b of A, with
S = {x ∈ A | x < b}. Thus, W(α) is well-ordered. If now Z is any set of ordinal numbers, and α
in Z is chosen arbitrarily, then Z ∩ W(α), if not empty, has a smallest element, by the first part,
and this must be the smallest element of Z; if Z ∩ W(α) is empty, then α itself is the smallest
element of Z.

The smallest transfinite ordinal number is ω.


Classes of ordinals. If m is a transfinite cardinal number, one can consider the class of all ordinal 
numbers whose representatives have the cardinal m. These sets are called the transfinite classes 
of ordinals. In every non-empty class there is a smallest ordinal number; this is called the initial 
ordinal number of the class. CANTOR combined the finite ordinal numbers into the first class, and the
ordinal numbers of countable sets into the second class (see Well-ordering of the cardinals). 




To every ordinal number α there is a larger one, for instance, its successor α + 1; furthermore,
to any set of ordinal numbers Z there is an ordinal number that is larger than any α ∈ Z. For the
set ⋃_{α∈Z} W(α + 1) is well-ordered, and its ordinal number β is larger than any α ∈ Z. Indeed, β is the
smallest such ordinal and is called the supremum of Z (sup Z).
… = α; in particular, sup {0, 1, 2, ...} = ω.


Thus, ω is a limit ordinal, but all the finite ordinal numbers are isolated, and so is ω + 1.


Recursion principle (transfinite induction). This is an important generalization of the principle
of induction to arbitrary well-ordered sets.


This is very easy to prove. The assumption that the statement is not true for every element of S 
leads to a contradiction. For if a is the smallest element for which it is false (this must exist by the 
definition of well-ordering), then the statement is true for all elements less than a, and therefore, 
by hypothesis, for a, because a is not the smallest element of S. The following principle is more 
intricate. 


This principle can be used to define powers of ordinal numbers α^β by a system of recursive
equations.

(i) α^0 = 1; (ii) α^{β+1} = α^β · α; (iii) α^λ = sup {α^ξ | ξ < λ} for limit ordinals λ.

Example 6: ω^1 = ω^0 · ω = 1 · ω = ω, ω^2 = ω^1 · ω = ω · ω, ..., ω^ω = sup {ω, ω^2, ω^3, ...}, and
ω^{ω^ω} = sup {ω^ω, ω^{ω^2}, ω^{ω^3}, ...}.
All these numbers belong to the second class, as does every supremum of a countable set of
numbers of the second class. The first number in the second class that cannot be expressed as a
sum of powers of ω is

ε_0 = sup {ω, ω^ω, ω^{ω^ω}, ...}.

This is called the smallest ε-number. It satisfies the equation ω^ε = ε. An analogous continuation
of this process leads to the numbers ε_1, ε_2, ..., ε_ω, ..., ε_{ε_0} (ε with the subscript ε_0)
and so on. The naming of numbers of the second class can be continued ad infinitum, but it is
impossible to define a universal notation, because it can be shown that there are uncountably
many numbers in the class.


The well-ordering theorem. The preceding arguments do not exclude the possibility that certain 
transfinite classes of ordinal numbers might be empty; in other words, the question is still open, 
whether every set can be well-ordered in at least one way. This is indeed the case and forms the 
content of the well-ordering theorem. This theorem is equivalent to the axiom of choice. The first 
rigorous proof (using the axiom of choice) was given by Ernst ZERMELO (1871-1953) in 1904 in a 
letter to HILBERT. It started the controversy about the admissibility of the axiom of choice, which 
still has not been resolved. 


Well-ordering theorem. On every set S there is a relation under which S is well-ordered.


CANTOR had considered this theorem as a principle of thought and had made it plausible in the
following manner. Take any element a_0 of S, then a second, and so on. If S is infinite, one obtains
a sequence a_0, a_1, ... Now either S is exhausted or not; if not, repeat the process, as long as is
necessary to exhaust the set. If S is ordered by the sequence in which the elements are chosen, then
every subset has a smallest element, namely the one chosen first.

By present-day standards of mathematical rigour this argument can only be regarded as a heuristic 
first approach. Rigorous proofs are long and far too complicated to be included here. 


Well-ordering of the cardinals. By the well-ordering theorem no transfinite cardinal m has an 
empty class of ordinals Z_m. The well-ordering theorem can be used to show that any two cardinals
are comparable, that is, that for any two sets S and T there is an injective mapping from S to T 
or from T to S. This is not surprising, since Zorn’s lemma, the axiom of choice and the well- 
ordering theorem are all equivalent. By well-ordering S and T the statement is reduced to the com- 
parability of ordinals. 




In addition, it can now be shown that every non-empty set of cardinal numbers K has a smallest 
element; for the set of cardinal numbers is similar to the set of initial ordinal numbers of their 
classes (in fact, the cardinal numbers are often identified with these initial numbers). From this 
identification it also follows that to any set of cardinal numbers there exists a cardinal number 
larger than any element of the set. It is thus possible to index the cardinal numbers by the ordinal 
numbers in the following manner: 


ℵ_0 = smallest infinite cardinal number,
ℵ_{α+1} = the smallest cardinal number greater than ℵ_α,
ℵ_λ = sup {ℵ_ξ | ξ < λ} for limit ordinal numbers λ.

This gives the famous Cantor sequence of cardinal numbers ℵ_0, ℵ_1, ..., ℵ_ω, ...


Since 2^{ℵ_0} > ℵ_0, the problem of the continuum hypothesis can be restated as the question where
ℵ, the cardinal number of the continuum, occurs in this sequence. Cantor’s continuum hypothesis
is that 2^{ℵ_0} = ℵ_1. The so-called generalized continuum hypothesis is that 2^{ℵ_α} = ℵ_{α+1}.


15. The elements of mathematical logic 


15.1. Propositional logic
15.2. Predicate logic
15.3. Formalized theories
15.4. Algorithms and recursive functions


One of the main tasks of mathematical logic is the investigation of formal thinking and inference 
by means of mathematical methods taken, for example, from algebra or the theory of algorithms. 

But this task, which has its origins in philosophy, is not its only one; nowadays mathematical 
logic comprises a multitude of questions and applications to the most diverse domains such as the 
natural sciences, switching algebra, the theory of data processing systems, linguistics, and several
branches of the social sciences such as philosophy, law, and ethics.

A decisive impetus towards the development of mathematical logic came from the situation of 
mathematics at the end of the 19th century. Up to that time mathematics had gathered together 
an abundance of individual results and had reached a high degree of abstraction, without having 
achieved a corresponding clarity about the contents of the fundamental concepts, which were used 
in an intuitive manner, for example, the concept of a set or of logical inference (see Chapter 42.). Apart 
from the need for an unquestionable foundation of the concept of a set, for the first time it 
became necessary to gain insight into the proper meaning of logic and logical deduction. 


15.1. Propositional logic 


Principles of classical propositional logic. Propositions is the name used for certain linguistic 
formations that serve to describe and to communicate facts. Classical propositional logic starts 
out from two assumptions. According to the two-value principle every proposition is either true or 
false. The concept of truth used here, which goes back to Aristotle, regards a proposition to be true 
if the statement asserted by it corresponds to a fact. The two-value principle really subsumes two 
principles: 

1. the principle of the excluded middle according to which every proposition is true or false, and 
2. the principle of the excluded contradiction according to which there is no proposition that is both
true and false. Therefore the class of all propositions splits into two disjoint subclasses, which
are denoted by the symbols 1 (true) and 0 (false) and are called truth values.

By means of linguistic particles such as ‘not’, ‘and’, ‘or’ etc., given propositions can be combined 
into more complicated propositions. According to the second fundamental principle, the principle 
of extensionality, the truth value of a compound proposition is determined exclusively by the truth 
values of its components and does not depend on their meaning. Consequently such combinations 
can be regarded as functions that assign truth values to m-tuples of truth values. 

The connective particles most frequently used in propositional logic correspond to truth functions: 
the function non corresponding to ‘not’, et to ‘and’, vel to ‘or’, seq to ‘if ..., then ...’, and aeq 
to ‘if and only if ...’. These functions are determined as follows: 
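The values of these five truth functions can be tabulated mechanically. The sketch below assumes the standard classical truth tables, with 1 for true and 0 for false, and implements the functions non, et, vel, seq and aeq as small Python functions; the arithmetic encodings with `min` and `max` are one convenient choice among several.

```python
from itertools import product

def non(a):
    """Negation: 'not'."""
    return 1 - a

def et(a, b):
    """Conjunction: 'and'."""
    return min(a, b)

def vel(a, b):
    """Disjunction: inclusive 'or'."""
    return max(a, b)

def seq(a, b):
    """Implication: 'if a, then b'; false only when a is true and b is false."""
    return max(1 - a, b)

def aeq(a, b):
    """Equivalence: 'a if and only if b'."""
    return 1 if a == b else 0

print("a  non")
for a in (1, 0):
    print(a, "", non(a))

print("a b | et vel seq aeq")
for a, b in product((1, 0), repeat=2):
    print(a, b, "| ", et(a, b), " ", vel(a, b), " ", seq(a, b), " ", aeq(a, b))
```

Note that `seq(0, 0)` and `seq(0, 1)` both have the value 1: an implication with a false premiss counts as true.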




These definitions do not quite agree with the meaning of the particles in everyday language. 
For example, the following proposition is true: 

‘If 2 · 2 = 5, then the moon is inhabited by conscious living beings’; because in functorial
notation (0 → 0) = 1.

The task of propositional logic consists in the mathematical analysis of these concepts, which 
for the purpose are formalized within the framework of a calculus, the propositional calculus. To 
develop it one starts out from a collection of fundamental symbols of the following kinds: 

(i) variables for propositions: p_1, p_2, ...; p, q, r, s, ...;
(ii) functors ~, ∧, ∨, →, ↔, denoting in this order the functions non, et, vel, seq, aeq;
(iii) technical symbols: (, ).

Among the set of all sequences of symbols, the fundamental objects of the propositional calculus,
the so-called expressions, are now selected by means of an inductive definition:


This definition makes it possible to decide in finitely many steps whether a given sequence of 
symbols is an expression or not. 


Example 1: The following are expressions: ((p → q) ∧ (r ∨ s)) and ((p → q) → (~q → ~p)).


To simplify the presentation of expressions one uses rules for saving brackets: 


(i) If the whole of an expression is included in brackets, these brackets may be omitted. 

(ii) In the sequence ~, ∧, ∨, →, ↔ each functor separates stronger than the preceding one; for
example, p ∧ q → r is to be read unambiguously as (p ∧ q) → r.

(iii) A functor marked with a dot beneath separates stronger than one without a dot (see Exam- 
ples 3., 4., 5., 6.; in 6. two dots separate stronger than one dot). 


Semantics provides a link between the truth values and truth functions on the one hand, and the 
expressions on the other. This is done by means of the notion of a covering. A covering of the proposi- 
tional variables is a function that assigns to each variable one of the two truth values 0 or 1. Such a 
covering f can be extended in a natural way to a function v_f that assigns a truth value to every
expression. For a given f this function v_f is defined inductively:


(i) For variables p one has: v_f(p) = f(p);
(ii) v_f(~H) = non (v_f(H));
(iii) for expressions H and G one has:
v_f(H ∧ G) = et (v_f(H), v_f(G)),
v_f(H ∨ G) = vel (v_f(H), v_f(G)),
v_f(H → G) = seq (v_f(H), v_f(G)),
v_f(H ↔ G) = aeq (v_f(H), v_f(G)).

The important concepts of semantic equivalence and universal validity can now be defined. Two 
expressions H and G are said to be semantically equivalent, in symbols H = G, if v_f(H) = v_f(G)
for every covering f. An expression H is said to be universally valid or a tautology if v_f(H) = 1, in
other words, if H is true for every covering f.


Example 2: p → (q → p) is a tautology; p → ((p → q) → q) is a tautology; (p → q) ∧ (p → ~q) → ~p
is a tautology, by the principle of the excluded middle.
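The inductive definition of the value of an expression under a covering translates directly into a recursive evaluator, and universal validity can then be decided by trying all coverings of the variables that occur. The sketch below is one possible encoding, not the book's formalism: expressions are nested tuples, variables are strings, and the functor names `"~"`, `"and"`, `"or"`, `"->"`, `"<->"` are assumptions made for the example.

```python
from itertools import product

def value(expr, f):
    """Truth value of expr under the covering f (a dict variable -> 0 or 1)."""
    if isinstance(expr, str):              # a propositional variable
        return f[expr]
    op = expr[0]
    if op == "~":
        return 1 - value(expr[1], f)
    a, b = value(expr[1], f), value(expr[2], f)
    if op == "and":
        return min(a, b)
    if op == "or":
        return max(a, b)
    if op == "->":
        return max(1 - a, b)
    if op == "<->":
        return 1 if a == b else 0
    raise ValueError(f"unknown functor: {op!r}")

def variables(expr):
    """Set of variables occurring in expr."""
    if isinstance(expr, str):
        return {expr}
    return set().union(*(variables(sub) for sub in expr[1:]))

def is_tautology(expr):
    """True if expr has the value 1 under every covering of its variables."""
    names = sorted(variables(expr))
    return all(value(expr, dict(zip(names, bits))) == 1
               for bits in product((0, 1), repeat=len(names)))

ex_a = ("->", "p", ("->", "q", "p"))
ex_b = ("->", "p", ("->", ("->", "p", "q"), "q"))
ex_c = ("->", ("and", ("->", "p", "q"), ("->", "p", ("~", "q"))), ("~", "p"))

print(is_tautology(ex_a), is_tautology(ex_b), is_tautology(ex_c))
print(is_tautology(("->", "p", "q")))   # not universally valid
```

An expression with n variables requires 2^n coverings, so this check always terminates but becomes expensive as n grows.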


Logical inference serves to obtain new true propositions from given propositions that have already 
been established to be true. Therefore the rules of inference to be used must carry over the 
truth of an expression to the deduced expression. In the derivation of such rules of inference tautologies 
play a particular role: every tautology of the form H → G leads to a rule of inference. The conditions




for the application of a rule, the premisses, are written down above a horizontal line, the result
of the application, the conclusion, below. A system S of rules of inference determines a relationship
‘A can be derived from S’, in symbols S ⊢ A.


Examples of rules of inference, in which H, G, F denote expressions and S a set of expressions.

3. p → ((p → q) → q) leads to the rule:

    S ⊢ H,  S ⊢ H → G
    ------------------
    S ⊢ G

4. The chain of inferences (p → q) → ((q → r) → (p → r)) leads to the rule:

    S ⊢ H → G,  S ⊢ G → F
    ----------------------
    S ⊢ H → F

5. The contraposition (p → ~q) → (q → ~p) leads to the rule:

    S ⊢ H → ~G
    -----------
    S ⊢ G → ~H

6. The principle of contradiction (p → q) → ((p → ~q) → ~p) leads to the rule:

    S ⊢ H → G,  S ⊢ H → ~G
    -----------------------
    S ⊢ ~H


The rules of inference of the propositional calculus do not take account of the finer structure of 
propositions; further-reaching rules of inference are treated in the predicate calculus. 


15.2. Predicate logic 


The expressions of the propositional calculus are not sufficient to formulate the facts occurring 
in mathematics; a formalized version of the mathematical language must be considerably richer. 
A characteristic feature is the frequent use of variables and of special symbols for functions or for 
relations. Variables are preassigned symbols, which denote arbitrary objects of a previously delineated 
domain. Symbols whose meaning is fixed are called constants, such as 0 and + within the domain 
of natural numbers. 

A further feature of the mathematical language is the possibility of binding variables by means 
of the quantifiers of predicate logic. 

In the expression ‘there exist prime numbers p and q such that 2n = p + q’ the symbols p and q
are bound by the predicate-logical functor ‘there exists ...’, while the variable n is free. It has turned
out that for the purposes of binding variables in mathematics the two operations of predicate logic ∃
‘there is ...’ and ∀ ‘for all ...’ are sufficient. Therefore the languages of predicate logic are based
on this kind of binding variables only.

In predicate logic the finer structure of mathematical statements is investigated; for example, the 
propositional calculus is incapable of grasping the statement about rational numbers 


∀x ∀y ∃z (x < y → x < z < y).


Syntax of elementary languages. The propositions of a mathematical theory contain as fundamental
concepts certain predicates and functions; for example, in the theory of sets the relationship ∈
‘... is an element of ...’, in geometry the relationships of incidence and of betweenness, in arithmetic
addition, multiplication and the relationship of order. For these fundamental concepts symbols are
introduced, which taken together form the signature of this theory. A signature therefore consists
of symbols for relations, for functions, and for individuals. Each of these symbols has its appropriate
valency or ‘arity’. In the signature Σ = {+, ·, <, 0, 1} of elementary arithmetic + and · are binary
operational symbols, < is the symbol of a binary relation, and 0 and 1 are symbols for individuals.

Apart from the symbols in Σ a mathematical theory uses variables for individuals, such as the
symbols x, y, z, ..., logical symbols such as ~, ∧, ∨, →, ↔, =, ∃, ∀ and auxiliary technical symbols.

Just as in the propositional calculus, an elementary language L_Σ (or language of the predicate
calculus) can now be defined with reference to a given signature Σ by means of these fundamental
symbols. Its elements are certain sequences of symbols called expressions or propositional forms.
The construction of these expressions takes place after introducing the so-called terms.


Example 7: If sin, +, · are the usual function symbols interpreted in the domain of real numbers,
then the following sequences of symbols are terms: sin x, x² · y + y² + z³, sin (x + sin (y² + x)).




(ii) If A and B are expressions, then so are ~A, (A ∧ B), (A ∨ B), (A → B), (A ↔ B).

(iii) If A(x) is an expression containing the variable x, but not the symbols ∃x or ∀x, then so
are ∃x A(x) and ∀x A(x).

(iv) A sequence of symbols is an expression only if it is formed in accordance with (i) to (iii).


Example 8: In the signature Σ = {P, Q, R, f, g, T} the following sequences of symbols are
expressions: ∀x(Rxy → Qxf(y)), ~∃x(Rxy ∨ Qxg(y, x)), ∀x(Px ∧ ∃y(Tyx → Sxy)).

Just as in the propositional calculus, it can be decided in finitely many steps whether a given 
sequence of symbols is an expression or not. 

A variable x occurs in an expression H freely if x occurs in H but not ∃x or ∀x; x is quantified
in H if ∃x or ∀x occurs in it. After every place of the form Qx, where Q is ∃ or ∀, there begins a
uniquely determined partial expression H′ of H in which the variable x without this symbol Qx
would be free. This partial expression H′ of H is called the domain of influence of the quantifier Q
for the place in question. Within it the variable x is quantified. A variable x occurs freely at a certain
place in an expression H if it occurs at that place and is neither quantified there nor within the
domain of influence of a quantifier. If there is at least one place in H at which a variable x occurs
freely, then x is said to occur freely in H.


Example 9: In the expression ∃x[Px ∧ Qy ∧ g(y) = z] → (∃x ∀y[Ry ∧ f(x, z) = z])
the places are numbered from left to right, every symbol and every bracket counting as one place.
The variable y occurs freely at the places 8
and 12 and bound at 22 and 25; it is quantified at 22, and at 25 it is within the domain of influence
of the place 21.

In general, expressions are not propositions. The expression x < y, in which < denotes the order
among natural numbers, only becomes a proposition if definite symbols are substituted for the
variables x and y, for example, 0 < 1, 3 < 2, 5 < 7, or when the free variables are bound by
quantifiers, for example, ∀x ∃y x < y. Propositions can therefore be characterized as those expressions
in which there are no free variables.
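
The characterization of propositions as expressions without free variables can be turned into a short recursive test. The sketch below is an illustration only: the nested-tuple encoding of terms and expressions and the names term_vars, free_vars and is_proposition are assumptions made for this example, not notation from the text.

```python
def term_vars(term):
    """Variables of a term: a variable name or ("fn", symbol, [subterms])."""
    if isinstance(term, str):
        return {term}
    return set().union(*[term_vars(t) for t in term[2]])

def free_vars(expr):
    """Set of variables that occur freely in an expression."""
    tag = expr[0]
    if tag == "rel":                        # ("rel", symbol, [terms])
        return set().union(*[term_vars(t) for t in expr[2]])
    if tag == "not":                        # ("not", A)
        return free_vars(expr[1])
    if tag in ("and", "or", "imp", "iff"):  # (tag, A, B)
        return free_vars(expr[1]) | free_vars(expr[2])
    if tag in ("exists", "forall"):         # (tag, variable, A)
        return free_vars(expr[2]) - {expr[1]}
    raise ValueError("unknown expression: %r" % (expr,))

def is_proposition(expr):
    """An expression is a proposition if no variable occurs freely in it."""
    return not free_vars(expr)

open_expr = ("rel", "<", ["x", "y"])                           # x < y
closed_expr = ("forall", "x", ("exists", "y", open_expr))      # ∀x ∃y x < y
print(free_vars(open_expr), is_proposition(closed_expr))
```

Constants for individuals can be written in this encoding as function symbols without arguments, for example ("fn", "0", []).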


Examples of propositions:

10. The monotonic law for addition of natural numbers: ∀x ∀y ∀z (x < y → x + z < y + z).

11. Fermat's conjecture: ~∃x ∃y ∃z ∃n (n > 2 ∧ x^n + y^n = z^n).

12. Goldbach's conjecture: ∀x[2 | x ∧ x ≠ 2 ∧ x ≠ 0 → ∃y ∃z (prime y ∧ prime z ∧ x = y + z)];
here prime y is an abbreviation for y ≠ 1 ∧ ∀u ∀v (y = u · v → u = 1 ∨ v = 1), and 2 | x is an
abbreviation for ∃y (y + y = x). In words the expression reads: ‘For all natural numbers x, if
x is even, non-zero, and not 2, then there exist prime numbers y and z such that x is the sum of
y and z’.

A generalization of the elementary languages is obtained if one allows quantification of unary 
predicates, that is, if one treats them like individuals. In such languages, which are also called monadic 
of the second order, considerably more statements can be expressed than in the elementary languages. 


Examples of propositions in monadic languages of the second order:
13. Peano's axiom for the natural numbers:
∀P(P0 ∧ ∀x(Px → Px′) → ∀x Px),


in words: ‘If a unary predicate P holds for 0, and if P holds, together with an element x, also for
its successor x′, then P holds for all natural numbers’.


14. Axiom of the least upper bound for real numbers:
∀P[∃z Pz ∧ ∃u ∀v(Pv → v ≤ u) → ∃y(∀v(Pv → v ≤ y) ∧ ~∃y′(∀v(Pv → v ≤ y′) ∧ y′ < y))];
in words: ‘Every non-empty set of real numbers that is bounded above has a least upper bound’.


The languages of predicate logic are descriptive, that is, the expressions of such languages describe 
relationships prevailing in mathematical structures. 

With the development of data processing by machines algorithmic languages gain in importance. 
Algorithmic languages have the purpose of making commands, initiating actions, and directing 
processes. Examples of algorithmic languages used in programming technology are ALGOL 60, 
PL 1, FORTRAN, COBOL and others. 

Some algorithmic elements are contained even in elementary languages: a term can be regarded 
as a sequence of commands to be carried out; for example, (x + 1)- y denotes the sequence ‘add 1 
to x and then multiply the result by y’. 


Semantics of elementary languages. Just as in the propositional calculus, semantics establishes a
connection between the expressions of L_Σ and the realm of mathematical structures in which the
expressions have a meaning.




Let Σ be a set of relational and operational symbols, and S a non-empty set. By an interpretation
of Σ in S one understands a mapping δ that assigns to every n-ary relation symbol R in Σ an n-ary
relation R^δ on S, that is, a subset of S^n, and to every n-ary operational symbol F an n-argument
function F^δ on S, that is, a single-valued mapping of S^n into S. The equality symbol = is always
interpreted by the relationship of identity.

Let δ_S denote the sequence of the relations and operations assigned to the symbols in Σ.

A Σ-structure, Σ-algebra, or Σ-model is defined as an ordered pair 𝔖 = (S, δ_S). Let K_Σ denote
the class of all Σ-structures. The symbols contained in Σ always refer to the class K_Σ. A truth
concept can now be set up for the elementary languages L_Σ, that is, a definition of the statement:
‘the proposition H is true within the structure 𝔖’, symbolically 𝔖 ⊨ H. This concept is fundamental
for the whole of semantics. The truth concept in elementary languages can be made precise by
introducing first a more general concept: ‘the S-cover α satisfies the expression H in 𝔖’, symbolically
𝔖 ⊨_α H.

By an S-cover α one means a function that assigns to every individual variable an element of S.
Such a cover α can be extended in a natural way, just as in the propositional calculus, to a mapping α
of all terms of L_Σ into S: c^α = c^δ, where c is a constant for an individual in Σ, and

(F(t_1, ..., t_n))^α = F^δ(t_1^α, ..., t_n^α).

Example 15: Let t = (x + 1) · y, α(x) = 2, α(y) = 3. Then t^α = (x^α + 1) · y^α = (2 + 1) · 3 = 9.
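
The extension of a cover to all terms is a straightforward recursion over the structure of the term, which also illustrates the reading of a term as a sequence of commands. A minimal sketch follows; the tuple encoding of terms, the dictionary functions standing for the interpretation, and the name evaluate are assumptions of this illustration.

```python
def evaluate(term, alpha, functions):
    """Value of a term under the cover alpha.

    A term is a variable name or ("fn", symbol, [subterms]); functions maps
    every function symbol to the operation that interprets it.
    """
    if isinstance(term, str):
        return alpha[term]
    _, symbol, args = term
    return functions[symbol](*[evaluate(a, alpha, functions) for a in args])

# Interpretation of +, * and the constant 1 in the natural numbers.
functions = {"+": lambda a, b: a + b, "*": lambda a, b: a * b, "1": lambda: 1}

# The term (x + 1) * y of Example 15 with alpha(x) = 2 and alpha(y) = 3.
t = ("fn", "*", [("fn", "+", ["x", ("fn", "1", [])]), "y"])
alpha = {"x": 2, "y": 3}
print(evaluate(t, alpha, functions))  # 9
```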
Definition of the relation ‘α satisfies A in 𝔖’, in symbols 𝔖 ⊨_α A, in which iff stands for
‘if and only if’.
(i) 𝔖 ⊨_α R t_1 ... t_n iff (t_1^α, ..., t_n^α) ∈ R^δ, that is, R holds for the n-tuple (t_1^α, ..., t_n^α);
(ii) 𝔖 ⊨_α ~A      iff not 𝔖 ⊨_α A;
     𝔖 ⊨_α A ∧ B   iff 𝔖 ⊨_α A and 𝔖 ⊨_α B;
     𝔖 ⊨_α A ∨ B   iff 𝔖 ⊨_α A or 𝔖 ⊨_α B;
     𝔖 ⊨_α A → B   iff 𝔖 ⊨_α A implies 𝔖 ⊨_α B;
     𝔖 ⊨_α A ↔ B   iff 𝔖 ⊨_α A → B and 𝔖 ⊨_α B → A;
(iii) 𝔖 ⊨_α ∃x A(x) iff the value of α for the variable x can be modified so that the modified cover
      α′ satisfies the expression A(x) in 𝔖;
      𝔖 ⊨_α ∀x A(x) iff every cover α′ arising from α only by modifying the value for the variable x
      satisfies the expression A(x) in 𝔖.


Example 16: ∃x(y = x · x), where · denotes the multiplication of natural numbers. If α is a
covering of all variables such that y^α = 4, then α satisfies this expression; for the covering α′
that assigns to the variable x the value 2 and agrees with α for all the remaining variables satisfies
the expression y = x · x.


This example shows that whether 𝔖 ⊨_α A holds or not depends only on the variables occurring
freely in A. If A is a proposition, that is, an expression without free variables, then 𝔖 ⊨_α A holds
for all α or for no α.
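
When the domain is finite, the satisfaction relation can be evaluated directly by following its recursive definition. The sketch below is an illustration under stated assumptions: expressions are encoded as nested tuples, the names term_value and satisfies are hypothetical, and the infinite domain of Example 16 is replaced by the window 0, ..., 9, so that quantifiers range over this window only.

```python
def term_value(term, alpha, funcs):
    """Value of a term (variable name or ("fn", symbol, [subterms]))."""
    if isinstance(term, str):
        return alpha[term]
    _, symbol, args = term
    return funcs[symbol](*[term_value(a, alpha, funcs) for a in args])

def satisfies(expr, alpha, domain, rels, funcs):
    """Does the cover alpha satisfy expr in the structure (domain, rels, funcs)?"""
    tag = expr[0]
    if tag == "eq":        # equality is always interpreted as identity
        return term_value(expr[1], alpha, funcs) == term_value(expr[2], alpha, funcs)
    if tag == "rel":
        return tuple(term_value(t, alpha, funcs) for t in expr[2]) in rels[expr[1]]
    if tag == "not":
        return not satisfies(expr[1], alpha, domain, rels, funcs)
    if tag in ("and", "or", "imp", "iff"):
        a = satisfies(expr[1], alpha, domain, rels, funcs)
        b = satisfies(expr[2], alpha, domain, rels, funcs)
        return {"and": a and b, "or": a or b, "imp": (not a) or b, "iff": a == b}[tag]
    if tag in ("exists", "forall"):
        # modify the cover at the bound variable only
        results = (satisfies(expr[2], {**alpha, expr[1]: d}, domain, rels, funcs)
                   for d in domain)
        return any(results) if tag == "exists" else all(results)
    raise ValueError("unknown expression: %r" % (expr,))

DOMAIN = range(10)
FUNCS = {"*": lambda a, b: a * b}
square = ("exists", "x", ("eq", "y", ("fn", "*", ["x", "x"])))   # ∃x (y = x · x)

print(satisfies(square, {"y": 4}, DOMAIN, {}, FUNCS))   # x = 2 is a witness
print(satisfies(square, {"y": 5}, DOMAIN, {}, FUNCS))   # no witness exists
```

Changing the value that the cover assigns to x leaves both results unchanged, in agreement with the remark that only the freely occurring variables matter.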


Definition. (i) An expression A is valid in 𝔖, in symbols 𝔖 ⊨ A, if and only if every cover α
satisfies A in 𝔖, that is, if 𝔖 ⊨_α A holds for every S-cover α.


(ii) An expression A ∈ L_Σ is called universally valid (or valid in predicate logic) if A is valid in
every Σ-structure.


Examples: 17. The proposition ∀x ∃y x < y is valid in the domain of natural numbers; but it
is not universally valid, because it is false in a finite ordered set (S, <).

18. The proposition ∀x ∀y Rxy ∨ ~∀x ∀y Rxy is universally valid; indeed, 𝔖 ⊨ H or 𝔖 ⊨ ~H
holds for every proposition H and every structure 𝔖.
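
The failure of Example 17 in a finite ordered set can be seen by direct enumeration. The following small sketch assumes the usual order on the integers 0, ..., 4; the function name is illustrative.

```python
def every_element_has_a_larger_one(domain):
    """Truth value of 'for all x there is y with x < y' in (domain, <)."""
    return all(any(x < y for y in domain) for x in domain)

print(every_element_has_a_larger_one(range(5)))   # False: 4 has no larger element
```

In the natural numbers the witness y = x + 1 exists for every x, which is why the proposition is valid there although it is not universally valid.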


An expression H is said to be indecomposable in propositional logic if it begins with a quantifier,
that is, if H is of the form H = QxH′, where Q stands for ∃ or ∀. Every expression is composed
from indecomposable expressions by means of the functors ~, ∧, ∨, →, and ↔. When one substitutes
propositional variables for the indecomposable components of an expression, one obtains an
expression of the propositional calculus; for example, the expression ∀x ∀y Rxy ∨ ~∀x ∀y Rxy
goes over into the tautology p ∨ ~p. An expression H is said to be universally valid in propositional
logic if the corresponding expression of propositional logic is universally valid within the framework
of the propositional calculus. If H is universally valid in propositional logic, then it is also universally
valid in predicate logic. However, there are expressions that are universally valid in predicate logic,
but not in propositional logic; an example is: ∀x Px → ∃x Px. This is the reason for the occurrence
in predicate logic of methods of inference that are not available in the propositional calculus.




Let Σ = {+, 0}, and let S ⊆ L_Σ be the following set of expressions:
(i) ∀x ∀y ∀z ((x + y) + z = x + (y + z)),    (iii) ∀x ∃y (x + y = 0),
(ii) ∀x (x + 0 = x),                         (iv) ∀x ∀y (x + y = y + x).
Then a Σ-structure M = (M, +, 0) is a model of S if and only if M is an additively written Abelian
group, and Mod S is the class of all Abelian groups.
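
For a finite structure, the question whether it is a model of these four expressions can be settled by checking every quantifier over the whole domain. The sketch below does this for two candidate structures on {0, ..., 5}; the function name and the two sample operations are assumptions chosen for illustration.

```python
def is_model_of_abelian_axioms(M, add, zero):
    """Check expressions (i) to (iv) in the finite structure (M, add, zero)."""
    assoc = all(add(add(x, y), z) == add(x, add(y, z))
                for x in M for y in M for z in M)
    neutral = all(add(x, zero) == x for x in M)
    inverse = all(any(add(x, y) == zero for y in M) for x in M)
    commutative = all(add(x, y) == add(y, x) for x in M for y in M)
    return assoc and neutral and inverse and commutative

Z6 = range(6)
print(is_model_of_abelian_axioms(Z6, lambda x, y: (x + y) % 6, 0))   # True
print(is_model_of_abelian_axioms(Z6, lambda x, y: min(x + y, 5), 0)) # False
```

Addition modulo 6 yields an Abelian group, whereas the truncated addition min(x + y, 5) violates (iii), since 1 has no inverse.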


Two expressions H and G are called semantically or logically equivalent if the expression H ↔ G
is universally valid.


Examples of logical equivalences:
19. ~∃x A(x) ≡ ∀x ~A(x).    20. ~∀x A(x) ≡ ∃x ~A(x).
21. Qx A(x) ≡ Qy A(y) if y does not occur in A(x) nor x in A(y) and Q is one of the quantifiers
∃ or ∀.
22. ∀x[A(x) ∧ B(x)] ≡ ∀x A(x) ∧ ∀x B(x).    23. ∃x[A(x) ∨ B(x)] ≡ ∃x A(x) ∨ ∃x B(x).
Expressions of the form Q_1x_1 ... Q_nx_n A(x_1, ..., x_n), where each Q_i is ∃ or ∀ and A is free from
quantifiers, are said to be in prenex form.


Every expression is logically equivalent to an expression in prenex form of the same signature 
and with the same free variables. 


Examples of the transformation of an expression into a prenex form logically equivalent to it.


24. ∀x ∀y[x < y → ∃z(x < z < y)] ≡ ∀x ∀y ∃z(x < y → x < z < y).
25. ∀x ∃y ∀z Qxyz → ∀y ∃z ~Ryz ≡ ∀u ∃v ∃x ∀y ∃z ~(Qxyz ∧ Ruv).
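
A prenex transformation such as Example 25 can be tested, though not proved, by comparing both sides in all structures over a small domain. The sketch below runs through every ternary relation Q and every binary relation R on the two-element domain {0, 1}; agreement on this domain is only supporting evidence, and all names are assumptions of the illustration.

```python
from itertools import product

D = (0, 1)
TRIPLES = list(product(D, repeat=3))
PAIRS = list(product(D, repeat=2))

def subsets(items):
    """All subsets of a finite list, each as a Python set."""
    for mask in product([False, True], repeat=len(items)):
        yield {item for item, keep in zip(items, mask) if keep}

def lhs(Q, R):
    """(for all x, exists y, for all z: Qxyz) -> (for all y, exists z: not Ryz)."""
    antecedent = all(any(all((x, y, z) in Q for z in D) for y in D) for x in D)
    consequent = all(any((y, z) not in R for z in D) for y in D)
    return (not antecedent) or consequent

def rhs(Q, R):
    """for all u, exists v, exists x, for all y, exists z: not (Qxyz and Ruv)."""
    return all(any(any(all(any(not ((x, y, z) in Q and (u, v) in R)
                               for z in D) for y in D) for x in D)
                   for v in D) for u in D)

agree = all(lhs(Q, R) == rhs(Q, R)
            for Q in subsets(TRIPLES) for R in subsets(PAIRS))
print(agree)
```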


Mathematical inference. Mathematical inference serves to obtain new true propositions from
given true propositions. The backbone of mathematical inference is the conclusion. If S is a set of
propositions that are true in a structure 𝔖 and if a proposition A can be deduced from S, then A
must preserve its truth in 𝔖, that is, the proposition A must be true in the structure 𝔖.




For example, if S is the following system of axioms:
∀x ∀y ∀z (x · y) · z = x · (y · z),   ∀x ∀y x · y = y · x,   ∀x ∀y ∀z (y · x = z · x → y = z),


then a proposition H follows from S if and only if H holds in every commutative cancellation
semigroup. Therefore the set of consequences of S contains all the propositions of the elementary
theory of the class of all commutative cancellation semigroups.

In mathematical inference one does not always go back to the definition of a conclusion, but one 
makes use of certain rules of inference, which are hereditary under conclusion, that is, are valid for 
the conclusion process. 


Examples of conclusion-hereditary rules of inference.

26. Separation rule:

      S ⊩ H,  S ⊩ H → G
      ------------------
      S ⊩ G

27. Derivation rule:

      S ∪ {A} ⊩ B
      ------------
      S ⊩ A → B

28. Deduction theorem:

      S ⊩ A → B
      ------------
      S ∪ {A} ⊩ B

29. Indirect conclusion:

      S ∪ {A} ⊩ B,  S ∪ {A} ⊩ ~B
      ---------------------------
      S ⊩ ~A


For every system R of rules of inference of this kind one can define a relationship of derivability:
‘The expression A is derivable, or provable, from the set S’. An expression A is derivable or provable
from S by means of the R-rules if one can obtain A from certain initial expressions belonging to S
by applying rules in R finitely many times. A proof or a derivation of A can be regarded as a finite
sequence of expressions (F_1, ..., F_n, A) which can be obtained successively from S by application
of the rules in R. If the set R of rules and the initial set S are finite, then it can always be decided
in finitely many steps whether or not a given finite sequence of expressions is a proof.
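
The decision whether a finite sequence is a proof amounts to checking each member against the initial set and the rules. The sketch below does this for a system whose only rule is the separation rule; the encoding of H → G as the tuple ("imp", H, G) and the name is_proof are assumptions of this illustration.

```python
def is_proof(sequence, initial):
    """Check a finite sequence of expressions against a single rule.

    Each member must belong to the initial set or follow from two earlier
    members H and ("imp", H, G) by the separation rule.
    """
    derived = []
    for line in sequence:
        from_initial = line in initial
        by_separation = any(("imp", h, line) in derived for h in derived)
        if not (from_initial or by_separation):
            return False
        derived.append(line)
    return True

S = ["p", ("imp", "p", "q"), ("imp", "q", "r")]
proof_of_r = ["p", ("imp", "p", "q"), "q", ("imp", "q", "r"), "r"]

print(is_proof(proof_of_r, S))   # True
print(is_proof(["r"], S))        # False: r is not justified by anything
```

The check visits every member once and compares it with finitely many earlier members, so it terminates for every finite sequence.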


The conclusion relation, which is at the basis of every inference, can be characterized by a 
finite system of rules of inference. 


In accordance with this fundamental fact, in what follows a system of rules of inference is indicated
that is adapted as far as possible to the natural deduction process. For each of the functors ~, ∧, ∨,
→, ↔, ∃, ∀ two rules of inference are given, one to introduce it and one to remove it. The relationship
of derivability laid down by these rules of inference is denoted by the symbol ⊢.


Definition of a system of rules of inference.

(0a)  A ∈ S                          (0b)  S ⊢ A,  S ⊆ S′
      ------                               ---------------
      S ⊢ A                                S′ ⊢ A

(1a)  S, A ⊢ B                       (1b)  S ⊢ A,  A → B
      ----------                           --------------
      S ⊢ A → B                            S ⊢ B

The rule (1a) corresponds to the deduction theorem for conclusion (see Example 28).

(2a)  S, A ⊢ B,  ~B                  (2b)  S, ~A ⊢ B,  ~B
      --------------                       ---------------
      S ⊢ ~A                               S ⊢ A

(3a)  S ⊢ A,  B                      (3b)  S ⊢ A ∧ B
      ----------                           ----------
      S ⊢ A ∧ B                            S ⊢ A,  B

(4a)  S ⊢ A                          (4b)  S ⊢ A ∨ B,  A → C,  B → C
      -----------------                    --------------------------
      S ⊢ A ∨ B,  B ∨ A                    S ⊢ C

(5a)  S ⊢ A → B,  B → A              (5b)  S ⊢ A ↔ B
      ------------------                   ------------------
      S ⊢ A ↔ B                            S ⊢ A → B,  B → A

(6a)  S ⊢ A(t)                       (6b)  S ⊢ ∃x A(x),  A(y) → B
      ------------                         -----------------------
      S ⊢ ∃x A(x)                          S ⊢ B

In (6a) t is an arbitrary term, in (6b) y does not occur in B nor in S.

(7a)  S ⊢ A(y)                       (7b)  S ⊢ ∀x A(x)
      ------------                         ------------
      S ⊢ ∀x A(x)                          S ⊢ A(t)

In (7a) y does not occur in S.

                                     (8b)  S ⊢ A(t),  t = t′
                                           ------------------
                                           S ⊢ A(t′)

In (8a, b) t and t′ denote terms of the language L_Σ.
All these rules are conclusion-hereditary, that is, ‘if S ⊢ A, then S ⊩ A’ holds.
Apart from these rules mathematical inference makes use of a number of other rules of inference 
that can be derived from the given ones. Examples are the following: 


(1c)  S ⊢ A → B          (2c)  S ⊢ ~~A          (4c)  S ⊢ A ∨ B
      ----------               --------               -----------
      S, A ⊢ B                 S ⊢ A                  S, ~A ⊢ B

(6c)  S, A(x) ⊢ B                                    (8c)  S ⊢ A(t),  t = t′
      ---------------  (x not free in B nor in S)          ------------------
      S, ∃x A(x) ⊢ B                                       S ⊢ A(t // t′)


In (8c) A(t // t′) means that the term t need not be replaced by the term t′ at all places
where it occurs in A(t).
The following important theorem holds for the relationship of derivability. 


Theorem on the completeness of the relationship of derivability. For every set S of formulae in
L_Σ and every formula A ∈ L_Σ, S ⊢ A holds if and only if S ⊩ A. In particular, ∅ ⊢ A if and only
if ∅ ⊩ A, that is, A is universally valid if and only if A is derivable without axioms, in other words,
from the empty set.


Example 30: Strictly formal derivation of a number-theoretical statement. By way of explanation, 
after the lines of the formal proof the meaning of the expressions and the natural conclusions 
corresponding to the formal derivations are added in square brackets. 

∀x (~∃y x = 3 · y → ∃z x² - 1 = 3 · z).
[For all integers x, if x is not divisible by 3, then x² - 1 is divisible by 3.]
Let B(x, z) be an abbreviation for x² - 1 = 3 · z. According to rule (7a) it is sufficient
to show that S ⊢ ~∃y a = 3 · y → ∃z B(a, z).
[It is enough to prove the assertion for a fixed but arbitrary integer a.]
By (1a) it is sufficient to show that
S, ~∃y a = 3 · y ⊢ ∃z B(a, z).
[Assuming that a is not divisible by 3, it has to be shown that a² - 1 is divisible by 3.]
The fact that S ⊢ ∃x a = 3 · x ∨ ∃x a + 1 = 3 · x ∨ ∃x a - 1 = 3 · x is taken to be
known. By (4c) then
S, ~∃x a = 3 · x ⊢ ∃x a + 1 = 3 · x ∨ ∃x a - 1 = 3 · x.
[Since a is not divisible by 3, either a + 1 or a - 1 is divisible by 3.]




By (4b) it is now enough to show that

(i) S, ∃x a + 1 = 3 · x ⊢ ∃z B(a, z) and (ii) S, ∃x a - 1 = 3 · x ⊢ ∃z B(a, z).
[Two cases are to be considered: (i) a + 1 is divisible by 3, (ii) a - 1 is divisible by 3.]

Only (i) will be proved; similar arguments hold for (ii). By (6c) it is enough to show that

S, a + 1 = 3 · b ⊢ ∃z B(a, z) and by (6a) S, a + 1 = 3 · b ⊢ B(a, t) for a certain term t.
[Let b be one of the elements x for which a + 1 = 3 · x; it is sufficient to indicate a number t for
which a² - 1 = 3 · t.]

Now S ⊢ (a + 1) (a - 1) = a² - 1, hence by application of the rules (8a), (8b), and (8c)

S, a + 1 = 3 · b ⊢ a² - 1 = 3 · b · (a - 1),

that is, the term t = b · (a - 1) has the required properties.
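
The content of Example 30 can be checked numerically for a range of integers. The sketch below follows the case distinction of the proof; the function name witness is an assumption, and the term b · (a + 1) used in case (ii) is supplied here by analogy with case (i), which is the only case worked out above.

```python
def witness(a):
    """For a not divisible by 3, return t with a*a - 1 == 3*t."""
    if (a + 1) % 3 == 0:          # case (i): a + 1 = 3 * b
        b = (a + 1) // 3
        return b * (a - 1)
    if (a - 1) % 3 == 0:          # case (ii): a - 1 = 3 * b
        b = (a - 1) // 3
        return b * (a + 1)
    raise ValueError("a is divisible by 3")

checked = [a for a in range(-20, 21)
           if a % 3 != 0 and a * a - 1 == 3 * witness(a)]
print(len(checked))   # number of integers in the range for which the check succeeded
```

Such a check covers finitely many values only; the formal derivation is what establishes the statement for all integers.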


15.3. Formalized theories 


The formalization of a theory proceeds in several steps. First of all, the domain of objects and 
the appropriate relations have to be laid down. On this level, the first mathematical concepts are 
obtained, by abstraction from real situations; for example, the fundamental geometrical concepts 
such as point and line arose by abstraction from reality. In the second step the concept of a proposition 
is made precise, and an interpretation of the propositions on the domain in question is defined. Finally, 
an axiom system and a relationship of derivability are given. An axiom system should aim at com- 
pleteness, that is, it should characterize the domain in question completely. This means: every 
proposition that is valid in this domain should be derivable from the system of axioms. 

Most mathematical theories refer to certain classes of structures. The theory of a class K of
structures can be identified with the set of propositions that are valid in every structure of this class.


Th K is the elementary theory of the class K of structures. A set X of propositions is an axiom
system for a theory T if X* = T and if X is decidable, that is, if for every expression H ∈ L_Σ it
can be decided in finitely many steps whether H ∈ X or H ∉ X.


Examples of formalized elementary theories.
31. The theory of fields with Σ = {+, ·, 0, 1} is characterized by the following system of axioms:
∀x ∀y ∀z [x + (y + z) = (x + y) + z],   ∀x [x + 0 = 0 + x = x],
∀x ∃y (x + y = 0),                      ∀x ∀y (x + y = y + x),
∀x ∀y ∀z [(x · y) · z = x · (y · z)],   ∀x ∀y (x · y = y · x),
∀x (x · 1 = x),                         ∀x [x ≠ 0 → ∃y (x · y = 1)],
∀x ∀y ∀z [(x + y) · z = x · z + y · z].
32. The theory of linearly ordered sets with Σ = {<} is characterized by:
~∃x (x < x),   ∀x ∀y ∀z (x < y ∧ y < z → x < z),
∀x ∀y (x = y ∨ x < y ∨ y < x).
33. The theory of groups with Σ = {·, 1} is characterized by:
∀x ∀y ∀z [(x · y) · z = x · (y · z)],   ∀x (x · 1 = x),
∀x ∃y (x · y = 1).

Definability in formalized theories. Frequently in a mathematical theory T, apart from the
fundamental symbols given by the signature Σ, new concepts, predicates, and operations are defined.
For example, in the arithmetic of natural numbers the relationship x | y of divisibility ‘x divides y’
can be defined as follows: x | y =def ∃z (y = x · z), or the relation a ≤ b as follows: a ≤ b
=def ∃x (a + x = b). Such explicit definitions themselves can be characterized as formal propositions
of a particular kind. If in this example one extends the initial signature Σ = {+, ·, 0, 1} of elementary
arithmetic by adding the binary predicate symbol |, then to the arithmetical axioms the proposition
∀x ∀y [x | y ↔ ∃z (y = x · z)] can be added, which is called the definition of the predicate x | y.
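
In the natural numbers both defining expressions can be evaluated by a bounded search, because a witness, if one exists, never exceeds the right-hand argument. The sketch below relies on that observation; the function names divides and leq are illustrative assumptions.

```python
def divides(x, y):
    """x | y, defined as: there exists z with y = x * z (z ranges over 0..y)."""
    return any(y == x * z for z in range(y + 1))

def leq(a, b):
    """a <= b, defined as: there exists x with a + x = b (x ranges over 0..b)."""
    return any(a + x == b for x in range(b + 1))

print(divides(3, 12), divides(5, 12), leq(3, 5), leq(5, 3))
```

Both functions use only addition, multiplication and equality, which mirrors the fact that the defined predicates add nothing that could not already be expressed in the original signature.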

Also functional and individual symbols can be introduced by definition. An explicit definition
of an n-argument function F in a theory T has the following form:


∀x_1 ... ∀x_n ∀y [F x_1 ... x_n = y ↔ A(x_1, ..., x_n, y)],
where it is assumed that in T the propositions
∀x_1 ... ∀x_n ∃y A(x_1, ..., x_n, y) and ∀x_1 ... ∀x_n ∀y ∀z [A(x_1, ..., x_n, y) ∧ A(x_1, ..., x_n, z) → y = z]
are derivable.




Definition. If in an elementary theory T with the signature Σ a relation R is an element of
Σ and if Σ′ ⊆ Σ \ {R} is a subset of Σ, then the relation R is said to be explicitly definable in T
if there is a definition for R derivable in T whose defining expression contains symbols of Σ′
only.


If in a theory T a relation R is explicitly definable by means of the remaining relations, then
every expression can be transformed equivalently in T into an expression that does not contain
the symbol R. Consequently definable relations are dispensable. But from the methodological
point of view the search for suitable definitions is just as important as the search for a suitable
proof.

If a predicate R is definable within a theory T by means of predicates Q_1, ..., Q_n, then in every
model of T the interpretation of R is uniquely determined by the interpretations of the Q_i. Hence
the following principle is valid.


Padoa's principle. The fact that a predicate R cannot be defined within a theory T in terms of
predicates Q_1, ..., Q_n can be established by indicating two models M and M′ that differ solely in
the meaning of R.


Axiomatic definitions are of a different kind. They aim at grasping a concept or a relation of a 
domain of objects axiomatically, that is, to characterize them by means of a set of propositions. 
For a class K of structures this means that an axiom system is to be given for the elementary theory 
of K. 


15.4. Algorithms and recursive functions 


Within the framework of mathematics and logic, algorithms made their appearance as general 
methods for the solution of all problems of a given class. Their purpose is to describe processes in 
such a way that afterwards they can be imitated or governed by a machine. Examples of algorithmic 
processes are logical inference and certain calculating processes occurring in mathematics, in par- 
ticular, solution methods for various types of equations. 

A characteristic feature of an algorithm is that it transforms given quantities (input data) into 
other quantities (output data) on the basis of a system of transformation rules. But it only makes 
sense to talk of an algorithm if certain additional conditions are satisfied: 


(i) The system of quantities to be transformed into one another must be given effectively; 

(ii) the algorithm must be describable by finitely many rules because no machine can store in- 
finitely many rules; 

(iii) the transformation of quantities, the working of the algorithm proceeds in the form of mecha- 
nical working units, each unit consisting in the application of one of the given rules. 


During the period 1931-1947, within the framework of mathematical logic, a number of well 
delineated concepts of an ‘algorithm’ were developed, making the intuitive notion more precise. 
The most important are the calculus of equations (J. HERBRAND, K. GÖDEL, S. C. KLEENE, about
1931 to 1936), the Turing machine (A. M. TURING 1936), the λ-calculus (A. CHURCH 1936) and
the algorithmic concepts of E. L. POST (1936) and A. A. MARKOV (1947).

Of great significance is the fact that all these notions are equivalent in the sense that the same 
number-theoretical functions, the so-called recursive functions, can be calculated by each of them. 
Here a number-theoretical function is understood to be one that is defined in the domain of natural 
numbers. On the basis of this equivalence one can take the view that the intuitive concept of an 
algorithm is comprehended in the newly gained precision. This point of view was formulated in 
1936 by CHURCH and is known in the mathematical literature as Church’s hypothesis. 

A number-theoretical function f is said to be calculable if there is an algorithm by which the value
f(n) can be found for every value n of the argument.

Examples of calculable functions. 

34. Let f(x) be the xth prime number. Here the method of the sieve of Eratosthenes (see
Chapter 1.) can be used to calculate the function.

35. Let f(x, y) be the greatest common divisor of x and y. This function can be calculated by
means of Euclid's algorithm (see Chapter 1.).

36. Let f(x) be the xth digit in the decimal representation of π = 3.14159... Here a convergent
series representation of π is suitable for the calculation of the function (see Chapter 21.).
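
The first two examples can be written down in a few lines. The following is a minimal sketch, assuming that primes are counted from f(1) = 2 and that the sieve is simply repeated with a doubled bound until enough primes have been found; the function names are illustrative.

```python
def nth_prime(x):
    """The x-th prime (x >= 1), found with the sieve of Eratosthenes."""
    limit = 16
    while True:
        sieve = [True] * (limit + 1)
        sieve[0] = sieve[1] = False
        for p in range(2, int(limit ** 0.5) + 1):
            if sieve[p]:
                for multiple in range(p * p, limit + 1, p):
                    sieve[multiple] = False
        primes = [n for n, is_prime in enumerate(sieve) if is_prime]
        if len(primes) >= x:
            return primes[x - 1]
        limit *= 2                 # not enough primes yet: enlarge the sieve

def gcd(x, y):
    """Greatest common divisor by Euclid's algorithm."""
    while y:
        x, y = y, x % y
    return x

print(nth_prime(10), gcd(48, 18))  # 29 6
```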


The class of recursive functions arises by making the intuitive concept of a calculable function 
more precise. Certain initial functions, which can be regarded to be immediately calculable, are 
called recursive, and certain rules are specified by means of which new recursive functions can be
generated from given ones. The rules are such that for every new recursive function one can indicate
at once an algorithm to calculate the function values if such algorithms are available for the given 
recursive functions. 


A. Initial functions 


(i) The identity functions I_m^n (1 ≤ m ≤ n) are defined by the equations I_m^n(x_1, ..., x_n) = x_m;

(ii) the constant functions C_c^m are defined by the equations C_c^m(x_1, ..., x_m) = c, in which c is a fixed
natural number;

(iii) the successor function is defined by f(x) = x + 1.


B. Generating rules for functions 


(i) The substitution of functions. If f is a k-argument function and g_1, ..., g_k are n-argument
functions, then the relation g(x_1, ..., x_n) = f[g_1(x_1, ..., x_n), ..., g_k(x_1, ..., x_n)] determines
an n-argument function.

(ii) Primitive recursion. If h is a (k + 1)-argument and g a (k - 1)-argument function, then the
following system of equations determines a unique k-argument function f:


f(x_1, ..., x_{k-1}, 0) = g(x_1, ..., x_{k-1}),
f(x_1, ..., x_{k-1}, y + 1) = h(x_1, ..., x_{k-1}, y, f(x_1, ..., x_{k-1}, y)).
Existence and uniqueness of this function is guaranteed by Dedekind's justification theorem
(see Chapter 3.).

(iii) The formation of a minimum. If f is a (k + 1)-argument function such that for every
k-tuple (x_1, ..., x_k) of natural numbers there exists a number y with f(x_1, ..., x_k, y) = 0, then
a new function g is determined by the stipulation that g(x_1, ..., x_k) is the smallest y for
which f(x_1, ..., x_k, y) = 0.
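
The three generating rules can be pictured as operations that take functions and return a new function. The sketch below is one possible rendering in Python; the names substitute, primitive_recursion and minimum are assumptions, and the recursion is unrolled into a loop, which computes the same values as the defining equations.

```python
def substitute(f, *gs):
    """Rule (i): x -> f(g_1(x), ..., g_k(x))."""
    return lambda *xs: f(*(g(*xs) for g in gs))

def primitive_recursion(g, h):
    """Rule (ii): f(x, 0) = g(x), f(x, y + 1) = h(x, y, f(x, y))."""
    def f(*args):
        *xs, y = args
        value = g(*xs)
        for i in range(y):
            value = h(*xs, i, value)
        return value
    return f

def minimum(f):
    """Rule (iii): the smallest y with f(x, y) = 0 (assumed to exist)."""
    def g(*xs):
        y = 0
        while f(*xs, y) != 0:
            y += 1
        return y
    return g

successor = lambda x: x + 1
plus_two = substitute(successor, successor)
add = primitive_recursion(lambda x: x, lambda x, y, z: z + 1)
# Smallest y with (y + 1)^2 > x, that is, the integer part of the square root.
isqrt = minimum(lambda x, y: 0 if (y + 1) ** 2 > x else 1)

print(plus_two(5), add(3, 4), isqrt(10))  # 7 7 3
```

If the condition in rule (iii) fails for some argument, the loop in minimum does not terminate, which is exactly why the rule requires the existence of a suitable y.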


A number-theoretical function is said to be recursive if it is an initial function or if it can be
generated from initial functions in finitely many steps by the application of the rules stated. If
only the rules B (i) and B (ii) are admitted, then the resulting class of functions consists of the
so-called primitive recursive functions.


Examples of primitive recursive functions.
37. The sequence of Fibonacci numbers: f(0) = 1, f(1) = 1, f(x + 2) = f(x + 1) + f(x).
38. The function f(x, y) = x + y is obtained by primitive recursion from the primitive recursive
functions h(x, y, z) = z + 1 and I_1^1(x) = x:
x + 0 = I_1^1(x) = x,   x + (y + 1) = h(x, y, x + y).
39. The function g(x, y) = x · y results by primitive recursion from the primitive recursive
functions h′(x, y, z) = x + z and C_0(x) = 0:
g(x, 0) = x · 0 = C_0(x) = 0,   g(x, y + 1) = h′(x, y, x · y).
40. The function e(x, y) = x^y results by primitive recursion from the primitive recursive
functions h″(x, y, z) = x · z and C_1(x) = 1:
e(x, 0) = C_1(x) = 1,   e(x, y + 1) = h″(x, y, e(x, y)).
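
Examples 38 to 40 can be reproduced step by step, each function being built from the previous one. The sketch below assumes a helper primitive_recursion that unrolls the two defining equations into a loop; the helper and the names add, mul and power are illustrative.

```python
def primitive_recursion(g, h):
    """f(x, 0) = g(x), f(x, y + 1) = h(x, y, f(x, y))."""
    def f(*args):
        *xs, y = args
        value = g(*xs)
        for i in range(y):
            value = h(*xs, i, value)
        return value
    return f

identity = lambda x: x          # the identity function of one argument
const_0 = lambda x: 0           # the constant function with value 0
const_1 = lambda x: 1           # the constant function with value 1

add = primitive_recursion(identity, lambda x, y, z: z + 1)        # Example 38
mul = primitive_recursion(const_0, lambda x, y, z: add(x, z))     # Example 39
power = primitive_recursion(const_1, lambda x, y, z: mul(x, z))   # Example 40

print(add(3, 4), mul(3, 4), power(2, 5))  # 7 12 32
```

Only the successor step z + 1 is used as an arithmetical primitive; multiplication calls add and exponentiation calls mul, in the same order as in the examples.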


Owing to the previously mentioned equivalence of the various concepts of algorithm the view 
can be taken that the class of calculable functions coincides with the class of recursive functions. 


Church's hypothesis. A number-theoretical function is calculable if and only if it is recursive. 


The decision problem. A precise concept of algorithm was a necessary prerequisite for the
investigation of the question whether certain problems are algorithmically soluble. Such questions were
discussed even in the Middle Ages. For example, around 1300 Raymundus LULLUS developed the idea
of an Ars magna, by which he meant a general method of finding all possible truths. These ideas
reached their first climax when LEIBNIZ (1646-1716) recognized that strictly speaking the concept
of an Ars magna comprises two concepts, namely that of an Ars iudicandi, a decision method, and
that of an Ars inveniendi, a method of generating and axiomatizing. After LEIBNIZ these ideas were
not developed further. One of the reasons was that the formalizing and interpreting technique of
mathematical logic, which is necessary for such investigations, did not exist yet.

But by means of the recursive functions a precise version of the decision and generating method 
can be indicated. These concepts are first defined for sets of natural numbers. 


Definition of the decision and generating method.
(i) A set S of natural numbers is recursively enumerable if and only if there exists a recursive
function f whose range of values coincides with S. This function f evidently provides a generating
method for the set S.




The set of all even numbers is decidable. 
The set of all Fibonacci numbers is decidable. 
The set of all prime numbers is decidable. 
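
For the three sets just named, a decision method can be written down explicitly. The sketch below assumes the usual reading that a set is decidable when membership can be settled for every natural number in finitely many steps; the Fibonacci numbers are taken with f(0) = f(1) = 1 as in Example 37, and the function names are illustrative.

```python
import math

def is_even(n):
    return n % 2 == 0

def is_fibonacci(n):
    """Generate Fibonacci numbers until n is reached or passed."""
    a, b = 1, 1
    while a < n:
        a, b = b, a + b
    return a == n

def is_prime(n):
    """Trial division by all candidates up to the square root of n."""
    return n >= 2 and all(n % d != 0 for d in range(2, math.isqrt(n) + 1))

print([n for n in range(15) if is_fibonacci(n)])  # [1, 2, 3, 5, 8, 13]
print([n for n in range(15) if is_prime(n)])      # [2, 3, 5, 7, 11, 13]
```

The test for Fibonacci numbers also shows how a generating method can yield a decision method when the generated values increase: the search may stop as soon as n has been passed.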


The original unrestricted concept of an algorithm does not refer to natural numbers only, but 
also to more general objects, for example, the algorithm for the differentiation of polynomials. 


Non-numerical algorithms can be reduced to recursive functions and recursive sets of natural 
numbers. 


Let K be a class of non-numerical input and output data; suppose that a one-to-one mapping 
of this class into the set of natural numbers is fixed. This mapping, which is called a codification, 
is assumed to be chosen so that 
(i) it is itself given by an algorithm; 

(ii) there exists an algorithm to decide whether a number is an image of a non-numerical object 
in K, and if so, to construct this object; 

(iii) such a codification is to be used only when there exists an algorithm to get hold of the non- 
numerical class K. 

By identifying the objects of a non-numerical class with their code numbers the decision problem 
for subclasses of K can be reduced to the decision problem for certain sets of natural numbers. 

Of particular importance are decision and axiomatization problems for mathematical theories, 
especially elementary theories. In the investigation of such questions one starts out from a codification 
® that assigns a natural number to every sequence of symbols taken from the set 

$A = \Sigma \cup \{\neg, \wedge, \vee, \rightarrow, \leftrightarrow, \exists, \forall, x_1, x_2, \dots\}$
of an elementary language. Such codifications are easy to find. 
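One way to obtain such a codification can be sketched as follows, under two simplifying assumptions that are not in the text: the alphabet is finite (variables are written as `x` followed by strokes), and ASCII stand-ins replace the logical symbols. The names `ALPHABET`, `encode` and `decode` are hypothetical. The sketch uses bijective base-k numeration, so that every natural number is the code of exactly one sequence of symbols.

```python
# Hypothetical finite alphabet: ASCII stand-ins for the logical symbols,
# "x" and "'" to build the variables x', x'', ..., and one relation symbol P.
ALPHABET = ["~", "&", "v", ">", "=", "E", "A", "x", "'", "(", ")", "P"]
K = len(ALPHABET)

def encode(symbols):
    """Map a sequence of symbols to a natural number (bijective base K)."""
    code = 0
    for s in symbols:
        code = code * K + ALPHABET.index(s) + 1
    return code

def decode(code):
    """Recover the sequence of symbols from its code number."""
    symbols = []
    while code > 0:
        code, digit = divmod(code - 1, K)
        symbols.append(ALPHABET[digit])
    return symbols[::-1]

formula = list("Ax'(P(x'))")
n = encode(formula)
print(n, "".join(decode(n)))
```

Both directions are given by algorithms, in the spirit of conditions (i) and (ii); in this particular sketch every natural number happens to be a code, so the membership test of (ii) is trivial.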


By means of these definitions the medieval attempts to create an Ars magna obtain an exact
meaning. The first important results in this direction are due to GÖDEL. He proved in 1930 that the
universally valid expressions of an elementary language are axiomatizable, in other words, can be
generated in the sense of the Ars inveniendi. Subsequently GÖDEL obtained an even more important
result: he proved that the elementary theory of numbers is not axiomatizable, that is, there is no
algorithm to produce precisely the propositions that are valid in the domain of natural numbers
$\mathbf{N} = (N, +, \cdot, 0, 1)$. Of course, such a proof can only be given when there is a general definition
of the concept of algorithm. GÖDEL based his proof on the concept of a recursive function and so
gave at the same time an example of an algorithmically unsolvable problem.

From then on a number of other elementary theories have been proved to be undecidable. 


The elementary theory of groups is undecidable. 

The elementary theory of fields is undecidable. 

If the signature $\Sigma$ contains an $n$-ary relation symbol with $n \ge 2$, then the set of the logically
valid expressions is undecidable.


In 1970 a famous problem was answered in the negative. This is Hilbert's tenth problem, which
he proposed in 1900, at the Second International Congress of Mathematicians in Paris: does there
exist a universal algorithm to decide whether an arbitrary Diophantine equation has a solution?

But there are also some decidable theories. 


The elementary theory of the field of real numbers is decidable. 
The elementary theory of Euclidean geometry is decidable. 
The elementary theory of Abelian groups is decidable. 
It has turned out that every sufficiently expressive theory is undecidable. This recognition of the
limitations and of the scope of the axiomatic method must be regarded as one of the most important
results of research on the foundations of mathematics.


16. Groups and fields 


16.1. Groups and semigroups
      Groups
      Homomorphisms
      Finite groups
      Topological groups
      Semigroups
16.2. Fields and algebraic equations
      Fields and integral domains
      Galois theory
      Applications


16.1. Groups and semigroups 
Groups 


Sets of elements or objects of which any two can be combined according to a specified rule and in
a particular order to obtain a third element of the set occur often in all branches of mathematics.


Examples: Ordinary addition and multiplication are operations on the sets of integers, rational
numbers, real numbers and complex numbers. Matrix multiplication is an operation on the set
of all $(n \times n)$-matrices, on the set of $(n \times n)$-matrices with non-zero determinant, and on the
set of $(n \times n)$-matrices whose determinant is 1.

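An operation can be defined on the set of permutations of a certain fixed number of objects,
by defining the product of two permutations to be the permutation obtained by performing
them one after the other (this is a special case of the composition of mappings). For the permutations

$p_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix}$ and $p_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 3 & 2 \end{pmatrix}$ the product is $p_1 \cdot p_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 4 & 2 \end{pmatrix}$, as can be seen by the
following scheme, in which the effect on the objects is made clear

$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix} \begin{pmatrix} 2 & 3 & 1 & 4 \\ 1 & 3 & 4 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 4 & 2 \end{pmatrix}$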


The product of two permutations of a fixed number of objects is another permutation of those 
objects. 
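A permutation $P = \begin{pmatrix} 1 & 2 & \dots & r & \dots & n \\ i_1 & i_2 & \dots & i_r & \dots & i_n \end{pmatrix}$ of $n$ objects can also be written as a product of cycles
by writing the image under $P$ of each element after the element. The element $i_r$, which follows $r$,
must itself occur in the top row and is then followed by its own image. The next step gives a further
element, and so on. This process breaks off after finitely many steps when the element $r$ is reached
again. This cycle can have at most $n$ elements. If it does not contain all the elements, a new cycle
is started; if $i_r = r$, the cycle can be written $(r)$, but is usually omitted from the product.

For example, the permutations $A = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 2 & 4 & 1 & 7 & 6 & 5 & 3 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 7 & 3 & 5 & 1 & 2 & 4 & 6 \end{pmatrix}$ can be written
as $A = (1\ 2\ 4\ 7\ 3)(5\ 6)$ and $B = (1\ 7\ 6\ 4)(2\ 3\ 5)$. Their products are

$AB = C = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 3 & 1 & 7 & 6 & 4 & 2 & 5 \end{pmatrix} = (1\ 3\ 7\ 5\ 4\ 6\ 2)$ and $BA = D = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ 3 & 1 & 6 & 2 & 4 & 7 & 5 \end{pmatrix} = (1\ 3\ 6\ 7\ 5\ 4\ 2)$.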
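The product and the cycle notation can be checked mechanically. The following is a minimal sketch, assuming a permutation is stored as the tuple of its bottom row and that the product applies the left factor first, which is the convention the cycle forms of AB and BA above imply. The helper names `compose` and `cycles` are illustrative.

```python
def compose(p, q):
    """Product p*q: apply p first, then q (the convention of the text)."""
    return tuple(q[p[k] - 1] for k in range(len(p)))

def cycles(p):
    """Cycle decomposition of p, omitting cycles of length one."""
    seen, result = set(), []
    for start in range(1, len(p) + 1):
        cycle, k = [], start
        while k not in seen:
            seen.add(k)
            cycle.append(k)
            k = p[k - 1]          # follow the element to its image
        if len(cycle) > 1:
            result.append(tuple(cycle))
    return result

A = (2, 4, 1, 7, 6, 5, 3)   # bottom row of A
B = (7, 3, 5, 1, 2, 4, 6)   # bottom row of B

print(cycles(A), cycles(B))
print(cycles(compose(A, B)))   # cycle form of AB
print(cycles(compose(B, A)))   # cycle form of BA
```

The two products come out different, so this pair already shows that the multiplication of permutations is not commutative.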


The preceding examples are not all of the same kind. Some of the sets are infinite, and some
finite, and closer examination of the operations reveals further differences. An operation on a set
S is called associative if for any three elements a, b, c of S one has $(ab)c = a(bc)$ (if the operation
is written as multiplication) and $(a + b) + c = a + (b + c)$ (if it is written as addition). The
operation is called commutative if for any two elements one has $ab = ba$ or $a + b = b + a$,
respectively. It is easy to check that the multiplication of matrices and permutations are examples
of associative operations. The multiplication and addition of numbers are associative and
commutative. However, multiplication of permutations and matrices is not commutative as is shown
by the example above where $AB \ne BA$.

An element e of a set S is called a neutral element of an operation if for all elements a of S
application of the operation to e and a in either order gives a. If the operation is written as
multiplication, e is called a unit element and $ea = ae = a$. For example, the permutation
$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}$ is a unit element in the set of permutations of four objects, and the matrix
$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ is a unit element in the set of $(2 \times 2)$-matrices. The same is true for the number 1 in the
sets of integers, rational numbers, real numbers, and complex numbers. If S is a set with a
multiplicatively written operation and has a unit element e, then an element $a' \in S$ is called an
inverse of an element $a \in S$ if $a'a = aa' = e$.

The element $a'$ is denoted by $a^{-1}$; for instance, the inverse permutation to
$p = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix}$ is $p^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 3 & 2 \end{pmatrix}$; in cycle notation, $p = (1\ 2\ 4)$ and $p^{-1} = (1\ 4\ 2)$.
By a process of abstraction from numerous examples one obtains the concept of a group.

Definition: A non-empty set G is called a group if
(I) an operation is defined on G that assigns to every ordered pair of elements a, b of G a unique element ab of G;
(II) the operation is associative: $(ab)c = a(bc)$ for all a, b, c in G;
(III) G contains a unit element e with $ea = ae = a$ for all a in G;
(IV) every element a of G has an inverse $a'$ in G with $a'a = aa' = e$.


It is very easy to see, by using (II), that the unit element e in (III) and the inverse element $a^{-1}$ in (IV)
are uniquely determined.

If the operation is also commutative, the group is called commutative or Abelian, in honour of 
N. H. ABEL (1802-1829). 

The use of the multiplicative notation for the group operation is purely a matter of convention 
and does not say anything about the nature of the operation. One could just as well use the additive 
notation. In that case the unit element is called the zero element and the inverse is called the opposite 
element. As a rule the additive notation is reserved for Abelian groups. 

Following the concepts of set theory one distinguishes between finite and infinite groups. The
number of elements in a group is called its order. For finite groups one can write out the
multiplication table. This is a square array in which the element in the ith row and the jth column is the
product of the ith group element and the jth group element. Examples are the tables given in the section
on subgroups.
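For a finite set the group requirements can be tested by brute force. The sketch below assumes the usual four requirements (the operation stays inside the set, associativity, a unit element, inverses); the function name `is_group` and the residue-class examples are illustrative and not taken from the text.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force test of the group requirements on a finite set."""
    elements = list(elements)
    # The operation must not lead out of the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False
    # Associativity: (ab)c = a(bc) for all triples.
    if any(op(op(a, b), c) != op(a, op(b, c))
           for a, b, c in product(elements, repeat=3)):
        return False
    # Unit element: ea = ae = a for all a.
    units = [e for e in elements
             if all(op(e, a) == a == op(a, e) for a in elements)]
    if not units:
        return False
    e = units[0]
    # Inverses: for every a some b with ab = ba = e.
    return all(any(op(a, b) == e == op(b, a) for b in elements)
               for a in elements)

print(is_group(range(5), lambda a, b: (a + b) % 5))     # residues under addition
print(is_group(range(5), lambda a, b: (a * b) % 5))     # 0 has no inverse
print(is_group(range(1, 5), lambda a, b: (a * b) % 5))  # non-zero residues
```

The test runs through all triples, so it is only practical for small orders; it is meant as an illustration of the requirements rather than as an efficient method.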


Examples of groups: The integers, the rational, the real, and the complex numbers form infinite 
Abelian groups, with addition as the group operation. The non-zero rational, real, and complex 
numbers form infinite Abelian groups under multiplication (for these groups the operation is 
written multiplicatively, even though they are Abelian, to prevent misunderstandings). 


The $(n \times n)$-matrices with non-zero determinants, and those with determinant 1 form infinite
non-Abelian groups under matrix multiplication. The first is called the general linear group GL(n)
and the second the special linear group SL(n).


Permutation groups. The permutations of a certain fixed number n of objects form a finite group
under the multiplication defined above, the symmetric group $S_n$. Its order is $n!$. For $n \ge 3$ it is
non-Abelian.

If $p = \begin{pmatrix} 1 & 2 & \dots & n \\ i_1 & i_2 & \dots & i_n \end{pmatrix}$ is a permutation, then the number of inversions counts how many times a
larger number occurs before a smaller one in the sequence $i_1, i_2, \dots, i_n$. If the number of inversions
is even, the permutation is called even, otherwise odd.

Example 1: The permutation $p = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 3 & 1 & 5 & 2 \end{pmatrix}$ has 6 inversions: 4 comes before 3, 2, and 1,
3 before 2 and 1, and 5 before 2.


$S_n$ splits into $n!/2$ even permutations and $n!/2$ odd ones.


The product of two even permutations is even. The product of two odd permutations is also even.
The product of an odd and an even permutation is odd.


On the basis of this fact a sign is introduced for permutations: sgn p = +1 if p is even, and
sgn p = −1 if p is odd. It follows that the sign of the product of two permutations is the product
of their signs. It is now easy to see that the inverse of an even permutation is even, because the
identity permutation is even. Thus, the even permutations form a group of order $n!/2$, called the
alternating group $A_n$.
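The counting of inversions and the multiplication rule for signs can be verified for small n. This is a minimal sketch, assuming permutations are stored as bottom rows and multiplied left factor first; the names `inversions`, `sgn` and `compose` are illustrative.

```python
from itertools import permutations

def inversions(seq):
    """Number of pairs in which a larger number precedes a smaller one."""
    return sum(1 for i in range(len(seq))
                 for j in range(i + 1, len(seq)) if seq[i] > seq[j])

def sgn(p):
    return 1 if inversions(p) % 2 == 0 else -1

def compose(p, q):
    """Product p*q: apply p first, then q."""
    return tuple(q[p[k] - 1] for k in range(len(p)))

print(inversions((4, 3, 1, 5, 2)))   # the permutation of Example 1

S4 = list(permutations(range(1, 5)))
A4 = [p for p in S4 if sgn(p) == 1]
print(len(S4), len(A4))
print(all(sgn(compose(p, q)) == sgn(p) * sgn(q) for p in S4 for q in S4))
```

For n = 4 the even permutations make up exactly half of the 24 elements, in agreement with the order n!/2 of the alternating group.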


Subgroups. A subset H of a group G is called a subgroup if H forms a group under the same 
operation as on G. By this definition the group itself and the unit subgroup, containing only the 
unit element, are subgroups of G. All groups having only one element are called trivial, and all 
subgroups of a group G other than G itself are called proper. 




Some of the groups mentioned in the introduction are subgroups of others. Thus, the additive
group of the integers is a subgroup of the additive group of the rationals, and this, in its turn, is a
subgroup of the additive group of the real numbers. The multiplicative group of the non-zero
rational numbers is a subgroup of the multiplicative group of the non-zero real numbers. The
alternating group $A_n$ is a subgroup of the symmetric group $S_n$. The special linear group SL(n) is a
subgroup of the general linear group GL(n). The following proposition follows immediately from
the definition of a subgroup: The intersection of a family of subgroups is a subgroup.

If a is an element of a group G, there are subgroups containing that element, for instance G
itself. The intersection of all these subgroups also contains a, by definition, and so is the smallest
subgroup containing a. It is called the cyclic subgroup generated by a and written $\langle a \rangle$. Clearly, $\langle a \rangle$
consists of all the powers $a^n$, $n \in \mathbf{Z}$ (negative powers are powers of the inverse; powers are, as usual,
products of an element with itself).

If all the powers $a^n$ are distinct, then $\langle a \rangle$ is called an infinite cyclic group. Otherwise there exists
a smallest positive integer n such that $a^n = e$, and $\langle a \rangle$ is a cyclic group of order n: $\langle a \rangle$ consists of
the elements $e, a, \dots, a^{n-1} = a^{-1}$; $a^{n+1} = a$, etc. A group that coincides with one of its cyclic
subgroups is called cyclic.

The set-theoretical union $U_1 \cup U_2$ of two subgroups of a group is not, in general, itself a subgroup,
as can be seen by the example below. The union is merely a subset of G; but to every subset S of G
one can define the intersection of all subgroups containing S (which itself contains S and is the smallest
subgroup containing S) as the subgroup $\langle S \rangle$ generated by S. The subgroup $\langle U_1, U_2 \rangle$ is then the
smallest subgroup containing both $U_1$ and $U_2$. If $\langle S \rangle = G$ for a subset S of G, one says that G
is generated by S.


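Example 2: The elements of $S_3$ in cycle notation are $p_1 = (1)$, $p_2 = (1\ 2\ 3)$, $p_3 = (1\ 3\ 2)$,
$p_4 = (1\ 2)$, $p_5 = (1\ 3)$, and $p_6 = (2\ 3)$. The group table is:

         p1   p2   p3   p4   p5   p6
    p1   p1   p2   p3   p4   p5   p6
    p2   p2   p3   p1   p6   p4   p5
    p3   p3   p1   p2   p5   p6   p4
    p4   p4   p5   p6   p1   p2   p3
    p5   p5   p6   p4   p3   p1   p2
    p6   p6   p4   p5   p2   p3   p1

Using the group table it is easy to check that the sets $A = \{p_1, p_4\}$, $B = \{p_1, p_5\}$,
$C = \{p_1, p_6\}$, and $D = \{p_1, p_2, p_3\}$ are subgroups of $S_3$. Further $A = \langle p_4 \rangle$,
$B = \langle p_5 \rangle$, $C = \langle p_6 \rangle$ are cyclic of order 2 and $D = \langle p_2 \rangle = \langle p_3 \rangle$ is cyclic of order 3.
The union of A and D is not a subgroup, for then the product $p_5 = p_3 p_4$ would lie in
$\{p_1, p_4\} \cup \{p_1, p_2, p_3\}$, which is obviously not the case. The group generated by the
union is the whole of $S_3$.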
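The statements of Example 2 can be checked by computing generated subgroups directly. The sketch below assumes the elements are stored as bottom rows and multiplied left factor first; `compose` and `generated` are hypothetical helper names. Closing a subset of a finite group under products is enough here, because in a finite group the inverse of an element is one of its positive powers.

```python
def compose(p, q):
    """Product p*q in the text's convention: apply p first, then q."""
    return tuple(q[p[k] - 1] for k in range(len(p)))

def generated(gens, identity):
    """Smallest set containing gens that is closed under products."""
    group = {identity, *gens}
    frontier = list(group)
    while frontier:
        x = frontier.pop()
        for y in list(group):
            for z in (compose(x, y), compose(y, x)):
                if z not in group:
                    group.add(z)
                    frontier.append(z)
    return group

# Bottom rows of p1, ..., p6 from Example 2.
p1, p2, p3 = (1, 2, 3), (2, 3, 1), (3, 1, 2)
p4, p5, p6 = (2, 1, 3), (3, 2, 1), (1, 3, 2)

A, D = {p1, p4}, {p1, p2, p3}
print(generated({p2}, p1) == D)       # D is the cyclic subgroup of p2
print(compose(p3, p4) in (A | D))     # the union is not closed
print(len(generated(A | D, p1)))      # order of the subgroup generated by the union
```

The second line prints `False` because the product of $p_3$ and $p_4$ is $p_5$, which lies outside the union.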


Homomorphisms 


Homomorphism. The concept of a homomorphism occupies a central position in the whole of group 
theory. It is characterized by two statements. One relates to the sets of elements of the groups and 
the other to the group operations. 

Definition: A mapping f of a group G into a group G' is called a homomorphism if for all elements a, b of G

(H) $f(ab) = f(a)\,f(b)$.


Here the product on the left is taken in G and on the right in G'. The image of G under f is always
a subgroup of G'. If there is a surjective homomorphism of G onto G', that is, if every element of G'
occurs as an image under f, then G' is called a homomorphic image of G. It is perfectly possible
that under a homomorphism distinct elements of G are mapped to the same element of G'. It is
not a requirement that homomorphisms should be injective.


Example 1: Let G be the group of all real $(2 \times 2)$-matrices with non-zero determinant, and
G' the multiplicative group of non-zero real numbers. The mapping taking each matrix to its
determinant satisfies the requirement (H) and is a homomorphism. What is more, it is surjective,
because every non-zero real number r is the determinant of the matrix $\begin{pmatrix} 1 & 0 \\ 0 & r \end{pmatrix}$.


Example 2: The mapping taking every permutation $p \in S_n$ to its sign sgn p is a homomorphism
of $S_n$ into the multiplicative group of real numbers, by the multiplication rules for signs:
$\operatorname{sgn}(p_1 \cdot p_2) = \operatorname{sgn} p_1 \cdot \operatorname{sgn} p_2$. Its image is the subgroup consisting of the numbers +1 and −1.


The condition (H) means that in a certain sense a homomorphism must preserve the structure
of the original group. The image is in general ‘smaller’ than the original group. For instance, all
matrices of the form $\begin{pmatrix} \lambda a & \lambda b \\ \mu c & \mu d \end{pmatrix}$, where $\lambda$ and $\mu$ are numbers with $\lambda\mu = 1$, have the same determinant
as the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. The set of those elements that are mapped to the unit element of the image
is a measure of the shrinkage of G. These elements form a special kind of subgroup of the original
group and are called the kernel of the homomorphism. The kernel of the homomorphism in Example 1
is the special linear group SL(2) of $(2 \times 2)$-matrices, and the kernel of the homomorphism in
Example 2 is the alternating group $A_n$ of all the even permutations in $S_n$.
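The determinant homomorphism of Example 1 and the remark on matrices with equal determinant can be illustrated with exact arithmetic. This is a minimal sketch, assuming $(2 \times 2)$-matrices are stored as nested tuples; `matmul` and `det` are hypothetical helper names and the sample matrices are arbitrary.

```python
from fractions import Fraction

def matmul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def det(A):
    (a, b), (c, d) = A
    return a * d - b * c

A = ((1, 2), (3, 5))
B = ((2, 1), (7, 4))

# Requirement (H) for the determinant: det(AB) = det(A) det(B).
print(det(matmul(A, B)) == det(A) * det(B))

# Scaling the rows by lam and mu with lam * mu = 1 keeps the determinant.
lam, mu = Fraction(3), Fraction(1, 3)
(a, b), (c, d) = A
scaled = ((lam * a, lam * b), (mu * c, mu * d))
print(det(scaled) == det(A))

# B has determinant 1, so it lies in the kernel SL(2).
print(det(B) == 1)
```

Two sample matrices do not prove (H), of course; the identity det(AB) = det(A) det(B) holds in general and the code only exhibits one instance of it.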


Isomorphisms. 


If f is an isomorphism of G onto G', then its image is G', and its kernel is the unit subgroup of G.
It can be shown that these two conditions are sufficient for f to be an isomorphism. The inverse
map from G' to G also satisfies (H) and is thus also an isomorphism. If there is an isomorphism
from G to G', then the groups are called isomorphic, in symbols: $G \cong G'$.


Example 3: Let $V_4$ be the Klein four group consisting of the permutations $e = (1)$, $a = (1\ 2)(3\ 4)$,
$b = (1\ 3)(2\ 4)$ and $c = (1\ 4)(2\ 3)$. Let G be the group consisting of the matrices e', a', b', and c'.
Define the bijective mapping f by $e \mapsto e'$, $a \mapsto a'$, $b \mapsto b'$, $c \mapsto c'$.

From the group tables of the two groups $V_4$ and G one can read off that the relation (H)
is satisfied for arbitrary elements of $V_4$. Thus, $V_4$ is isomorphic to the group G of matrices.


As can be seen from the example, isomorphic groups have identical structures, even though their 
elements may be of completely different kinds, here permutations and matrices. Homomorphisms 
and isomorphisms are not restricted to finite groups. Isomorphic groups always have the same 
cardinal number. 


Example 4: Let $R^{\times}$ be the multiplicative group of positive real numbers, and $R^{+}$ the additive
group of all real numbers. The mapping $f: a \mapsto \ln a$ taking every positive real number to its
natural logarithm is an isomorphism of the two groups; as is well known, $\ln(a \cdot b) = \ln a + \ln b$,
in other words, $f(a \cdot b) = f(a) + f(b)$, so that $R^{\times} \cong R^{+}$.
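The logarithm isomorphism can be tried out numerically. The sketch below is only an illustration in floating-point arithmetic, so equalities are tested up to rounding; the sample values of `a` and `b` are arbitrary.

```python
import math

def f(a: float) -> float:
    """The mapping a -> ln a from positive reals to all reals."""
    return math.log(a)

a, b = 2.5, 40.0

# Requirement (H): the product in the first group goes to the sum in the second.
print(math.isclose(f(a * b), f(a) + f(b)))

# The unit element 1 is mapped to the zero element 0.
print(f(1.0) == 0.0)

# The exponential function is the inverse mapping, so f is bijective.
print(math.isclose(math.exp(f(a)), a))
```

The inverse mapping, the exponential function, turns sums back into products, which is the same isomorphism read in the other direction.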


Isomorphism is an equivalence relation between groups, so that the class of all groups is partitioned 
into isomorphism classes. Emphasizing that one is only interested in the isomorphism class of a 
group and not in its particular representation, one speaks of an abstract group. 


Isomorphic groups have the same structure, and calculations in them follow the same laws and
rules, even though they may have different kinds of elements, and the operations may be defined in
different ways.


Normal subgroups. It has already been mentioned that kernels of homomorphisms form a special
type of subgroup. A subgroup N that occurs as the kernel of a homomorphism of a group G into
some other group is called normal. Thus, in Example 1 (of the section on homomorphisms) the special
linear group is a normal subgroup of the general linear group. In Example 2 the alternating group
is a normal subgroup of the symmetric group. A group whose only normal subgroups are the whole
group and the trivial one (these are always normal) is called simple. A homomorphism from a simple
group G onto a group H is either an isomorphism, or H is trivial. The following result on permutation
groups is mentioned here because of its applications in Galois theory.


For $n > 4$ the alternating group $A_n$ is the only non-trivial proper normal subgroup of $S_n$. For $n > 4$ the
alternating group $A_n$ is simple.


A proper normal subgroup M of G is called maximal if for any normal subgroup N of G with
$M \subseteq N \subseteq G$, either M = N or N = G holds. The Klein four group $V_4$ is a maximal normal
subgroup of $A_4$.


Factor groups. If S is a subset of a group G and a is an element of G, the set aS is defined to be
$\{as \mid s \in S\}$. A similar definition is made for multiplication on the right. If H is a subgroup of G,
the sets aH for $a \in G$ are called the left cosets of H in G. It is easy to show that they form a partition
of G. Naturally the same definitions can be made for right cosets, and they also form a partition of
G. However, these two partitions are not, in general, the same. Now if f is a homomorphism of G
and its kernel is N, then N has the remarkable property that for any a in G the cosets aN and Na
are identical, because they both consist exactly of those elements of G that are mapped by the
homomorphism f to the same element as a. This property is so important that for the moment
it will be given a special name.

A subgroup N of a group G is called invariant if aN = Na for all elements a of G.


Since the left and right cosets of invariant subgroups are equal, the words left and right can be
omitted. As they form a partition, two cosets are either equal or they have no elements in common.
Generalizing now the product notation still further and defining for subsets S and T of G the product
$ST = \{st \mid s \in S \text{ and } t \in T\}$, then for subgroups H of G the relation HH = H holds. For invariant
subgroups N one can go even further. If aN and bN are two cosets then $(aN)(bN) = (aN)(Nb)$
$= a((NN)b) = a(Nb) = a(bN) = (ab)N$. Thus the product of two cosets of an invariant subgroup
is a coset and contains the products of elements of the two cosets. One can now verify that the
cosets of the invariant group N actually form a group under this multiplication, whose identity
element is eN = N, and in which the inverse of aN is $a^{-1}N$. What is more, from the equations above
the mapping $\pi: a \mapsto aN$, that takes every element of G to its coset, is a homomorphism, which is
called the canonical homomorphism. The kernel of $\pi$ is, by its very definition, the invariant subgroup
N. The concepts normal and invariant subgroup are identical. The group of the cosets of N is called
the factor group or quotient group of G by N, and is denoted by G/N.
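Cosets and the product of cosets can be computed explicitly in a small case. The following sketch uses $S_3$ as the example, with permutations stored as bottom rows and multiplied left factor first; the helper names `compose`, `left_cosets`, `right_cosets` and `set_product` are hypothetical.

```python
from itertools import permutations

def compose(p, q):
    """Product p*q: apply p first, then q."""
    return tuple(q[p[k] - 1] for k in range(len(p)))

def left_cosets(H, G):
    return {frozenset(compose(a, h) for h in H) for a in G}

def right_cosets(H, G):
    return {frozenset(compose(h, a) for h in H) for a in G}

def set_product(S, T):
    """The product ST = {st | s in S and t in T}."""
    return frozenset(compose(s, t) for s in S for t in T)

S3 = list(permutations((1, 2, 3)))
A3 = {(1, 2, 3), (2, 3, 1), (3, 1, 2)}   # the even permutations
H = {(1, 2, 3), (2, 1, 3)}               # generated by the transposition (1 2)

# For A3 the left and right cosets coincide; for H they do not.
print(left_cosets(A3, S3) == right_cosets(A3, S3))
print(left_cosets(H, S3) == right_cosets(H, S3))

# The product of two cosets of A3 is again a coset of A3.
cosets = left_cosets(A3, S3)
print(all(set_product(X, Y) in cosets for X in cosets for Y in cosets))
```

The two cosets of $A_3$ therefore multiply like a group of order 2, while for the subgroup H the left and right partitions of $S_3$ differ.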


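Example 5: The factor group of $S_n$ by $A_n$ consists of the elements $A_n$ and $p_0 A_n$, where $p_0$ is an
odd permutation. The mapping $\pi: p \mapsto \begin{cases} A_n & \text{if } p \text{ is even,} \\ p_0 A_n & \text{if } p \text{ is odd,} \end{cases}$ is easily seen to be the canonical
homomorphism $\pi$ from $S_n$ to $S_n/A_n$.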

The homomorphism theorem. If f is a homomorphism from G onto a group G' (that is, f is
surjective) with kernel N, then there is a natural mapping from the cosets of N to G', because all the
elements of a coset have the same image under f. The map $\varphi$ taking the coset aN to the common image
f(a) of its elements is an isomorphism of G/N and the homomorphic image G'. This is the content of the
homomorphism theorem (Fig.).

Homomorphism theorem. Every surjective homomorphism f of a group G onto a
group G' can be split into the product of the canonical homomorphism $\pi$ from
G onto G/Ker f and an isomorphism $\varphi$ of G/Ker f onto G'.


16.1-1 The homomorphism theorem


The homomorphism theorem implies that the investigation of arbitrary homomorphisms can 
be reduced to the investigation of factor groups, isomorphisms, and subgroups. 


Example 6: By the homomorphism theorem the factor group $S_n/A_n$ is isomorphic to
the group $\{+1, -1\}$.


The relation between maximal normal subgroups and simple groups is given by the following 
theorem. 


A normal subgroup N of a group G is maximal if and only if the factor group G/N is simple. 


Automorphisms. An automorphism is an isomorphism of a group G onto itself. The product of
two automorphisms of a group G is again an automorphism of G. The identity mapping of G onto
itself, which leaves every element fixed, is an automorphism, and if f is an automorphism of G
onto itself, so is the inverse mapping $f^{-1}$. The automorphisms of G form a group under the product
of mappings as operation. It is called the automorphism group of G.


Finite groups 


The theory of finite groups was originally developed as a tool for dealing with the problem of 
solving algebraic equations, and dealt at first only with permutation groups, that is, subgroups of 
symmetric groups. The importance of these groups is apparent from the following two theorems. 




Cayley’s theorem. Every group of order n is isomorphic to a subgroup of the symmetric group $S_n$.

Lagrange’s theorem. For any subgroup H of a finite group G, the index [G: H] of H in G is defined
as the number of left cosets of H in G. If E is the unit subgroup, then [G: E] is the order of G. Now
for any subgroup H of G the order of H divides the order of G. More precisely:


[G: E] = [G: H] [H: E].


The order of an element a of a group G is the order of the cyclic subgroup generated by a. Obviously,
if G is finite, then every element has finite order, indeed by Lagrange’s theorem its order divides
the order of the group. There are, however, infinite groups, whose elements all have finite order,
for example, the multiplicative group of all complex roots of unity, that is, the group of all solutions
of the equations $x^n - 1 = 0$, where $n = 1, 2, \dots$ W. BURNSIDE raised the problem whether a
group in which all elements have finite order and which is generated by finitely many elements is
necessarily finite. The question was only settled (in the negative) in 1967 by NOVIKOV and
ADYAN.
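The divisibility statement for element orders can be checked in a small symmetric group. This is a minimal sketch for $S_4$, assuming permutations are stored as bottom rows and multiplied left factor first; the names `compose` and `order` are illustrative.

```python
from itertools import permutations

def compose(p, q):
    """Product p*q: apply p first, then q."""
    return tuple(q[p[k] - 1] for k in range(len(p)))

def order(p):
    """Order of p: the smallest n > 0 with p^n equal to the identity."""
    identity = tuple(range(1, len(p) + 1))
    n, power = 1, p
    while power != identity:
        power = compose(power, p)
        n += 1
    return n

S4 = list(permutations(range(1, 5)))
orders = sorted({order(p) for p in S4})
print(len(S4), orders)
print(all(len(S4) % order(p) == 0 for p in S4))
```

Every order that occurs divides 24, as Lagrange's theorem requires. The converse fails: 6 divides 24, yet no element of $S_4$ has order 6.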

In the theory of solutions of algebraic equations the concept of a composition series of a group
plays an important role. A composition series of a finite group G is a sequence of subgroups
$G = G_0 \supset G_1 \supset G_2 \supset \dots \supset G_l = E$, each containing the next, such that each group is a maximal
normal subgroup of its immediate predecessor. The simple groups $G_0/G_1, G_1/G_2, \dots, G_{l-2}/G_{l-1}$,
$G_{l-1}/E \cong G_{l-1}$ are called the composition factors. Groups with composition factors of prime order
are called soluble, because of their relation to the solubility of algebraic equations. The
composition factors of a finite group are uniquely determined up to isomorphism and the order in which
they occur (Theorem of Jordan-Hölder).


Applications. Apart from the applications of group theory to geometry, one can say roughly 
that groups play a role anywhere where mappings, transformations, and symmetries (in some sense or 
another) occur, and concepts are investigated that are invariant under the mappings, transformations, 
or symmetries. In particular, the theory of finite groups is applied in the theory of solutions of 
algebraic equations (see Galois theory). In physics group theory is important in relativity theory, 
where the group of Lorentz transformations is used, and in quantum mechanics. The division of
physics into relativistic and non-relativistic physics is a division by group-theoretical criteria. In
crystallography group theory makes it possible to determine all possible crystal forms by investi- 
gating their symmetry groups. It is perhaps also of interest that all two- and three-dimensional 
ornaments can be classified by means of group theory. 


Topological groups 


Many of the groups that are important in applications, particularly in geometry and physics, 
are infinite and carry, apart from their algebraic structure, also a structure as a topological space. 

A topological group is a set of elements that is, on the one hand, a group and, on the other, a topo- 
logical space, such that these two structures are compatible, in the sense that multiplication and 
inversion are continuous mappings. Examples are various matrix groups, the transformation groups 
of the several branches of geometry, and the Lorentz group. 

For instance, it makes sense to ask whether two $(2 \times 2)$-matrices with real entries are only slightly
different, that is, whether their entries are close together. This leads to a topology for $(2 \times 2)$-matrices
derived from the topology of the real line. It can be shown that in this topology multiplication of
matrices is continuous. Thus, the general linear group GL(2) of $(2 \times 2)$-matrices with non-zero
determinant is a topological group. While the fact that these topological groups are infinite makes
it harder to analyze them, their topological structure makes it possible to use new, not necessarily
algebraic, methods, which have led to excellent results in the theory of Abelian topological groups
and in the theory of compact topological groups. The interplay between topological and algebraic
ideas gives this branch of group theory its particular attraction.


Lie groups. The rotations of the plane about a fixed point form a group.
Each rotation is determined by its angle $\varphi$ of rotation and can be described
by a matrix of the form $\begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}$. It can be shown that these matrices form a group. Since $\varphi$ varies
continuously between 0 and $2\pi$, a topology can be defined on the group. But this group differs from
other topological groups in that its elements are dependent on a parameter, and that this
dependence is described by differentiable functions. This makes it possible to define not only continuous
but also differentiable functions on the group, by calling a function differentiable if it is
differentiable as a function of the real parameter $\varphi$. A topological space on which differentiable functions
can be defined is called a differentiable manifold, and the group of rotations about a fixed point
is not just a topological space, but actually a differentiable manifold. Such groups are called Lie
groups, after Sophus LIE; they form a special part of the class of topological groups and are in
many respects easier to deal with than arbitrary topological groups, because the tools of analysis
can also be utilized in their investigation.
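That the rotation matrices form a group can be seen from the rule that rotating by one angle and then by another amounts to rotating by the sum of the angles. The sketch below checks this numerically, assuming the standard form of the rotation matrix with entries cos, −sin, sin, cos; the helper names `rotation`, `matmul` and `close` are hypothetical and the angles are arbitrary.

```python
import math

def rotation(phi):
    """Matrix of the rotation of the plane through the angle phi."""
    c, s = math.cos(phi), math.sin(phi)
    return ((c, -s), (s, c))

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def close(A, B, tol=1e-12):
    """Entrywise comparison up to rounding error."""
    return all(abs(A[i][j] - B[i][j]) <= tol
               for i in range(2) for j in range(2))

a, b = 0.7, 1.1
print(close(matmul(rotation(a), rotation(b)), rotation(a + b)))   # product
print(close(matmul(rotation(a), rotation(-a)), rotation(0.0)))    # inverse
print(close(matmul(rotation(a), rotation(b)),
            matmul(rotation(b), rotation(a))))                    # Abelian
```

The entries depend on the single real parameter through sine and cosine, which is the differentiable dependence referred to in the text.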




Applications. Lie groups and their representations (see Chapter 33.) are particularly important
in the theory of special functions (spherical functions, Bessel functions etc.) and in the theory of
almost periodic functions. LIE used his theory to classify and solve differential equations. In quantum
theory the LIE group of rotations of a sphere and the Lorentz group, which is also a LIE group,
play an important part.


Semigroups 


A semigroup is a non-empty set with an associative operation. If the operation is also commutative,
the semigroup is called commutative. If H is a multiplicative semigroup and contains an element e
such that ea = ae = a for all elements a of H, then H is called a semigroup with unit element. Examples
of commutative semigroups with unit element that are not groups are the integers under
multiplication and the non-negative integers under addition (in which 0 is the unit element). Naturally every
group is a semigroup. Theorems that are true for all semigroups are also true for all groups. By
analogy to groups one distinguishes finite and infinite semigroups and calls the number of elements
of a semigroup its order. If the equations ax = b and ya = b each have at most one solution for
arbitrary pairs of elements a and b of a semigroup H, then H is called regular. The non-zero integers
form a regular semigroup under multiplication. For finite semigroups the following theorem holds:

A regular semigroup of finite order is a group.
Example 7: The power set of an arbitrary set (that is, the set of all its subsets) is a commutative
semigroup under the operation of intersection. The whole set is the unit element.
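Regularity of a finite semigroup can be tested directly from its definition. The sketch below assumes the operation is given as a function on a finite list of elements; the name `is_regular` and the residue-class examples are illustrative and not from the text. In a finite set, "at most one solution of ax = b for every b" means that the products ax are all distinct as x runs through the set.

```python
def is_regular(elements, op):
    """ax = b and ya = b have at most one solution for all a, b."""
    elements = list(elements)
    for a in elements:
        if len({op(a, x) for x in elements}) < len(elements):
            return False          # two different x give the same ax
        if len({op(y, a) for y in elements}) < len(elements):
            return False          # two different y give the same ya
    return True

def mul_mod(n):
    return lambda a, b: (a * b) % n

nonzero_mod_7 = range(1, 7)
print(is_regular(nonzero_mod_7, mul_mod(7)))   # a finite regular semigroup
print(all(any(mul_mod(7)(a, b) == 1 for b in nonzero_mod_7)
          for a in nonzero_mod_7))             # every element has an inverse
print(is_regular(range(6), mul_mod(6)))        # 0x = 0 for every x
```

The non-zero residues modulo 7 are regular and, as the theorem on finite regular semigroups predicts, every element has an inverse; the residues modulo 6 under multiplication are not regular.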


16.2. Fields and algebraic equations 


Up to the beginning of the nineteenth century algebra could be described as the theory of solutions 
of algebraic equations. Its purpose was to find methods, as general as possible, of computing such 
solutions. The actual solutions were of less interest than the methods used in finding them. The 
ideas of nineteenth century mathematicians in connection with these problems led to the definition 
of groups and fields as essential tools in the theory of solutions of algebraic equations. Later these 
concepts attained an interest and importance of their own, principally because applications were 
found in quite different areas. Group theory and field theory are now extensive branches of algebra. 
Further objects with similar structures were also found in many parts of mathematics and this led 
to the definitions of such algebraic structures as rings, algebras, lattices, and integral domains 
(see Chapter 33.). 


Fields and integral domains 
Fields. A field is a set K of elements satisfying the following field axioms: 


The rational numbers, the real numbers, and the complex numbers are the most important
examples of fields. Intuitively one can say that a field is a set of elements in which one can do
arithmetic in the usual way. Just as for groups, one distinguishes finite and infinite fields. A subset P
of a field $\Omega$ is called a subfield if it satisfies the field axioms for the operations defined in $\Omega$, in
particular, sum and product of elements in P must lie in P, and so must negatives and inverses of elements
of P. The field $\Omega$ is also called an extension field of P. If K is an extension field of P and also a
subfield of $\Omega$, then K is called an intermediate field between P and $\Omega$: $P \subseteq K \subseteq \Omega$.

Every field can be regarded as a vector space (see Chapter 17.) over any subfield P as set of
scalars, by taking field addition as the vector space addition and field multiplication with elements
of the subfield as multiplication by scalars. The dimension of $\Omega$ over P regarded as a vector space
is called the degree of the extension field $\Omega$ over P. If $\Omega$ is a finite-dimensional vector space over P,
the extension is called finite. Its degree is denoted by $n = [\Omega : P]$. In this case one can find elements
$\beta_1, \beta_2, \dots, \beta_n$ in $\Omega$ such that every element $\beta \in \Omega$ can be written uniquely in the form
$\beta = c_1\beta_1 + c_2\beta_2 + \dots + c_n\beta_n$ with elements $c_1, c_2, \dots, c_n$ of P. Such elements $\beta_1, \beta_2, \dots, \beta_n$ are
said to form a basis of $\Omega$ over P.




If $\Omega$ is an extension field of P and $\alpha_1, \alpha_2, \dots, \alpha_m$ are arbitrary elements of $\Omega$, then $P(\alpha_1, \alpha_2, \dots, \alpha_m)$
denotes the smallest subfield of $\Omega$ containing P and $\alpha_1, \alpha_2, \dots, \alpha_m$. It consists of all the elements
of $\Omega$ that can be obtained from P and $\alpha_1, \alpha_2, \dots, \alpha_m$ by means of the elementary arithmetical
operations. One says that the field $P(\alpha_1, \alpha_2, \dots, \alpha_m)$ is obtained from P by adjoining $\alpha_1, \alpha_2, \dots, \alpha_m$ to P.
It can be obtained by first adjoining $\alpha_1$ to P to obtain a field $K_1 = P(\alpha_1)$, then adjoining $\alpha_2$ to $K_1$
to obtain $K_2 = K_1(\alpha_2) = P(\alpha_1, \alpha_2)$, and so on. After m steps one has $K_m = K_{m-1}(\alpha_m) = P(\alpha_1, \alpha_2, \dots, \alpha_m)$.
An extension field $P(\alpha)$ that is obtained by adjoining a single element is called a simple extension
of P.

If $K_1$ and $K_2$ are fields, then a bijective mapping f from $K_1$ onto $K_2$ such that $f(a + b)$
$= f(a) + f(b)$ and $f(ab) = f(a) f(b)$, for arbitrary elements a and b of $K_1$, is called an isomorphism
of $K_1$ onto $K_2$; $K_1$ and $K_2$ are called isomorphic, and one writes $K_1 \cong K_2$. An isomorphism of a
field K onto itself is called an automorphism of K. If $K_1$ and $K_2$ are fields and if P is a subfield of
$K_1$ and $K_2$, then an isomorphism of $K_1$ onto $K_2$ leaving P elementwise fixed is called a relative
isomorphism. For $K_1 = K_2$ one speaks of a relative automorphism. Numerous examples of these
concepts can be found in the following sections.


Integral domains. The field $\mathbb{Q}$ of rational numbers has a subset satisfying the field axioms 1, 2,
and 4, but not axiom 3: the integers. In the set of integers division is not universally possible, but
at least they satisfy the weaker cancellation law: If $ab = ac$ and $a \neq 0$, then $b = c$.
Sets with these properties are called integral domains.


All fields are integral domains. The integers are an example of an integral domain that is not a
field. A further important example of an integral domain is the set of polynomials with coefficients
in a field (see Polynomials). If the set I is finite, then I is called a finite integral domain. From the
theorem on finite regular semigroups mentioned above one obtains at once:

Every finite integral domain is a field.

Two integral domains are called isomorphic ($I_1 \cong I_2$) if there is a bijective mapping f from $I_1$
to $I_2$ that is compatible with the operations in the same sense as a field isomorphism. The mapping
is called an isomorphism (see Field of fractions).


Examples: Apart from the fields and integral domains already mentioned, the Gaussian numbers
form a further important example. They consist of all complex numbers of the form a + bi,
where a and b are rational numbers and $i^2 = -1$. The subset of Gaussian numbers for which a
and b are integers is called the set of Gaussian integers and is an integral domain. The Gaussian
numbers are obtained from the rationals by adjoining i.
The smallest field consists of only two elements, 0 and e, with the following addition and
multiplication tables:

    + | 0  e        · | 0  e
    --+------       --+------
    0 | 0  e        0 | 0  0
    e | e  0        e | 0  e

The fields of residues modulo a prime number are also important examples (see Chapter 31.).


Polynomials. An expression of the form $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$, in which
n is a natural number and $a_0, \ldots, a_n$ are elements of a field K, is called a polynomial in the
indeterminate x over K. The quantities $a_0, \ldots, a_n$ are called its coefficients. A polynomial whose coefficients
are all zero is called the zero polynomial. If the coefficient $a_n$ is not zero, then the degree of the
polynomial is n. If the indeterminate x is replaced by a field element, one obtains a function defined
on K with values in K. These functions are called polynomial functions or integral rational functions
on the field K; conversely, if the field is infinite, each integral rational function determines a unique
polynomial. In dealing with polynomials it is frequently convenient to use the simplified notation
$f(x) = \sum_{i=0}^{\infty} a_i x^i$, where it is understood that only finitely many $a_i$ are different from zero. One can
then write the sum and product of polynomials $f(x) = \sum_i a_i x^i$ and $g(x) = \sum_j b_j x^j$ as
$f(x) + g(x) = \sum_{k=0}^{\infty} c_k x^k$ and $f(x) \cdot g(x) = \sum_{k=0}^{\infty} d_k x^k$, where the coefficients $c_k$ and $d_k$ satisfy the
equations $c_k = a_k + b_k$ and $d_k = \sum_{i+j=k} a_i b_j$. Polynomials with coefficients in a certain fixed field K
form an integral domain, which is denoted by K[x]. In this integral domain K[x] one can perform
division with remainder just as for the integers. This means that if f(x) and g(x) with $g(x) \neq 0$ are two
given polynomials, there exist unique polynomials h(x) and r(x) such that $f(x) = h(x) \cdot g(x) + r(x)$,
and r(x) = 0 or the degree of r(x) is less than that of g(x). By analogy to number theory (see Chapter 1.) one
calls two polynomials $f_1(x)$ and $f_2(x)$ congruent modulo g(x) and writes $f_1(x) \equiv f_2(x) \bmod g(x)$ if
they leave the same remainder r(x) on division by g(x). Congruence modulo g(x) is an equivalence
relation and leads to a partition of the integral domain K[x] into classes. These classes are
added and multiplied by choosing a polynomial from each class, adding or multiplying the
polynomials, and defining as sum or product of the classes the class of the sum or product of the
polynomials. It can be shown that the same class is always obtained even if the polynomials chosen from
the original two classes are changed. If g(x) is an irreducible polynomial, these classes, which are
called the residue classes modulo g(x), form a field, which is denoted by K[x]/(g(x)) and is called
the residue class field of K[x] modulo g(x).
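Division with remainder in K[x] can be tried out on a computer. The following Python lines are a minimal sketch for the special case K = $\mathbb{Q}$; the representation of a polynomial as a list of coefficients (lowest degree first) and the names `trim` and `poly_divmod` are assumptions made for this illustration only.

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients of highest degree; [] is the zero polynomial."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (h, r) with f = h*g + r and r = 0 or degree(r) < degree(g).

    Polynomials are coefficient lists, lowest degree first,
    e.g. [-2, 0, 0, 1] stands for x^3 - 2.
    """
    r = trim(Fraction(c) for c in f)
    g = trim(Fraction(c) for c in g)
    if not g:
        raise ZeroDivisionError("division by the zero polynomial")
    h = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        factor = r[-1] / g[-1]          # cancels the leading term of r
        h[shift] = factor
        for i, c in enumerate(g):
            r[shift + i] -= factor * c
        r = trim(r)
    return h, r

# x^3 - 2 = (x^2 + x + 1)(x - 1) + (-1)
quotient, remainder = poly_divmod([-2, 0, 0, 1], [-1, 1])

# x^3 and 1 leave the same remainder on division by x - 1,
# so they are congruent modulo x - 1.
congruent = poly_divmod([0, 0, 0, 1], [-1, 1])[1] == poly_divmod([1], [-1, 1])[1]
```

Exact rational arithmetic is used so that the remainder is not disturbed by rounding errors.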

A non-constant polynomial is called irreducible over K if it cannot be written as a product of two
polynomials over K of smaller but positive degree. A polynomial of degree n is called monic if its
highest coefficient $a_n$ is 1. The following theorem holds for polynomials over a field K:


Every non-zero polynomial f(x) over K has a representation $f(x) = c\,p_1(x) \cdots p_r(x)$ as a product of a field
element c and irreducible monic polynomials $p_1(x), \ldots, p_r(x)$ in K[x]. This representation is unique
up to the order of the irreducible polynomials.


Field of fractions. The concept of a field of fractions arises from the question, ‘What is the smallest
field containing a given integral domain?’ The question can be reformulated: ‘Given an integral
domain I, is there a field K containing I, whose elements can all be written as fractions of elements
of I?’ To obtain an idea of such a field one first assumes that it exists. One can operate on the
fraction a/b in the usual manner. The elements a and $b \neq 0$ are in I and so: (1) $a/b = a'/b'$ if $ab' = a'b$,
(2) $a/b + c/d = (ad + bc)/bd$, (3) $(a/b) \cdot (c/d) = ac/bd$, where, of course, all the denominators must
be different from zero. If the field exists, the rules (1), (2), and (3) must hold. One uses these rules
to construct a field K', by first replacing the required fractions by ordered pairs (a, b) of elements
a and $b \neq 0$ of I. An equivalence relation is defined on these pairs by $(a, b) \sim (a', b')$ if $ab' = a'b$
in I. The classes are denoted by square brackets [ ]. Addition and multiplication of the classes are
defined using the rules (2) and (3): by (2) $[a, b] + [c, d] = [ad + bc, bd]$ and by (3) $[a, b] \cdot [c, d]$
$= [ac, bd]$. One checks that these operations are independent of the particular representative of
the class [a, b] etc. chosen, and that the set of classes K' forms a field under this addition and
multiplication, in which every element [a, b] can be represented as a fraction [a, e]/[b, e], where e is the
unit element of I. It can now be shown that the elements [a, e] of K' form an integral domain I'
that is isomorphic to I by the mapping $[a, e] \to a$. Thus, a field of the required type has been
constructed; however, it does not contain I itself, but an integral domain isomorphic to I. Now if the
elements of I' are replaced by their isomorphic images in I, and the elementary operations are
redefined suitably, one obtains a field K containing I, in which every element can be
represented as a fraction a/b of elements of I. This field is called the field of fractions of the integral
domain I. It is the smallest field containing I. If I is the set of integers, then this process coincides
with the one used to construct the rational numbers (see Chapter 3.).
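The construction by pairs can be imitated directly. The following Python class is only a sketch for the case where I is the set of integers; the class name `Frac` is hypothetical, and each method corresponds to one of the rules (1), (2), and (3).

```python
class Frac:
    """The class [a, b] of a pair (a, b) of integers with b != 0."""

    def __init__(self, a, b=1):
        if b == 0:
            raise ZeroDivisionError("the second component must be non-zero")
        self.a, self.b = a, b

    def __eq__(self, other):
        # rule (1): [a, b] = [a', b'] if ab' = a'b
        return self.a * other.b == other.a * self.b

    def __add__(self, other):
        # rule (2): [a, b] + [c, d] = [ad + bc, bd]
        return Frac(self.a * other.b + self.b * other.a, self.b * other.b)

    def __mul__(self, other):
        # rule (3): [a, b] . [c, d] = [ac, bd]
        return Frac(self.a * other.a, self.b * other.b)

    def __repr__(self):
        return f"[{self.a}, {self.b}]"


# The result does not depend on the representatives chosen:
same_sum = Frac(1, 2) + Frac(1, 3) == Frac(2, 4) + Frac(3, 9)
# The classes [a, 1] behave like the integers a themselves:
embedded = Frac(3) + Frac(4) == Frac(7) and Frac(3) * Frac(4) == Frac(12)
```

No cancelling of common factors is needed, because equality of classes is tested by rule (1) and not by comparing the pairs themselves.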


The field of fractions of the integers is the set of rational numbers.
If I is the integral domain of polynomials K[x] over a field K, its field of fractions is called the
field of rational functions over K and is denoted by K(x).


Algebraic equations and field extensions. An algebraic equation of degree n is an equation f(x) = 0
whose left-hand side is a polynomial of degree n. In particular, the equation is called irreducible
over K if the polynomial f(x) is irreducible over K. The solutions of the equation f(x) = 0 are
called the roots of the polynomial f(x) or of the equation f(x) = 0. In general, algebraic equations
can only be solved in an extension field; for example, the need to define a root for every quadratic
polynomial has led to a considerable enlargement of the number system: the complex numbers.
Frequently a smaller extension is sufficient. For instance, the coefficients of the equations $x^2 - 2 = 0$
and $x^2 + 4 = 0$ lie in the field of rational numbers and their solutions lie in the field $\mathbb{Q}(\sqrt{2})$ and in
the Gaussian number field, respectively. If f(x) = 0 is an irreducible equation over K of degree
greater than 1, then it has no solution in K and an extension is necessary to find a solution.

If $\alpha$ is a solution of an irreducible equation f(x) = 0 over K, then $\alpha$ is called algebraic over K.
If $\Omega$ is an extension field of K and every element of $\Omega$ is algebraic over K, then $\Omega$ is called an
algebraic extension of K, and an algebraic extension of $\mathbb{Q}$ is called a number field. Extensions that are
not algebraic are called transcendental. If f(x) = 0 is an irreducible equation with coefficients in a
number field K, then by the fundamental theorem of algebra (see Chapter 4.) it has a solution $\alpha$ in the
field of complex numbers. The field $K(\alpha)$ obtained by adjoining $\alpha$ to K is the smallest subfield of the
complex numbers containing K and the root $\alpha$ of the equation. It is not possible to construct a
suitable extension of this type for the general equation of degree n, whose coefficients are indeterminates
in the algebraic sense, because there is no field available that is known in advance to contain at least one
solution of the equation. Since even in the case of a particular equation the fundamental theorem of
algebra only asserts the existence of a solution in the field of the complex numbers and gives no
method of constructing it, it seems plausible in all cases to try to find a general method of
constructing an extension field containing a root of a given equation, a so-called root field.




Construction of a root field. Let f(x) be a monic irreducible polynomial of degree n with
coefficients in a field K. The residue class field K[x]/(f(x)) contains a subfield $\bar{K}$ consisting of the
congruence classes $\bar{a}$ of the elements a of K. The class $\bar{a}$ of a consists of all polynomials leaving the
remainder a on division by f(x). The mapping $a \to \bar{a}$ is an isomorphism of K onto $\bar{K}$: $K \cong \bar{K}$. If $\bar{K}$
is replaced by K in K[x]/(f(x)), leaving all the operations and relations the same, one obtains a field
K' containing K and isomorphic to K[x]/(f(x)). Now let $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0$ and let
$\alpha$ be the class of all polynomials leaving the remainder x on division by f(x). Then
$\alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_0$ is, by the rules of addition and multiplication of residue classes, the class of f(x), but
this is the class of all polynomials leaving the remainder 0 on division by f(x), and in K' this has
been replaced by 0 itself. Thus, $\alpha^n + a_{n-1}\alpha^{n-1} + \cdots + a_1\alpha + a_0 = 0$, and $\alpha$ is a solution of the
equation f(x) = 0. The field K' contains a solution of the equation f(x) = 0 and is called a root
field of it.

The root field $K' = K(\alpha)$ is a simple algebraic extension of K, and every element of K' can be
written in the form $b_0 + b_1\alpha + \cdots + b_{n-1}\alpha^{n-1}$, where $b_0, b_1, \ldots, b_{n-1}$ are elements of K. The degree
of the extension $K(\alpha)$ over K is equal to the degree of the irreducible polynomial f(x): $[K(\alpha) : K]$
= degree (f(x)) = n.

Example 1: A root field obtained for the equation $x^2 + 1 = 0$ over the rational numbers $\mathbb{Q}$
is the simple extension $\mathbb{Q}(i)$ in which every element can be written a + bi where a and b are rational
numbers and $i^2 + 1 = 0$. This field is isomorphic to the field of Gaussian numbers.
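Arithmetic in a root field can be carried out on the coefficients $b_0, \ldots, b_{n-1}$ alone. The following Python sketch multiplies two residue classes modulo a monic polynomial f(x) over $\mathbb{Q}$ by replacing $x^n$ with $-(a_{n-1}x^{n-1} + \cdots + a_0)$; the function name `mul_mod` and the list representation (lowest degree first) are assumptions of this illustration.

```python
from fractions import Fraction

def mul_mod(u, v, f):
    """Product of the residue classes of u and v modulo the monic polynomial f.

    u, v, f are non-empty coefficient lists, lowest degree first; f[-1] must be 1.
    The result has exactly n = degree(f) coefficients b_0, ..., b_{n-1}.
    """
    n = len(f) - 1
    prod = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            prod[i + j] += Fraction(a) * b
    # Reduce from the top: x^k = x^(k-n) * x^n = -x^(k-n) * (a_{n-1}x^{n-1} + ... + a_0)
    for k in range(len(prod) - 1, n - 1, -1):
        c = prod[k]
        for i in range(n):
            prod[k - n + i] -= c * f[i]
    result = prod[:n]
    return result + [Fraction(0)] * (n - len(result))

f = [1, 0, 1]          # x^2 + 1
alpha = [0, 1]         # the class of x

alpha_squared = mul_mod(alpha, alpha, f)     # the class of -1
product = mul_mod([1, 2], [3, 4], f)         # (1 + 2i)(3 + 4i) = -5 + 10i
```

The same function works for any monic f; for f(x) = $x^3 - 2$, for instance, the class $\alpha$ of x satisfies $\alpha^3 = 2$.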


Splitting fields. Let f(x) be a monic polynomial of degree n with coefficients in a field K. The
splitting field of f(x) over K is the smallest extension field L of K over which f(x) splits into linear
factors: $f(x) = (x - \alpha_1)(x - \alpha_2) \cdots (x - \alpha_n)$, where $\alpha_1, \alpha_2, \ldots, \alpha_n$ are the roots of f(x) in L.
The splitting field of a polynomial or algebraic equation over K is the smallest extension of K
containing all the solutions of the algebraic equation. It is unique up to isomorphism. One can construct
the splitting field by repeated applications of the construction of a root field. First f(x) is factorized
into irreducible polynomials over K. If all the irreducible factors have degree 1, then K itself is the
splitting field. If not, one constructs a root field K' for one of the irreducible factors of degree >1
and factorizes f(x) into irreducible factors over K'. If all the factors now are of degree 1, then K'
is the required splitting field. Otherwise one chooses a factor of degree >1, and constructs a root
field K'' over K' by the same process. Again f(x) is factorized into irreducible polynomials over K'',
and so on. As the degree of at least one irreducible factor is reduced at each stage, the process must
end with a splitting field of f(x).


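Example 2: Consider the polynomial $f(x) = x^3 - 2$ over $\mathbb{Q}$. Its roots are $\alpha_1 = \sqrt[3]{2}$, $\alpha_2 = \omega\sqrt[3]{2}$,
$\alpha_3 = \bar{\omega}\sqrt[3]{2}$, where $\omega = \tfrac{1}{2}(-1 + i\sqrt{3})$ and $\bar{\omega} = \tfrac{1}{2}(-1 - i\sqrt{3})$. The splitting field is
$L = \mathbb{Q}(\sqrt[3]{2}, \omega\sqrt[3]{2}, \bar{\omega}\sqrt[3]{2}) = \mathbb{Q}(\sqrt[3]{2}, \omega)$, since $\bar{\omega} = -1 - \omega$.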
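The statements of Example 2 can be checked numerically. The following Python lines are only a floating-point plausibility check, not a proof; the variable names are chosen for this illustration.

```python
omega = complex(-1, 3 ** 0.5) / 2          # (-1 + i*sqrt(3)) / 2
omega_bar = omega.conjugate()              # (-1 - i*sqrt(3)) / 2
cbrt2 = 2 ** (1 / 3)                       # the real cube root of 2

roots = [cbrt2, omega * cbrt2, omega_bar * cbrt2]

# Each root should satisfy x^3 - 2 = 0 up to rounding errors.
residuals = [abs(r ** 3 - 2) for r in roots]

# omega_bar = -1 - omega, so adjoining omega also yields omega_bar.
relation_error = abs(omega_bar - (-1 - omega))
```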

Galois theory 


The Galois group and the fundamental theorem of Galois theory. The connection between solutions
of algebraic equations and group theory discovered by E. GALOIS leads to particularly beautiful results
in the theory of finite field extensions. For this reason the central part of field theory dealing with
solutions of algebraic equations is called Galois theory. If N is the splitting field of a polynomial
over P without repeated roots, then N has the important property that every relative automorphism
of any extension L of P containing N maps N onto itself, that is, induces an automorphism of N.
A field extension with this property is called normal. If K is an extension of P contained in L, but
not normal over P, then it is mapped by the relative automorphisms of L over P onto isomorphic
copies K', K'', ..., which are called the conjugates of K in L. The relative automorphisms of L
determine relative isomorphisms of K over P. A normal field extension K is distinguished by the property
that K is equal to all its conjugates.

Let f(x) be an irreducible polynomial of degree n over a field P. Suppose that an extension L of
P contains all the solutions $\alpha_1, \ldots, \alpha_n$ of the equation f(x) = 0, and that they are all distinct. If $\alpha$
is one of these solutions, then the conjugates of the simple extension $P(\alpha)$ are $P(\alpha_1), \ldots, P(\alpha_n)$,
of which one is identical with $P(\alpha)$. The n relative isomorphisms of $P(\alpha)$ map $\alpha$ to $\alpha_1, \ldots, \alpha_n$. Since
every relative isomorphism must take a root of f(x) to another root of f(x), there can be no further
relative isomorphisms of $P(\alpha)$.


This statement can be generalized. A finite extension $K = P(\beta_1, \ldots, \beta_m)$ can always be obtained
by adjunction of a single element: $K = P(\vartheta)$, provided that the irreducible equations whose roots
$\beta_1, \ldots, \beta_m$ are adjoined have no repeated roots. If $N = P(\vartheta)$ is a normal extension of P, then the
number of relative automorphisms of N is equal to the degree of the extension [N : P]. The relative
automorphisms of a normal extension N form a group of order [N : P] under multiplication of
mappings. This group is called the Galois group G of the normal extension N over P: [G : E] = [N : P].

If K is an intermediate field between P and N, then N is also normal over K, and those relative
automorphisms of the Galois group G of N over P that leave all the elements of K fixed, in other
words, the relative automorphisms of N over K, form a subgroup H of G, which is just the Galois
group of N over K.

In the above manner a subgroup H of the Galois group G is associated with every intermediate 
field K. This correspondence can be inverted. If H is a subgroup of G, then the elements of N that 
are left fixed by all the relative automorphisms in H form an intermediate field K. Thus, the in- 
vestigation of the intermediate fields between N and P is reduced to that of the subgroups of the 
Galois group. The methods of group theory can now be applied to field theory. If the subgroups 
of the Galois group G of N over P are known, one is in a position to survey completely all the 
extensions between P and N, and the intermediate fields, and their relationships to one another. 
If P is a field containing the rational numbers, then the fundamental theorem of Galois theory 
holds. 


Fundamental theorem of Galois theory. Let N be a finite normal extension of P and G its Galois
group. (I) There is a one-to-one correspondence between the subgroups H of G and the
intermediate fields K of the extension.

(II) If K and H correspond to each other, then H consists of all the relative automorphisms of N
that leave the elements of K fixed; and K consists of all the elements of N that are fixed under the
relative automorphisms in H.

(III) An intermediate field K is normal over P if and only if the associated subgroup H is normal in G. In
that case the Galois group of K over P is isomorphic to the factor group G/H.

(IV) The following relations hold: $[H : E] = [N : K]$ and $[G : H] = [K : P]$, where the tower of
fields $N \supseteq K \supseteq P$ corresponds to the chain of subgroups $E \subseteq H \subseteq G$.


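Example 3: Consider the finite normal extension $L = \mathbb{Q}(\sqrt[3]{2}, \omega\sqrt[3]{2}, \bar{\omega}\sqrt[3]{2}) = \mathbb{Q}(\sqrt[3]{2}, \omega)$, the
splitting field of the polynomial $x^3 - 2$ over $\mathbb{Q}$. To find the Galois group of L over $\mathbb{Q}$, one first
finds all relative isomorphisms of $\mathbb{Q}(\sqrt[3]{2})$ in L. Here: $\sqrt[3]{2} \to \sqrt[3]{2}$, $\sqrt[3]{2} \to \omega\sqrt[3]{2}$, or $\sqrt[3]{2} \to \bar{\omega}\sqrt[3]{2}$.
Similarly, $\omega$ can only be mapped to $\omega$ or to $\bar{\omega}$. Therefore the relative automorphisms of
$\mathbb{Q}(\sqrt[3]{2}, \omega)$ act on $\sqrt[3]{2}$ and $\omega$ in the following six possible ways:

    $\sqrt[3]{2} \to \sqrt[3]{2}$,              $\omega \to \omega$        ~ E
    $\sqrt[3]{2} \to \omega\sqrt[3]{2}$,        $\omega \to \omega$        ~ A
    $\sqrt[3]{2} \to \bar{\omega}\sqrt[3]{2}$,  $\omega \to \omega$        ~ $A^2$
    $\sqrt[3]{2} \to \sqrt[3]{2}$,              $\omega \to \bar{\omega}$  ~ B
    $\sqrt[3]{2} \to \bar{\omega}\sqrt[3]{2}$,  $\omega \to \bar{\omega}$  ~ AB
    $\sqrt[3]{2} \to \omega\sqrt[3]{2}$,        $\omega \to \bar{\omega}$  ~ $A^2B$

These six relative automorphisms form the Galois group G of L over $\mathbb{Q}$, with multiplication defined as
performing the two mappings in succession. If one notes that $\omega^2 = \bar{\omega}$ and $\omega^3 = 1$, it is easy to
verify the table above. Furthermore one has: $A^3 = E$, $B^2 = E$, and $BA = A^2B$. The subgroups
$\{E, A, A^2\}$, $\{E, B\}$, $\{E, AB\}$, and $\{E, A^2B\}$ correspond to the intermediate fields $\mathbb{Q}(\omega)$, $\mathbb{Q}(\sqrt[3]{2})$,
$\mathbb{Q}(\omega\sqrt[3]{2})$, and $\mathbb{Q}(\bar{\omega}\sqrt[3]{2})$. The complete correspondence between fields and subgroups is:

    $\mathbb{Q}(\sqrt[3]{2}, \omega)$   $\mathbb{Q}(\omega)$   $\mathbb{Q}(\sqrt[3]{2})$   $\mathbb{Q}(\omega\sqrt[3]{2})$   $\mathbb{Q}(\bar{\omega}\sqrt[3]{2})$   $\mathbb{Q}$
    E   $\langle A \rangle$   $\langle B \rangle$   $\langle AB \rangle$   $\langle A^2B \rangle$   G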
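The relations of Example 3 can be verified mechanically. In the following Python sketch a relative automorphism is encoded, as an assumption of this illustration, by a pair (k, s) meaning $\sqrt[3]{2} \to \omega^k\sqrt[3]{2}$ and $\omega \to \omega^s$ with $s = \pm 1$ (so that $s = -1$ stands for $\omega \to \bar{\omega}$); the product XY means that X is performed first and then Y.

```python
from itertools import product

def then(x, y):
    """The mapping obtained by performing x first and then y."""
    kx, sx = x
    ky, sy = y
    # y(x(cbrt2)) = y(omega^kx * cbrt2) = omega^(sy*kx) * omega^ky * cbrt2
    return ((ky + sy * kx) % 3, sx * sy)

def image_exponent(auto, a):
    """Exponent b such that auto maps omega^a * cbrt2 to omega^b * cbrt2."""
    k, s = auto
    return (s * a + k) % 3

E, A, B = (0, 1), (1, 1), (0, -1)
A2 = then(A, A)
AB = then(A, B)
A2B = then(A2, B)
group = {E, A, A2, B, AB, A2B}

# The six mappings are closed under composition.
closed = all(then(x, y) in group for x, y in product(group, repeat=2))

# Defining relations A^3 = E, B^2 = E, BA = A^2 B.
relations_hold = then(A2, A) == E and then(B, B) == E and then(B, A) == A2B

# Elements fixed by B, AB, A^2 B: cbrt2, omega*cbrt2, omega_bar*cbrt2 (exponents 0, 1, 2).
fixed = (image_exponent(B, 0), image_exponent(AB, 1), image_exponent(A2B, 2))
```

The tuple `fixed` equals (0, 1, 2) exactly when B, AB, and $A^2B$ leave $\sqrt[3]{2}$, $\omega\sqrt[3]{2}$, and $\bar{\omega}\sqrt[3]{2}$ fixed, in agreement with the intermediate fields listed in the example.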


The Galois group of an equation. If $f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_1x + a_0 = 0$ is an equation
with coefficients in $\mathbb{Q}$, then there are certain rational relations $H(\alpha_1, \ldots, \alpha_n) = 0$ between the solutions
$\alpha_1, \ldots, \alpha_n$ of the equation; for example the following equations (see Chapter 4.):

    $\alpha_1 + \alpha_2 + \cdots + \alpha_n = -a_{n-1}$
    $\alpha_1\alpha_2 + \alpha_1\alpha_3 + \cdots + \alpha_{n-1}\alpha_n = a_{n-2}$
    ...
    $\alpha_1\alpha_2 \cdots \alpha_n = (-1)^n a_0$

These equations are independent of the order in which the solutions are numbered. In other words:
these relations between the solutions remain invariant under all permutations of the solutions. It can
happen that there are further relations that depend on the particular equation in question, and that
these may or may not be invariant under certain permutations. If one considers the set of those
permutations that do not destroy any of the relations among the solutions, then the following result holds:


The set of those permutations that leave all relations $H(\alpha_1, \ldots, \alpha_n) = 0$ with coefficients in $\mathbb{Q}$
invariant is a subgroup of $S_n$. It is called the Galois group of the equation.




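Example 4: To find the Galois group of the equation $f(x) = (x^2 - 2)(x^2 - 3) = 0$. The roots
of this equation are $\alpha_1 = \sqrt{2}$, $\alpha_2 = -\sqrt{2}$, $\alpha_3 = \sqrt{3}$, $\alpha_4 = -\sqrt{3}$. It is required to find all the
permutations of four elements that leave all relations between these roots invariant. It is sufficient
to consider the relations $H_1(\alpha_1, \ldots, \alpha_4) = \alpha_1\alpha_2 = -2$ and $H_2(\alpha_1, \ldots, \alpha_4) = \alpha_3\alpha_4 = -3$, from which
it is easy to see that the permutations

$e = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}$, $p_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 3 & 4 \end{pmatrix}$, $p_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 4 & 3 \end{pmatrix}$, $p_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix}$

are exactly the elements of the Galois group G, because every permutation that takes $\alpha_1$ or $\alpha_2$
to $\alpha_3$ or $\alpha_4$, or vice versa, would destroy one of the relations $H_1$ or $H_2$.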
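The selection of permutations in Example 4 can be reproduced by brute force. The following Python sketch tests all 24 permutations against the two relations $\alpha_1\alpha_2 = -2$ and $\alpha_3\alpha_4 = -3$ only, which the example states to be sufficient; indices are counted from 0 and floating-point comparison is used, both assumptions of this illustration.

```python
from itertools import permutations
from math import isclose, sqrt

roots = [sqrt(2), -sqrt(2), sqrt(3), -sqrt(3)]    # alpha_1, ..., alpha_4

def preserves_relations(p):
    """True if replacing alpha_i by alpha_p(i) keeps both relations valid."""
    a = [roots[i] for i in p]
    return isclose(a[0] * a[1], -2) and isclose(a[2] * a[3], -3)

galois_group = [p for p in permutations(range(4)) if preserves_relations(p)]
```

Exactly four permutations survive; written with indices 1 to 4 they are e, $p_2$, $p_1$, and $p_3$ of the example.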


If f(x) = 0 is the general equation of degree n, that is, f(x) = 0 is an equation with indeterminate 
coefficients, which can be arbitrarily replaced by elements of any field, then there are no further 
relations apart from the ones given above, the so-called elementary symmetric relations and their 
consequences. 


The Galois group of the general equation of degree n is the symmetric group $S_n$.


The Galois groups of an equation and of its splitting field. To find a connection between the Galois
group of an equation f(x) = 0 and that of a field extension, one considers the splitting field L of
the polynomial f(x) over $\mathbb{Q}$. Let $\alpha_1, \ldots, \alpha_n$ be the roots of the polynomial in L, and assume that
they are all distinct. If A is a relative automorphism of L over $\mathbb{Q}$, then every rational relation
$H(\alpha_1, \ldots, \alpha_n) = 0$ with coefficients in $\mathbb{Q}$ is taken by A to $H(A\alpha_1, \ldots, A\alpha_n) = 0$. Further, since every
relative automorphism of L takes roots of f(x) to roots of f(x), one has $A\alpha_1 = \alpha_{i_1}$, $A\alpha_2 = \alpha_{i_2}$, ...,
$A\alpha_n = \alpha_{i_n}$, say. Thus, every relative automorphism A, in other words, every element of the Galois
group of L over $\mathbb{Q}$, defines a permutation $p = \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}$ of the roots of the equation f(x) = 0,
under which every valid relation $H(\alpha_1, \ldots, \alpha_n) = 0$ goes over into a valid relation $H(\alpha_{i_1}, \ldots, \alpha_{i_n}) = 0$,
and the permutation belongs to the Galois group of the equation. It can be shown that the mapping
defined by $A \to p$ is, in fact, an isomorphism of the Galois group of the extension L over $\mathbb{Q}$ onto the
Galois group of the equation f(x) = 0.
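Example 5: The Galois group of the equation $f(x) = (x^2 - 2)(x^2 - 3) = 0$ consists of the elements
(see Example 4):

$e = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}$, $p_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 3 & 4 \end{pmatrix}$, $p_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 4 & 3 \end{pmatrix}$, $p_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix}$.

The Galois group of the corresponding field extension $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ consists of the elements $e'$, $p_1'$,
$p_2'$, and $p_3'$, which have the following effect on $\sqrt{2}$ and $\sqrt{3}$:

    $e'$   ~ $\sqrt{2} \to \sqrt{2}$,   $\sqrt{3} \to \sqrt{3}$       $e' \to e$
    $p_1'$ ~ $\sqrt{2} \to -\sqrt{2}$,  $\sqrt{3} \to \sqrt{3}$       $p_1' \to p_1$
    $p_2'$ ~ $\sqrt{2} \to \sqrt{2}$,   $\sqrt{3} \to -\sqrt{3}$      $p_2' \to p_2$
    $p_3'$ ~ $\sqrt{2} \to -\sqrt{2}$,  $\sqrt{3} \to -\sqrt{3}$      $p_3' \to p_3$

It is very easy to check that this mapping is an isomorphism between the two Galois groups.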


Solution of equations by radicals. Apart from the problem of the existence of solutions of an equation
f(x) = 0, which has been solved by the construction of the splitting field, there is also the problem
of determining them, that is, of finding a method of giving their precise values. For quadratic
equations the formula has been known for a very long time. The formula of CARDANO can be used to solve
cubic equations, and FERRARI, a pupil of CARDANO, was able to give a corresponding formula for
quartic equations, that is, equations of the fourth degree. These formulae all use only the four
basic arithmetical operations and the extraction of roots. After FERRARI all attempts to find a general
formula using only these operations for equations of the fifth or higher degrees remained
unsuccessful. The ‘power of radicals’ was greatly overestimated, as will be shown by the following argument.
As a justification of these efforts it should be remarked that the expectation of finding corresponding
solution formulae for the general equation of degree higher than 4 was fostered by the fact that
one can very well find special equations whose solutions can be expressed by radicals. Such equations
are called soluble, or soluble by radicals. A radical is a solution of a pure equation of the nth degree
$x^n - a = 0$ and is denoted by $\sqrt[n]{a}$.
A radical expression over a field P is defined in the following manner: there is an element $g_1 \in P$,
and finitely many polynomials $g_2(x_1)$, $g_3(x_1, x_2)$, ..., $g_m(x_1, \ldots, x_{m-1})$, and $g(x_1, \ldots, x_m)$ and positive
integers $n_1, \ldots, n_m$ such that $\beta = g(\beta_1, \ldots, \beta_m)$ with $\beta_1 = \sqrt[n_1]{g_1}$, $\beta_2 = \sqrt[n_2]{g_2(\beta_1)}$,
$\beta_3 = \sqrt[n_3]{g_3(\beta_1, \beta_2)}$, ..., $\beta_m = \sqrt[n_m]{g_m(\beta_1, \ldots, \beta_{m-1})}$.




Example 6: With $g_1 = 2$, $g_2 = 6x_1^3 + 5x_1 + 3$, and $g(x_1, x_2) = (1 + 3x_1)^2 x_2 + 7x_1 + 2$,
and $n_1 = 2$ and $n_2 = 4$, one obtains the radical expression $\beta$:

$\beta = g(\beta_1, \beta_2) = (1 + 3\sqrt{2})^2 \sqrt[4]{6(\sqrt{2})^3 + 5\sqrt{2} + 3} + 7\sqrt{2} + 2$

over the field of rational numbers. Here one has

$\beta_1 = \sqrt{2}$ and $\beta_2 = \sqrt[4]{3 + 5\sqrt{2} + 6(\sqrt{2})^3}$.
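The stepwise structure of a radical expression is easily seen when its value is computed. The following Python lines evaluate the expression of Example 6 approximately in floating-point arithmetic; taking the positive real roots is an assumption of this illustration, since a radical is determined only up to the choice of a root of the pure equation.

```python
def g2(x1):
    return 6 * x1 ** 3 + 5 * x1 + 3

def g(x1, x2):
    return (1 + 3 * x1) ** 2 * x2 + 7 * x1 + 2

g1, n1, n2 = 2, 2, 4

beta1 = g1 ** (1 / n1)             # a root of x^2 - g1 = 0
beta2 = g2(beta1) ** (1 / n2)      # a root of x^4 - g2(beta1) = 0
beta = g(beta1, beta2)             # the radical expression itself
```

Each step adjoins one radical: first $\beta_1$ to the rational numbers, then $\beta_2$ to the field obtained in the first step.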
An equation f(x) = 0 is called soluble by radicals over P if its solutions are radical expressions
over P. In the language of field theory, a radical expression corresponds to a tower of fields:

$P = K_0 \subseteq K_0(\beta_1) = K_1 \subseteq K_1(\beta_2) = K_2 \subseteq K_2(\beta_3) = K_3 \subseteq \cdots \subseteq K_{m-1}(\beta_m) = K_m = K$,

where $\beta$ is an element of K and every extension $K_i$ over $K_{i-1}$ is obtained by solving a pure equation
$x^{n_i} - g_i(\beta_1, \ldots, \beta_{i-1}) = 0$. In this case the field K is called soluble. Thus, in the language of field
theory the solubility of an equation can be expressed in the following manner:


The equation f(x) = 0 is soluble by radicals if and only if there exists a soluble field K containing 
the splitting field L of the polynomial f(x). 


It can be proved that under these circumstances the splitting field L of f(x) must itself be soluble:


The equation f(x) = 0 is soluble by radicals if and only if the splitting field L of the polynomial
f(x) can be reached by a tower of fields $P = K_0 \subseteq K_0(\beta_1) = K_1 \subseteq \cdots \subseteq K_{m-1}(\beta_m) = K_m = L$,
in which each $K_{i+1} = K_i(\beta_{i+1})$ is obtained from $K_i$ by adjunction of a solution of a pure equation
$x^{n_{i+1}} - a_i = 0$ with $a_i$ in $K_i$.

It can be shown that the existence of such a tower implies the existence of a tower in which each 
field is the splitting field of a pure equation over its predecessor. By the fundamental theorem of 
Galois theory this implies the existence of series of subgroups in the Galois group G: 

$G = H_m \supset H_{m-1} \supset \cdots \supset H_1 \supset H_0 = E$,
in which each subgroup is normal in its predecessor, and the factor groups are isomorphic to the 
Galois groups of the extensions in the tower. These are always Abelian for pure equations, if the 
base field contains all the nth roots of unity. From this it can be shown that the Galois group has 
a composition series with factors of prime order. The converse is also true, and so one obtains the 
definitive criterion: 


The equation f(x) = 0 is soluble by radicals if and only if its Galois group is soluble. 


Since the symmetric groups of degree 2, 3, and 4 are soluble, this implies the solubility of the
general equations of degree 2, 3, and 4 by radicals. On the other hand, the symmetric groups of
degree 5 and higher are not soluble, and therefore there can be no formulae for the solutions of the
general equations of degree 5 and higher. This result was discovered independently by GALOIS and
by ABEL; the latter did not, however, succeed in giving Galois' general criterion for the
solubility of particular equations by radicals.
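For small n the solubility of $S_n$ can be tested by computer. The following Python sketch does not construct a composition series; instead it uses the equivalent standard criterion that a finite group is soluble if and only if its derived series (repeatedly forming the subgroup generated by all commutators $aba^{-1}b^{-1}$) ends with the trivial group. The function names and the encoding of a permutation as a tuple of images are assumptions of this illustration.

```python
from itertools import permutations

def compose(p, q):
    """The permutation 'first q, then p'."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, n):
    """All products of the generators (a subgroup, since the group is finite)."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements

def derived_subgroup(group, n):
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group for b in group
    }
    return generated_subgroup(commutators, n)

def is_soluble(n):
    """True if the symmetric group S_n is soluble (derived series reaches E)."""
    group = set(permutations(range(n)))
    while len(group) > 1:
        derived = derived_subgroup(group, n)
        if len(derived) == len(group):   # the series has become stationary
            return False
        group = derived
    return True

results = {n: is_soluble(n) for n in range(2, 6)}
```

For n = 5 the series becomes stationary at the alternating group $A_5$ with 60 elements, which is why the test fails there.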


Cubic equations. The reduced form of the cubic equation (see Chapter 4.) is: $x^3 + px + q = 0$,
where p and q are elements of a field P. The Galois group of the equation is $S_3$. The
composition series $S_3 \supset A_3 \supset E$ of the Galois group corresponds to a tower of fields $P \subset K \subset N$.
Furthermore one has the relations $[S_3 : A_3] = [K : P] = 2$ and $[A_3 : E] = [N : K] = 3$. To make
matters simpler, assume that P already contains the cube roots of unity. To get from P to K one need
only adjoin a square root $\sqrt{D}$ to P, where $\sqrt{D}$ must remain fixed under all permutations of $A_3$.
From this condition one obtains $\sqrt{D} = \sqrt{-4p^3 - 27q^2}$. If one denotes the solutions of the original
equation by $\alpha_1, \alpha_2, \alpha_3$ and forms Lagrange's resolvent, $r = \alpha_1 + \omega\alpha_2 + \bar{\omega}\alpha_3$, where $\omega$, $\bar{\omega}$ are the
complex cube roots of unity, then it can be shown that $r^3 = -27q/2 + (3/2)\sqrt{-3D} = s$ lies in the field
$K = P(\sqrt{D})$. Now one obtains the extension N by adjoining $r = \sqrt[3]{s}$ to K. The roots $\alpha_1, \alpha_2, \alpha_3$
can now be computed from the equations $\alpha_1 + \alpha_2 + \alpha_3 = 0$, $\alpha_1 + \omega\alpha_2 + \bar{\omega}\alpha_3 = r$, and
$\alpha_1 + \bar{\omega}\alpha_2 + \omega\alpha_3 = -3p/r$.
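The three linear equations can be solved explicitly: adding them gives $3\alpha_1 = r + r'$ with $r' = -3p/r$, and multiplying the second and third by suitable cube roots of unity before adding gives $3\alpha_2 = \bar{\omega}r + \omega r'$ and $3\alpha_3 = \omega r + \bar{\omega}r'$. The following Python sketch carries this out numerically with complex floating-point numbers; the function name is hypothetical, and $p \neq 0$ is assumed so that $r \neq 0$.

```python
import cmath

def solve_reduced_cubic(p, q):
    """Approximate roots of x^3 + p*x + q = 0 via Lagrange's resolvent (p != 0)."""
    omega = complex(-0.5, 3 ** 0.5 / 2)
    omega_bar = omega.conjugate()
    D = -4 * p ** 3 - 27 * q ** 2
    s = -27 * q / 2 + 1.5 * cmath.sqrt(-3 * D)   # s = r^3 lies in K = P(sqrt(D))
    r = s ** (1 / 3)                             # one cube root of s
    r_prime = -3 * p / r                         # alpha_1 + omega_bar*alpha_2 + omega*alpha_3
    return [
        (r + r_prime) / 3,
        (omega_bar * r + omega * r_prime) / 3,
        (omega * r + omega_bar * r_prime) / 3,
    ]

# x^3 - 7x + 6 = (x - 1)(x - 2)(x + 3)
roots = solve_reduced_cubic(-7, 6)
```

Any of the three cube roots of s may be taken for r; a different choice only permutes the computed roots.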

Constructions by ruler and compass. The problem of constructions by ruler and compass alone
can be formulated in the following way: from finitely many given points in the plane, to construct
in finitely many steps a required point, where each step is of one of the following types:

(1) The ruler may be used only to draw the line joining two given or previously constructed points. 
(2) The compass may only be used to draw a circle whose centre is a given or previously constructed 
point and whose radius is the distance between two given or previously constructed points. 

(3) New points can be constructed by intersecting two straight lines, a line and a circle, or two 
circles, that have been constructed by the rules (1) and (2). 

To obtain a survey of the points that can be constructed one translates the geometrical problem
into algebraic language. This is done by introducing a rectangular Cartesian coordinate system
to describe the points $P_1, P_2, \ldots, P_n$, in which $P_1$, say, is the origin and the point (0, 1) is $P_2$. If
K is the smallest field containing the coordinates of all the given points, then it can be shown by
means of analytic geometry that all points constructible in a single step of type (1), (2), or (3) must
have coordinates in K or in a field obtained from K by adjoining square roots. On the other hand,
it can also be shown that all the rational operations and the extracting of square roots can be
performed by sequences of steps of the types (1), (2), and (3). Generally, one obtains the following
important criterion:


A point can be constructed by ruler and compass alone if and only if its coordinates lie in a finite 
normal extension field of K whose degree over K is a power of 2. 


Since in many cases the problem is to construct a quantity x that is given by an equation f(x) = 0,
the criterion can be reworded: A quantity x can be constructed by ruler and compass if and only
if the equation f(x) = 0 can be solved by quadratic radicals over K.

The famous problems of classical Greek mathematics, to construct by ruler and compass alone
a square of area equal to that of a given circle (the squaring of the circle), to divide an angle into three
equal parts (trisection of the angle), and to find a cube of double the volume of a given cube (the
doubling of the cube) all turn out to be insoluble. The algebraic formulation of the doubling of the
cube is $x^3 - 2 = 0$, where x is the edge of the required cube and the edge of the given cube is taken
as unit length. This equation is irreducible over the field of rational numbers. Each of its roots
generates a field extension of degree 3. Such a field can never be contained in a field extension whose
degree is a power of 2.


The trisection of the angle $\alpha$ is equivalent to constructing a segment of length $\cos(\alpha/3)$, where
$\cos \alpha$ is given. The resulting equation is $4[\cos(\alpha/3)]^3 - 3\cos(\alpha/3) - \cos \alpha = 0$. The question is
whether the roots of the equation $4x^3 - 3x - \cos \alpha = 0$ lie in an extension field of degree $2^m$
(m a natural number) of $\mathbb{Q}(\cos \alpha)$, where $\mathbb{Q}$ is the field of rational numbers. It can be shown that
the equation is, in general, irreducible, and then just as with the doubling of the cube each root
generates a field extension of degree 3, and there cannot be a general ruler and compass construction
method for the trisection of an angle.
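For a cubic polynomial with rational coefficients irreducibility over $\mathbb{Q}$ is equivalent to having no rational root, and every rational root p/q in lowest terms has p dividing the constant term and q dividing the leading coefficient. The following Python sketch applies this test to $x^3 - 2$ and, as an illustrative special case not taken from the text, to the angle $\alpha = 60°$, for which $4x^3 - 3x - \cos \alpha = 0$ becomes $8x^3 - 6x - 1 = 0$. The function names are assumptions of this illustration.

```python
from fractions import Fraction

def divisors(n):
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of a polynomial with integer coefficients.

    coeffs lists the coefficients from the highest degree down;
    the leading and the constant coefficient must be non-zero.
    """
    lead, const = coeffs[0], coeffs[-1]
    candidates = {
        Fraction(sign * p, q)
        for p in divisors(const) for q in divisors(lead) for sign in (1, -1)
    }

    def value(x):
        result = Fraction(0)
        for c in coeffs:          # Horner's scheme
            result = result * x + c
        return result

    return sorted(x for x in candidates if value(x) == 0)

doubling_cube = rational_roots([1, 0, 0, -2])      # x^3 - 2
trisecting_60 = rational_roots([8, 0, -6, -1])     # 8x^3 - 6x - 1
```

Both lists are empty, so both cubics are irreducible over $\mathbb{Q}$ and each root generates an extension of degree 3; in particular the angle of 60° cannot be trisected by ruler and compass.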

The squaring of the circle requires the construction of a straight line segment of length $\sqrt{\pi}$. Since
$\pi$ is transcendental over the rational numbers, that is, does not satisfy any algebraic equation with
rational coefficients, this problem also is insoluble.

The construction of a regular n-gon. The nth roots of unity divide the unit circle into n equal parts.
A regular n-gon inscribed in the unit circle can be constructed by ruler and compass alone if and
only if n is of the form $2^l p_1 p_2 \cdots p_r$, where l is a non-negative integer and $p_1, p_2, \ldots, p_r$ are distinct
Fermat primes, that is, primes of the form $2^m + 1$. Thus, regular n-gons can be constructed for
n = 3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20, 24, ..., 257, ..., using only ruler and compass.
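The criterion is easily turned into a test. The following Python sketch relies on the five Fermat primes known at present (3, 5, 17, 257, 65537); it is therefore only valid for values of n far below any further Fermat prime, should one exist. The names are assumptions of this illustration.

```python
KNOWN_FERMAT_PRIMES = (3, 5, 17, 257, 65537)

def is_constructible(n):
    """True if the regular n-gon is constructible by ruler and compass."""
    if n < 3:
        return False
    while n % 2 == 0:              # remove the factor 2^l
        n //= 2
    for p in KNOWN_FERMAT_PRIMES:
        if n % p == 0:             # each Fermat prime may occur at most once
            n //= p
    return n == 1

constructible = [n for n in range(3, 25) if is_constructible(n)]
```

The list obtained for n up to 24 agrees with the values given above; the 7-gon and the 9-gon, for instance, are excluded.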


Applications 


Field theory has found manifold applications in other parts of mathematics. Its methods are used 
in Galois theory and algebraic number theory. Many classes of functions of interest in complex 
analysis (see Chapter 23.) form fields, for example, the rational functions and the elliptic 
functions. On the other hand, the methods of complex analysis are sometimes used in field theory, 
for instance, in the proof of the so-called fundamental theorem of algebra, and other investiga- 
tions on the field of complex numbers. 


Algebraic varieties are discussed in Chapter 32. and in Chapter 33. 


17. Linear algebra 


17.1. Systems of linear equations
17.2. Determinants
17.3. Vector spaces
17.4. Linear maps
17.5. Matrices
17.6. Eigenvalues
17.7. Multilinear algebra


17.1. Systems of linear equations 


In Chapter 4. it was examined under what circumstances, and by what means, equations of
the form ax = b or ax + by = c can be solved, when a, b, and c are rational, real, or complex
numbers. The first equation contains the variable x, and the second the two variables x and y. In
both cases, the variables occur only to the first power; such equations are called linear in one or
two variables, respectively. In general, a linear equation in n variables (or unknowns) $x_1, x_2, \ldots, x_n$
is an equation of the form $a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$.

The numbers $a_1, a_2, \ldots, a_n$ are called the coefficients and b is called the constant or absolute term
of the equation. For n = 1 and n = 2 one obtains the cases mentioned above.

Many problems in mathematics lead not to a single linear equation, but to a whole system of such 
equations. 


Systems of linear equations. A simple example of a system (or a set) of simultaneous linear equations
is given by the two equations

$$
\begin{aligned}
a_1x + b_1y &= c_1,\\
a_2x + b_2y &= c_2.
\end{aligned}
$$

Here $a_1, a_2, b_1, b_2, c_1$, and $c_2$ are given numbers. A solution of such a system of equations is a pair
of numbers $(\bar{x}, \bar{y})$ such that when x is replaced by $\bar{x}$ and y by $\bar{y}$, both equations are satisfied
simultaneously, that is, both become true propositions on equality. Generalizing this situation one
understands by a system of m linear equations in n variables (unknowns) $x_1, x_2, \ldots, x_n$ a system of
the form

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1,\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2,\\
&\ \,\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m.
\end{aligned}
$$

In this system the $a_{ij}$ and $b_i$ are given numbers for i = 1, ..., m, and j = 1, ..., n. The index i
indicates in which equation of the system the number occurs, and the index j of the coefficient $a_{ij}$
indicates the unknown with which it is associated; for instance, $a_{23}$ (read ‘a two three’, ‘a sub two
three’, not ‘a twenty three’) is the coefficient of the unknown $x_3$ in the second equation. A single
linear equation forms a special case of a system, characterized by m = 1.

A system of linear equations is called homogeneous if the constant terms $b_1, b_2, \ldots, b_m$ are all
zero; otherwise, that is, if even one $b_i$ is not zero, the system is called inhomogeneous. If in an
inhomogeneous system all the constant terms are replaced by zeros, then the resulting system is
called the associated homogeneous system.

A solution of a system of m linear equations in n unknowns is a sequence of numbers $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n$
such that when all the unknowns are replaced by the corresponding numbers, then all the equations
are satisfied simultaneously. A sequence of numbers $c_1, c_2, \ldots, c_n$ is called an n-tuple and is written
$(c_1, c_2, \ldots, c_n)$. Two n-tuples $(c_1, c_2, \ldots, c_n)$ and $(d_1, d_2, \ldots, d_n)$ are equal if and only if $c_1 = d_1$,
$c_2 = d_2$, ..., and $c_n = d_n$. If the n-tuple $(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n)$ is a solution of the given system of linear
equations, it may be necessary to check whether the values $\bar{x}_i$ lie in the permitted fundamental
domain of variability. For simplicity it will be assumed that this domain is the set of real (or complex)
numbers, provided that the coefficients and constants of the system are real (or complex) numbers,
respectively.

The investigation of the solutions of a system of linear equations leads to three problems, which 
have to be treated each in its own manner. The first is the question of the existence of solutions. This 
asks under what conditions a system has solutions; for even the single equation $0 \cdot x = 1$ has no
solution. The second problem is to find a method that gives a solution of a given system of linear 
equations. Finally, the third problem is to describe the totality of all solutions of the given system 
of equations. 


The existence of solutions. The following manipulations of a system of linear equations do not 
alter the solubility or non-solubility nor the solutions of the system: 


1. Addition of equations of the system to other equations of the system. 
2. Multiplication of equations of the system by non-zero factors. 
3. Changing the sequence of the equations. 
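Example 1:

$$
\begin{aligned} 2x_1 + x_2 &= 1\\ x_1 - 3x_2 &= 4 \end{aligned}
\qquad\qquad
\begin{aligned} 2x_1 + x_2 &= 1\\ -2x_1 + 6x_2 &= -8 \end{aligned}
$$


All the systems have the unique solution $\bar{x}_1 = 1$, $\bar{x}_2 = -1$.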


The following criterion gives a theoretical insight into the solubility of a system. It can also be 
used in practice to prove that a given system has no solutions. 


A system of linear equations has a solution if and only if the following condition holds: whenever 
repeated application of the operations 1. and 2. leads to an equation in which all the coefficients are 
zero, then the constant term of that equation is also zero. 


Example 2: The system of three equations in $x_1, x_2, x_3$ considered here (its display is not reproduced)
has no solution. The equations are multiplied by −1, 1, and −1, respectively, and then added; the
resulting equation is


$0 \cdot x_1 + 0 \cdot x_2 + 0 \cdot x_3 = 1$.




Homogeneous equations and the complete set of solutions. The problem of finding all the solutions 
of a given inhomogeneous system can be reduced in part to the simpler problem of finding all the 
solutions of the associated homogeneous system. This is a consequence of the following easily 
verified properties of solutions of homogeneous systems (note the analogy to the operations 
1. and 2.). 


1. If $(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n)$ and $(\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_n)$ are both solutions of a homogeneous system of linear
equations, then so is their sum $(\bar{x}_1 + \bar{y}_1, \bar{x}_2 + \bar{y}_2, \ldots, \bar{x}_n + \bar{y}_n)$, which is defined componentwise.

2. If $(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n)$ is a solution, then so is its multiple by a factor c, $(c\bar{x}_1, c\bar{x}_2, \ldots, c\bar{x}_n)$.

3. The n-tuple (0, 0, ..., 0) is always a solution of any homogeneous system and is called the
trivial solution.


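From the properties 1. and 2. it follows that if $(x_1^{(1)}, x_2^{(1)}, \ldots, x_n^{(1)})$, $(x_1^{(2)}, x_2^{(2)}, \ldots, x_n^{(2)})$, ...,
$(x_1^{(m)}, x_2^{(m)}, \ldots, x_n^{(m)})$ are m solutions of a homogeneous system of linear equations, then so is


$$
\begin{aligned}
&\lambda_1 (x_1^{(1)}, \ldots, x_n^{(1)}) + \lambda_2 (x_1^{(2)}, \ldots, x_n^{(2)}) + \cdots + \lambda_m (x_1^{(m)}, \ldots, x_n^{(m)})\\
&\quad = (\lambda_1 x_1^{(1)} + \lambda_2 x_1^{(2)} + \cdots + \lambda_m x_1^{(m)},\ \ldots,\ \lambda_1 x_n^{(1)} + \lambda_2 x_n^{(2)} + \cdots + \lambda_m x_n^{(m)})
\end{aligned}
$$


for any real numbers $\lambda_i$ (i = 1, 2, ..., m). Such a sum of multiples is called a linear combination.
Thus, the statements under 1. and 2. together say that any linear combination of solutions of a
system of homogeneous linear equations is again a solution.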

These properties do not hold for inhomogeneous systems. However, the following theorem 
shows the connection between the solutions of an inhomogeneous system and those of the associated 
homogeneous system. 


If a solution of the associated homogeneous system is added to an arbitrary solution of the in- 
homogeneous system, then the result is again a solution of the inhomogeneous system. If an arbitrary 
but fixed solution of the inhomogeneous system is chosen, then every solution of the inhomogeneous 
system can be obtained by adding a solution of the homogeneous system to the chosen solution. 
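
The theorem can be illustrated numerically. The small system below (two equations in three unknowns) and the two solutions used are hypothetical examples chosen for this sketch, not taken from the text; `apply` is an illustrative helper that evaluates the left-hand sides.

```python
# Hypothetical system:  x1 + x2 + x3 = 6,  x1 - x2 = 1
A = [[1, 1, 1],
     [1, -1, 0]]
b = [6, 1]

def apply(A, x):
    """Evaluate the left-hand sides of the system for the n-tuple x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

particular = (3, 2, 1)     # one solution of the inhomogeneous system
homogeneous = (1, 1, -2)   # a solution of the associated homogeneous system

# Adding any multiple of the homogeneous solution gives another solution.
shifted = [tuple(p + c * h for p, h in zip(particular, homogeneous))
           for c in (-2, 0, 5)]
results = [apply(A, x) for x in shifted]
print(results)
```

Each entry of `results` equals the right-hand side `[6, 1]`, as the theorem predicts.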


Solution by elimination and Gauss’s algorithm. This method can be used to find just one solution, 
or the whole manifold of solutions, for a given system of linear equations. 

The algorithm is particularly suitable for use on computers, and for this reason it has gained in 
significance in recent years. The basic idea of the method is the following: using the operations 1. 
to 3. the given set of m equations in n unknowns is transformed into a new set of m equations in n
unknowns, in which one of the unknowns, say $x_1$, occurs only in a single equation. One says that
$x_1$ has been eliminated from the other m − 1 equations. By the same method these m − 1 equations
are transformed so that another unknown, say $x_2$, occurs only in a single one of them. Repeating
this process a system is finally obtained in which $x_1$ occurs only in the first equation, $x_2$ occurs only
there and in the second, and so on. This kind of system is easy to solve.

The following very simple example should clarify the working of Gauss’s algorithm. 


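Example 3:

$$
\begin{aligned}
3x_1 - 3x_2 + x_3 &= 0 \qquad (1)\\
4x_2 - x_3 &= 5 \qquad (2)\\
2x_1 - 2x_2 + x_3 &= 1 \qquad (3)
\end{aligned}
$$

Equation (2) does not contain $x_1$; this occurs only in equations (1) and (3). Multiplication of (1)
by −2/3 and subsequent addition to (3) yields a new set of equations (1′), (2′), and (3′):

$$
\begin{aligned}
3x_1 - 3x_2 + x_3 &= 0 \qquad (1')\\
4x_2 - x_3 &= 5 \qquad (2')\\
\tfrac{1}{3}x_3 &= 1 \qquad (3')
\end{aligned}
$$

Equation (3′) gives $\bar{x}_3 = 3$. Substituting this value in (2′) the value $\bar{x}_2 = 2$ is obtained and
now (1′) yields $\bar{x}_1 = 1$.
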
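
The elimination procedure of Example 3 can be sketched in a few lines. This is a minimal illustration, assuming a square system with a unique solution and using exact rational arithmetic; the function name `gauss_solve` is hypothetical, and the treatment of systems without a unique solution is reduced to raising an error.

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Solve a square system A x = b that has a unique solution."""
    n = len(A)
    # Augmented scheme [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Operation 3: move an equation with non-zero coefficient into place.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the system has no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        # Operations 1 and 2: subtract multiples of the pivot equation
        # from the equations below it.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # Back substitution, starting with the last unknown.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# The system of Example 3.
solution = gauss_solve([[3, -3, 1], [0, 4, -1], [2, -2, 1]], [0, 5, 1])
print(solution)
```

For the system of Example 3 the result is the triple (1, 2, 3), in agreement with the values found by hand.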
A more difficult example is the case of three equations in four unknowns with general coefficients.
By applying operation 3., if necessary, it can be assumed that $a_{11}$ is not zero in the system S:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 &= b_1 \qquad (1)\\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + a_{24}x_4 &= b_2 \qquad (2)\\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + a_{34}x_4 &= b_3 \qquad (3)
\end{aligned}
$$

Then the system $S_1$ can be obtained from S by subtracting $a_{21}/a_{11}$ and $a_{31}/a_{11}$ times
equation (1) from equations (2) and (3), respectively:

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 &= b_1 \qquad (1')\\
a'_{22}x_2 + a'_{23}x_3 + a'_{24}x_4 &= b'_2 \qquad (2')\\
a'_{32}x_2 + a'_{33}x_3 + a'_{34}x_4 &= b'_3 \qquad (3')
\end{aligned}
$$


Here $a'_{ij} = a_{ij} - a_{i1}a_{1j}/a_{11}$ and $b'_i = b_i - a_{i1}b_1/a_{11}$ (i = 2, 3; j = 2, 3, 4). If all the coefficients
of (2′) and (3′) are zero and $b'_2$ or $b'_3$ is not, then by the criterion of the preceding paragraph the
system has no solution. If, on the other hand, $b'_2 = b'_3 = 0$, then $x_2$, $x_3$, and $x_4$ can be chosen
arbitrarily and $x_1$ determined by equation (1′). If neither of these cases occurs, then by interchanging
(2′) and (3′), if necessary, and possibly renaming and interchanging the unknowns, it can be arranged
that $a'_{22}$ is not zero. To avoid overloading the notation it is assumed that this is already the case in
the system $S_1$ as it stands (the reader is referred to the subsequent examples).




The system $S_2$ is obtained from $S_1$ by subtracting $a'_{32}/a'_{22}$ times equation (2′) from
equation (3′):

$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 &= b_1 \qquad (1'')\\
a'_{22}x_2 + a'_{23}x_3 + a'_{24}x_4 &= b'_2 \qquad (2'')\\
a''_{33}x_3 + a''_{34}x_4 &= b''_3 \qquad (3'')
\end{aligned}
$$

Here $a''_{3j} = a'_{3j} - a'_{32}a'_{2j}/a'_{22}$ and $b''_3 = b'_3 - a'_{32}b'_2/a'_{22}$ (j = 3, 4). As at the previous stage,
if $a''_{33} = a''_{34} = 0$, there are two cases to distinguish. If $b''_3 \neq 0$, the system has no solution. If
$b''_3 = 0$, then $x_3$ and $x_4$ can be chosen arbitrarily and $x_1$ and $x_2$ determined from (1″) and (2″).

Now let $a''_{33} \neq 0$. If $x_4$ is taken as an arbitrary number d, then $x_3$ is determined from (3″), $x_2$
from (2″) and finally $x_1$ from (1″). There is a solution for each d and all solutions can be obtained
in this way. If $a''_{33} = 0$ but $a''_{34} \neq 0$, then the solutions are obtained in the same manner by
interchanging $x_3$ and $x_4$.


Example 4: From the given system $S_1$ the new system $S_2$ is obtained by interchanging the first
two rows. Now in $S_2$ 0/1 = 0 times equation (1) is subtracted from (2) and 3/1 = 3 times (1)
from (3). This yields the system $S_3$. $S_4$ is obtained from $S_3$ by interchanging $x_2$ and $x_4$ and renaming
them $x'_4$ and $x'_2$, respectively. Now in $S_4$ −4/4 times (2′) is subtracted from (3′), that is, the
two equations are added, yielding $S_5$. If in $S_5$ one puts $\bar{x}'_4 = \bar{x}_2 = d$, then $\bar{x}_3 = 2$, $\bar{x}'_2 = \bar{x}_4 = 1$,
and $\bar{x}_1 = -7 + 2d$. Thus, the set S of solutions of $S_1$ is $S = \{(-7 + 2d,\ d,\ 2,\ 1);\ d \text{ real}\}$.

(The displays of the systems $S_1$ to $S_5$ are not reproduced here.)

Two systems of four equations in three variables with the same coefficients but different right-hand
sides are considered next (their displays are not reproduced here). In both systems the left-hand
sides are $x_1 + x_2 + x_3$, $2x_1 + x_2 - x_3$, $4x_1 - x_2 + 2x_3$, and $-x_1 + x_2 + 2x_3$. Application of the
algorithm leads to two reduced systems.


While the left-hand system has the unique solution $\bar{x}_1 = 3$, $\bar{x}_2 = 2$, and $\bar{x}_3 = 1$, the right-hand
system has no solutions at all because of the contradiction in the last equation.


Geometrical interpretation. It is well known from analytic geometry that a linear equation in two 
unknowns defines a straight line in the plane. For if one associates with each solution $(\bar{x}, \bar{y})$ the
point with the coordinates $(\bar{x}, \bar{y})$, then the image of the set of solutions is a line. Hence two equations
in two unknowns determine a pair of lines in the plane, and solutions of the system, if they exist, 
must be the points of intersection of the two lines. The following cases can occur (see Fig. 4.2-2, 4.2-3): 


(i) There are infinitely many solutions and infinitely many points of intersection of the two lines. 
In this case the two lines coincide; one of the unknowns can be given an arbitrary value, and the 
other is determined. 

(ii) The system of equations has a unique solution. In this case the lines intersect in a single point. 

(iii) The system has no solution. The lines are distinct and parallel. 

The situation is similar with equations in three unknowns x, y, z. Each single equation defines
a plane in three-dimensional space. Again the solutions of the system are the points contained in
all the planes.
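
For two equations in two unknowns the three geometric cases can be told apart from the coefficients alone. The sketch below is an illustration under the assumption that each equation really describes a line (its two coefficients are not both zero); the function name `classify` and the sample systems are hypothetical.

```python
def classify(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Assumes that (a1, b1) and (a2, b2) are both different from (0, 0).
    """
    if a1 * b2 - a2 * b1 != 0:
        return "unique"               # the lines intersect in a single point
    # The left-hand sides are proportional: the lines are parallel or coincide.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinitely many"      # the lines coincide
    return "none"                     # distinct parallel lines

cases = [classify(2, 1, 1, 1, -3, 4),   # intersecting lines
         classify(1, 1, 1, 2, 2, 2),    # the same line written twice
         classify(1, 1, 1, 2, 2, 3)]    # parallel lines
print(cases)  # ['unique', 'infinitely many', 'none']
```

The expression `a1 * b2 - a2 * b1` is the 2 × 2 determinant of the coefficients, which is treated in the next section.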


17.2. Determinants 


In methods of solving systems of n equations in n unknowns, other than Gauss’s, certain functions
of the coefficients, the determinants, play a decisive role. These functions are also important
in other branches of mathematics, such as the differential and integral calculus of several variables. 




A determinant is a function of $n^2$ variables, usually written as a square scheme of the form

$$
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
$$

The numbers $a_{ij}$ are called the elements or entries of the determinant. The ith row of the
determinant is the n-tuple $(a_{i1}, \ldots, a_{in})$ of entries with i as their first subscript. The jth column is
the n-tuple $(a_{1j}, \ldots, a_{nj})$ of entries with j as their second subscript. The value of the determinant
is $\sum (-1)^k a_{1s_1} a_{2s_2} \cdots a_{ns_n}$, where the indices $s_1, \ldots, s_n$ form a permutation of the numbers 1, ..., n
and are therefore distinct. The sum is taken over all possible permutations of 1, ..., n, that is, the
summands are all possible products containing exactly one entry from each row and from each
column. Since there are n! permutations, the sum has n! terms. The sign $(-1)^k$ is determined by the
number k of inversions in the permutation; for example, in the product $a_{13}a_{21}a_{34}a_{42}$ the permutation
$\begin{pmatrix} 1 & 2 & 3 & 4\\ 3 & 1 & 4 & 2 \end{pmatrix}$ has k = 3 inversions (namely 3 before 1, 3 before 2 and 4 before 2), so that
the sign is −1.

Example 1: Computation of 2 × 2 determinants. The permutations $\begin{pmatrix} 1 & 2\\ 1 & 2 \end{pmatrix}$ and $\begin{pmatrix} 1 & 2\\ 2 & 1 \end{pmatrix}$ have
0 and 1 inversions, respectively, so that

$$
\begin{vmatrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.
$$

Example 2: Computation of 3 × 3 determinants. Counting the inversions of the six permutations
of 1, 2, 3 gives

$$
D = \begin{vmatrix} a_{11} & a_{12} & a_{13}\\ a_{21} & a_{22} & a_{23}\\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}.
$$
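
The definition by permutations and inversions translates directly into a short program. The sketch below is meant only as an illustration of the definition (its cost grows like n!, so it is not a practical method for large n); the names `inversions` and `det_leibniz` are chosen here and do not come from the text.

```python
from itertools import permutations

def inversions(perm):
    """Number of pairs in perm in which a larger number precedes a smaller one."""
    return sum(1
               for i in range(len(perm))
               for j in range(i + 1, len(perm))
               if perm[i] > perm[j])

def det_leibniz(a):
    """Value of the determinant a (list of rows) by the permutation formula."""
    n = len(a)
    total = 0
    for s in permutations(range(n)):      # s[row] is the column chosen in that row
        term = (-1) ** inversions(s)      # sign (-1)^k
        for row in range(n):
            term *= a[row][s[row]]        # one entry from each row and column
        total += term
    return total

k = inversions((3, 1, 4, 2))              # the permutation discussed in the text
d2 = det_leibniz([[1, 2], [3, 4]])
d3 = det_leibniz([[3, -3, 1], [0, 4, -1], [2, -2, 1]])
print(k, d2, d3)  # 3 -2 4
```

The 3 × 3 scheme used here is the coefficient scheme of Example 3 of Section 17.1.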


These rules for calculating 2 × 2 and 3 × 3 determinants can be easily remembered in the following
way. Write out the determinant and supplement the 3 × 3 determinant with the first two columns
(Fig.):

$$
\begin{matrix} a_{11} & a_{12}\\ a_{21} & a_{22} \end{matrix}
\qquad\qquad
\begin{matrix}
a_{11} & a_{12} & a_{13} & a_{11} & a_{12}\\
a_{21} & a_{22} & a_{23} & a_{21} & a_{22}\\
a_{31} & a_{32} & a_{33} & a_{31} & a_{32}
\end{matrix}
$$

The numbers connected by red lines (running from upper left to lower right) are multiplied and the
products are added; from this the sum of the products of numbers connected by blue lines (running
from upper right to lower left) is subtracted. The result is the value of the determinant. For 3 × 3
determinants this is called Sarrus’ rule.


Properties of determinants. The following statements can be derived from the definition of the 
determinant: 


1. The determinant is a linear function of the entries of each row. 

2. If two rows are interchanged, the determinant changes sign. 

3. The value of the determinant is zero if one of the rows is a linear combination of the others, 
in particular, if the entries in one row are all zero or if two rows are identical. 

4. The value of the determinant does not change if to a given row a linear combination of the 
other rows is added. 

5. The value of the determinant does not change if rows are made into columns and vice versa. 


The first statement means: (i) A factor common to all the entries of a row can be taken before 
the whole determinant. (ii) If the entries of a row can all be written as a sum of m elements, then the 
whole determinant can be written as a sum of m determinants in which the other rows remain 
unchanged. For example: 


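$$
\begin{vmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} + b_{21} & a_{22} + b_{22} & a_{23} + b_{23}\\
a_{31} & a_{32} & a_{33}
\end{vmatrix}
=
\begin{vmatrix}
a_{11} & a_{12} & a_{13}\\
a_{21} & a_{22} & a_{23}\\
a_{31} & a_{32} & a_{33}
\end{vmatrix}
+
\begin{vmatrix}
a_{11} & a_{12} & a_{13}\\
b_{21} & b_{22} & b_{23}\\
a_{31} & a_{32} & a_{33}
\end{vmatrix}
$$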


The statements 3. and 4. are immediate consequences of 1. and 2., and 5. means that all theorems 
on rows of determinants are also true for their columns. 
The rules 3. and 4. are frequently used in the practical computation of determinants. 


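Example 3:

$$
\begin{vmatrix} 5 & 3 & -1\\ 0 & 0 & 0\\ 7 & 19 & 8 \end{vmatrix} = 0.
$$

Two further 3 × 3 determinants of this kind, whose entries are not reproduced here, likewise have
the value 0.

$$
\begin{aligned}
\begin{vmatrix} 1 & 2 & 3 & 5\\ 2 & -2 & 8 & 4\\ 1 & 1 & -1 & 3\\ 7 & 0 & 2 & 1 \end{vmatrix}
&= 2 \begin{vmatrix} 0+1 & 1+1 & 4-1 & 2+3\\ 1 & -1 & 4 & 2\\ 1 & 1 & -1 & 3\\ 7 & 0 & 2 & 1 \end{vmatrix}
= 2 \begin{vmatrix} 0 & 1 & 4 & 2\\ 1 & -1 & 4 & 2\\ 1 & 1 & -1 & 3\\ 7 & 0 & 2 & 1 \end{vmatrix}
+ 2 \begin{vmatrix} 1 & 1 & -1 & 3\\ 1 & -1 & 4 & 2\\ 1 & 1 & -1 & 3\\ 7 & 0 & 2 & 1 \end{vmatrix}\\
&= 2 \begin{vmatrix} 0 & 1 & 4 & 2\\ (1-0) & (-1-1) & (4-4) & (2-2)\\ 1 & 1 & -1 & 3\\ 7 & 0 & 2 & 1 \end{vmatrix}
= 2 \begin{vmatrix} 0 & 1 & 4 & 2\\ 1 & -2 & 0 & 0\\ 1 & 1 & -1 & 3\\ 7 & 0 & 2 & 1 \end{vmatrix}
\end{aligned}
$$

(see the examples that follow).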

Minors. If any m rows and any m columns of an n x n determinant are deleted, the resulting 
(n — m) X (n — m) determinant is called a minor of the original one. 


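Example 4: Deleting the 2nd and 5th rows and 2nd and 4th columns of the larger determinant yields
the smaller one:

$$
\begin{vmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & a_{15}\\
a_{21} & a_{22} & a_{23} & a_{24} & a_{25}\\
a_{31} & a_{32} & a_{33} & a_{34} & a_{35}\\
a_{41} & a_{42} & a_{43} & a_{44} & a_{45}\\
a_{51} & a_{52} & a_{53} & a_{54} & a_{55}
\end{vmatrix}
\quad\longrightarrow\quad
\begin{vmatrix}
a_{11} & a_{13} & a_{15}\\
a_{31} & a_{33} & a_{35}\\
a_{41} & a_{43} & a_{45}
\end{vmatrix}
$$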


If only the ith row and the jth column are deleted, one obtains an (n − 1) × (n − 1) minor.
If this determinant is multiplied by the sign $(-1)^{i+j}$, then its value is called the cofactor or algebraic
complement of the element $a_{ij}$ and is written $A_{ij}$.


Computation of determinants. The cofactors play an important part in the computation of deter- 
minants as the following theorem shows: 


Theorem (development of a determinant by rows): Every determinant D can be computed from
the elements of any fixed row and their cofactors: $D = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in}$.


By 5. the analogous statement for columns is also true. 


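Example 5: Development by the second row.

$$
\begin{aligned}
\begin{vmatrix} 0 & 1 & 4 & 2\\ 1 & -2 & 0 & 0\\ 1 & 1 & -1 & 3\\ 7 & 0 & 2 & 1 \end{vmatrix}
&= 1 \cdot (-1)^{2+1} \begin{vmatrix} 1 & 4 & 2\\ 1 & -1 & 3\\ 0 & 2 & 1 \end{vmatrix}
+ (-2) \cdot (-1)^{2+2} \begin{vmatrix} 0 & 4 & 2\\ 1 & -1 & 3\\ 7 & 2 & 1 \end{vmatrix}\\
&\quad + 0 \cdot (-1)^{2+3} \begin{vmatrix} 0 & 1 & 2\\ 1 & 1 & 3\\ 7 & 0 & 1 \end{vmatrix}
+ 0 \cdot (-1)^{2+4} \begin{vmatrix} 0 & 1 & 4\\ 1 & 1 & -1\\ 7 & 0 & 2 \end{vmatrix} = -189.
\end{aligned}
$$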


Thus, the development theorem reduces the calculation of n × n determinants to the calculation
of (n − 1) × (n − 1) determinants. Determinants with 2 or 3 rows can be computed directly by
this method.

The example shows that it is advantageous to develop by a row with many zeros. The rules 1. 
and 5. can frequently be used to obtain such a row. 
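
The development theorem leads to a simple recursive procedure. The following sketch always develops by the row containing the most zeros, as recommended above; the names `minor` and `det` are illustrative, and the two 4 × 4 schemes are the first and the last determinant of the chain of equalities in Example 3.

```python
def minor(a, i, j):
    """Scheme obtained from a (list of rows) by deleting row i and column j."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by development along the row with the most zeros."""
    if len(a) == 1:
        return a[0][0]
    i = max(range(len(a)), key=lambda r: a[r].count(0))
    # Zero entries contribute nothing, so their cofactors are never computed.
    return sum((-1) ** (i + j) * a[i][j] * det(minor(a, i, j))
               for j in range(len(a)) if a[i][j] != 0)

reduced = det([[0, 1, 4, 2], [1, -2, 0, 0], [1, 1, -1, 3], [7, 0, 2, 1]])
original = det([[1, 2, 3, 5], [2, -2, 8, 4], [1, 1, -1, 3], [7, 0, 2, 1]])
print(reduced, original)  # -189 -378
```

The indices in the program start at 0 instead of 1; this does not affect the sign, since i + j changes by an even number.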


Solution of systems of linear equations by determinants. If the coefficients $a_{ij}$ of a system of n
linear equations in n unknowns are written (in the order in which they appear in the system) as the
entries of a determinant D, then the determinant $D_j$ is defined as the one obtained from D by deleting
the jth column and replacing it by the column of constants on the right-hand side of the system.
The determinants D and $D_j$ can be used to solve the equations provided that the value of D is not
zero.


Cramer's rule. If the determinant D of the coefficients of a system of n linear equations in n
unknowns is not zero, then $x_j = D_j/D$ (j = 1, 2, ..., n) is the only solution of that system.




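Example 6: (Example 3 of solution by elimination):

$$
\begin{aligned}
3x_1 - 3x_2 + x_3 &= 0\\
4x_2 - x_3 &= 5\\
2x_1 - 2x_2 + x_3 &= 1
\end{aligned}
$$

$$
D = \begin{vmatrix} 3 & -3 & 1\\ 0 & 4 & -1\\ 2 & -2 & 1 \end{vmatrix} = 4, \qquad
D_1 = \begin{vmatrix} 0 & -3 & 1\\ 5 & 4 & -1\\ 1 & -2 & 1 \end{vmatrix} = 4,
$$

$$
D_2 = \begin{vmatrix} 3 & 0 & 1\\ 0 & 5 & -1\\ 2 & 1 & 1 \end{vmatrix} = 8, \qquad
D_3 = \begin{vmatrix} 3 & -3 & 0\\ 0 & 4 & 5\\ 2 & -2 & 1 \end{vmatrix} = 12;
$$

$\bar{x}_1 = D_1/D = 1$, $\bar{x}_2 = D_2/D = 2$, $\bar{x}_3 = D_3/D = 3$.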
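
Cramer's rule for three equations in three unknowns can be sketched as follows. The 3 × 3 determinants are evaluated by Sarrus' rule; the function names `det3` and `cramer3` are illustrative and not part of the text.

```python
from fractions import Fraction

def det3(m):
    """Value of a 3 x 3 determinant (list of three rows) by Sarrus' rule."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * e * i + b * f * g + c * d * h - c * e * g - a * f * h - b * d * i

def cramer3(A, b):
    """Solve A x = b for integer A, b by Cramer's rule; requires D != 0."""
    D = det3(A)
    if D == 0:
        raise ValueError("Cramer's rule requires a non-zero determinant D")
    solution = []
    for j in range(3):
        # D_j: replace the jth column of D by the column of constants.
        Dj = det3([[b[r] if c == j else A[r][c] for c in range(3)]
                   for r in range(3)])
        solution.append(Fraction(Dj, D))
    return solution

x = cramer3([[3, -3, 1], [0, 4, -1], [2, -2, 1]], [0, 5, 1])
print(x)
```

For the system of Example 6 the three quotients are 4/4, 8/4 and 12/4, that is, 1, 2 and 3.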


In a homogeneous system all the determinants $D_j$ are necessarily zero. Therefore such a system has
non-trivial solutions only if D is 0. The converse is also true.


A homogeneous system of n linear equations in n unknowns has non-trivial solutions if and only 
if the determinant of its coefficients is zero. 


17.3. Vector spaces 


Introduction. The exposition in the section on ‘Homogeneous equations’ shows that the solutions 
of a homogeneous system form an example of a set of objects that can be added together or multiplied 
by numbers without leaving the set. A generalization and abstract definition of these properties 
leads to the concept of a vector space, which is central for the whole of linear algebra. 


Linear algebra can be regarded as the theory of vector spaces. The elements of a vector space are 
called vectors, the numbers by which they can be multiplied are called scalars. The set of scalars 
can be the rational, real or complex numbers. Other more general structures can also be used (fields; 
see Chapter 16.). In what follows the set of scalars will always be taken to be the real numbers. 
The characteristic rules in a vector space V are the following (x, y, z are elements of the vector 
space V, that is, vectors, and a and b are scalars). 

1. Associative law of addition: (x + y) + z = x + (y + z).

2. Commutative law of addition: x + y = y + x.

3. Existence of zero: There exists an element o in V such that x + o = x for all x in V.

4. Existence of inverses: To every x in V there exists an element −x in V such that x + (−x) = o.

5. Associative law of multiplication: a(bx) = (ab) x.

6. Unital law: 1x = x.

7. First distributive law: a(x + y) = ax + ay.

8. Second distributive law: (a + b) x = ax + bx.


Every set in which an addition of the elements of the set and a multiplication by scalars are defined
so that the results always lie in the set and the laws 1. to 8. hold is a vector space.


Examples of vector spaces. 1. The set of all polynomial functions (integral rational functions) forms
a vector space. If $f(x) = a_nx^n + \cdots + a_1x + a_0$ and $g(x) = b_mx^m + \cdots + b_1x + b_0$ are polynomial
functions and n > m, say, then their sum is $f(x) + g(x) = a_nx^n + \cdots + a_{m+1}x^{m+1} + (a_m + b_m)x^m$
$+ \cdots + (a_1 + b_1)x + (a_0 + b_0)$. The product $a \cdot f(x)$ of a real number and a polynomial function
is $a \cdot f(x) = (aa_n)x^n + \cdots + (aa_1)x + (aa_0)$. It is now easy to verify the rules 1. to 8. The zero
element o is the polynomial function f(x) = 0.

2. The set of all differentiable functions and also the set of integrable functions form vector spaces. 
The zero element is again the function f(x) = 0. The functions are added by adding their values 
and multiplied by a number by multiplying their values by that number. 

3. The sets of real numbers and complex numbers form vector spaces with the usual multiplication 
and addition. 

4. The set of n-tuples $(a_1, a_2, \ldots, a_n)$ with real entries $a_i$ forms a vector space $R^n$ for every natural
number n. For n = 2 they are also called ordered pairs, for n = 3 triples and for n = 4 quadruples.
Addition is defined by $(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$ and
multiplication by $a(a_1, a_2, \ldots, a_n) = (aa_1, aa_2, \ldots, aa_n)$.
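
The componentwise operations on n-tuples and the laws 1. to 8. can be tried out on sample data. The sketch below checks the laws only for a few integer triples and scalars chosen for illustration, so it is a plausibility test and not a proof; the names `add`, `scale` and `laws_hold` are hypothetical.

```python
from itertools import product

def add(x, y):
    """Componentwise sum of two n-tuples."""
    return tuple(a + b for a, b in zip(x, y))

def scale(c, x):
    """Componentwise multiple of the n-tuple x by the scalar c."""
    return tuple(c * a for a in x)

def laws_hold(vectors, scalars):
    """Check the laws 1. to 8. on the given sample vectors and scalars."""
    zero = tuple(0 for _ in vectors[0])
    for x, y, z in product(vectors, repeat=3):
        if add(add(x, y), z) != add(x, add(y, z)):                    # law 1
            return False
        if add(x, y) != add(y, x):                                    # law 2
            return False
    for x in vectors:
        if add(x, zero) != x:                                         # law 3
            return False
        if add(x, scale(-1, x)) != zero:                              # law 4
            return False
        if scale(1, x) != x:                                          # law 6
            return False
    for a, b in product(scalars, repeat=2):
        for x, y in product(vectors, repeat=2):
            if scale(a, scale(b, x)) != scale(a * b, x):              # law 5
                return False
            if scale(a, add(x, y)) != add(scale(a, x), scale(a, y)):  # law 7
                return False
            if scale(a + b, x) != add(scale(a, x), scale(b, x)):      # law 8
                return False
    return True

ok = laws_hold([(1, 2, 3), (-4, 0, 5), (0, 0, 0)], [-2, 0, 3])
print(ok)  # True
```

In the check of law 4 the inverse −x is taken to be the multiple (−1)x.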


Vector algebra 


The vector space V3. In this paragraph the properties of a particularly important vector space
are investigated. It plays a central part in physics and technology; and it clarifies the importance
of vector spaces and thus of linear algebra in practical applications. The name vector was first used
for elements of this particular space and later generalized to the present terminology.

The vectors will first be described geometrically, starting with the three-dimensional space, in
which length, breadth, and height are defined. A shift of the space consists in associating with each
point P a point Q such that the (oriented) line segments joining points with their images are
parallel and all have the same length. Such a shift is called a translation or vector.

From the definition it follows that a vector is completely determined if its effect on a single
point P is known, that is, if the point Q associated with P is known. Therefore the vector can be
characterized by drawing the line segment PQ and putting an arrow-head at Q to indicate that P
goes to Q. Such an oriented segment is called a representative of the vector. Here P is called the
initial point or point of application of the vector and Q is called its end-point (Fig.).

For every point P there is a representative of any given vector with P as its initial point, and every 
point Q also occurs as end-point of a suitable representative. Different representatives of the same 
vector are parallel and have the same length. Hence it makes sense to define the length or modulus
or norm of a vector a as the distance between the points P and Q of any representative of a. The
length of a is denoted by |a| or $|\overrightarrow{PQ}|$. It is always non-negative. Vectors of length 1 are called unit
vectors.


[Figures: 17.3-1 Representative of a vector; 17.3-2 Equality of vectors, a = b]


In what follows most of the other concepts of vector algebra will be introduced by representatives. 
But it is incorrect to identify a vector with a single representative. If this is done, one usually attempts 
to avoid the resulting difficulties by some sentence such as ‘vectors can be shifted in any direction’. 
But this sentence really means that by a parallel displacement of a representative of a vector one 
obtains another representative of the same vector. This leads to the following proposition.


To obtain a vector space, addition of vectors and multiplication by scalars must still be defined.


Vector addition. The sum of two vectors a and b is the shift a + b obtained by performing the
shifts a and b in succession. Using representatives the sum can be defined in the following way:
If $\overrightarrow{PQ}$ is a representative of a in P and $\overrightarrow{QR}$ is a representative of b in Q, then $\overrightarrow{PR}$ is a representative
of a + b (Fig.). It is easy to see that this definition does not depend on the choice of representatives.

If one considers the representatives $\overrightarrow{PQ}$, $\overrightarrow{PQ'}$, $\overrightarrow{Q'R}$ and $\overrightarrow{QR}$ of a and b in P, a in Q′, and b in Q, one
obtains a parallelogram with its diagonal $\overrightarrow{PR}$ representing both a + b and b + a (Fig.).
Consequently a + b = b + a and the commutative law of addition holds. It is just as easy to verify the
associative law of addition (a + b) + c = a + (b + c) (Fig.).


[Figures: 17.3-3 Addition of vectors; 17.3-4 Commutativity of vector addition; 17.3-5 Associativity of vector addition, a + (b + c) = (a + b) + c]




The validity of these laws implies the following rules for adding more than two vectors: 


Several vectors can be added by choosing any sequence of representatives, one of each vector, 
such that the end-point of each representative is the initial point of the next. The sum or resultant is 
the vector represented by the segment going from the initial point of the first representative to the 
end-point of the last. 


The null (or zero) vector. The translation that shifts a point P to P itself, and thus leaves all points 
of space fixed is the null vector, written o. It cannot be given a particular direction, its length is 0. 
It has the characteristic property that a + o = a for all vectors a. 


Subtraction. To define subtraction of vectors one uses the existence of a unique inverse to every
vector. If a + b = o, then b = −a and a representative of b is obtained by interchanging the
initial and end-points of a representative of a (Fig.). Thus, −a has the same length as a, but the
opposite direction. In particular o = −o.

It now makes sense to say that the difference a − b of two vectors is the sum of a and −b,
a − b = a + (−b).


Multiplication of vectors by scalars. If the points P, Q and Q′ (P ≠ Q, P ≠ Q′) lie on a line, then
$\overrightarrow{PQ}$ and $\overrightarrow{PQ'}$ are representatives of vectors a and a′ in the same direction or in opposite directions.
However, in general, the lengths of a and a′ will be different. But there is a real number d > 0 such
that |a′| = d · |a|, namely d = |a′|/|a| (Fig.). If a and a′ have the same direction, one defines
a′ = da, with d = |a′|/|a|; if they have opposite directions, one defines a′ = −da. This leads to
the following definition, which also covers the cases a = o or d = 0. The product d · a of a vector a
by a real number d is the vector of length |d| · |a|, in the same direction as a if d > 0, and in the
opposite direction if d < 0. For d = 0 one defines 0 · a = o.

[Figures: 17.3-6 Representatives of opposite vectors; 17.3-7 Multiplication of vectors by scalars]

In particular, it follows that 1 · a = a, (−1) · a = −a, d · o = o for all d, and n · a = a + a + ··· + a
(n times) for every natural number n. If a ≠ o, then the vector a/|a| has the length 1, hence is a unit
vector, which in what follows will be denoted by a°. Thus, a = |a| · a°. The associative law of
multiplication and the two distributive laws are easily verified. In this way a vector space has been
constructed consisting of the translations of three-dimensional space. It will be denoted by V3.


Components and Coordinates in V3. 


To make geometry amenable to computational methods one introduces a system of coordinates, for
example, an orthogonal (Cartesian) system with x-, y-, and z-axes. The perpendicular projections
onto the axes of a representative $\overrightarrow{PQ}$ of a vector a are again representatives of certain vectors. These
vectors, which are independent of the choice of representative, are called the components $\mathbf{a}_x$, $\mathbf{a}_y$
and $\mathbf{a}_z$ of a with respect to the given system of coordinates (Fig.), and $\mathbf{a} = \mathbf{a}_x + \mathbf{a}_y + \mathbf{a}_z$.

[Figure: 17.3-8 The components of a vector]

If i, j, and k are the basis vectors of the coordinate system, that is, the unit vectors in the positive
directions of the x-, y-, and z-axes, respectively, then $\mathbf{a}_x = a_x \cdot \mathbf{i}$, $\mathbf{a}_y = a_y \cdot \mathbf{j}$, $\mathbf{a}_z = a_z \cdot \mathbf{k}$. The real
numbers $a_x$, $a_y$, and $a_z$ are called the coordinates of a with respect to the given coordinate system.




If $\overrightarrow{PQ}$ represents a and P and Q have the coordinates $(x_0, y_0, z_0)$ and $(x_1, y_1, z_1)$, respectively,
then the components of a are $(x_1 - x_0)\,i$, $(y_1 - y_0)\,j$, and $(z_1 - z_0)\,k$, respectively. Thus, the
coordinates of a are the differences between the coordinates of the end-point and initial point of
an arbitrary representative of a.

Since every vector determines a triple of coordinates and, on the other hand, every triple $(a_1, a_2, a_3)$
determines a unique vector a with $a_x = a_1$, $a_y = a_2$, and $a_z = a_3$, the vector space V3 can be
identified with the vector space of triples of real numbers $R^3$. But for this to make sense it must be
shown that addition and scalar multiplication in V3 correspond to addition and scalar multiplication
in $R^3$, in other words, that the coordinates of a + b and d · a are $(a_x + b_x, a_y + b_y, a_z + b_z)$
and $(da_x, da_y, da_z)$, respectively. And indeed, from $a = a_x i + a_y j + a_z k$ and $b = b_x i + b_y j + b_z k$
it follows that $a + b = (a_x + b_x)\,i + (a_y + b_y)\,j + (a_z + b_z)\,k$ and
$d \cdot a = (d \cdot a_x)\,i + (d \cdot a_y)\,j + (d \cdot a_z)\,k$.

In the first case the commutative and associative laws of addition and the second distributive law 
are required; in the second the first distributive law and the associative law of multiplication. One 
says that addition and multiplication are performed componentwise or coordinatewise. Since −a = (−1) · a
has the coordinates $-a_x$, $-a_y$, and $-a_z$, subtraction is also performed componentwise.


Vectors are added (or subtracted) by adding (or subtracting) their coordinates; a vector is multiplied 
by a scalar by multiplying its coordinates by that scalar. 


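Example 1: For $\mathbf{a} = 2\mathbf{i} + (1/2)\mathbf{j} - \mathbf{k}$, $\mathbf{b} = -3\mathbf{i} + 2\mathbf{j} + 5\mathbf{k}$ and $d = 2$ one obtains $\mathbf{a} + \mathbf{b} = -\mathbf{i} + (5/2)\mathbf{j} + 4\mathbf{k}$; $-\mathbf{b} = 3\mathbf{i} - 2\mathbf{j} - 5\mathbf{k}$; $\mathbf{a} - \mathbf{b} = 5\mathbf{i} - (3/2)\mathbf{j} - 6\mathbf{k}$; and $d\mathbf{a} = 4\mathbf{i} + \mathbf{j} - 2\mathbf{k}$.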
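The coordinatewise rules can be checked mechanically. The following is a minimal sketch in Python, assuming vectors are stored as tuples of coordinates; the helper names `add`, `scale` and `sub` are illustrative and not part of the text. Exact fractions are used so that the coordinate 1/2 is not rounded.

```python
from fractions import Fraction

def add(u, v):
    """Coordinatewise sum of two vectors given as tuples."""
    return tuple(x + y for x, y in zip(u, v))

def scale(d, u):
    """Multiply every coordinate of u by the scalar d."""
    return tuple(d * x for x in u)

def sub(u, v):
    """Difference u - v, computed as u + (-1)v."""
    return add(u, scale(-1, v))

# The vectors and the scalar of Example 1.
a = (Fraction(2), Fraction(1, 2), Fraction(-1))
b = (Fraction(-3), Fraction(2), Fraction(5))
d = 2

print(add(a, b))    # coordinates of a + b
print(sub(a, b))    # coordinates of a - b
print(scale(d, a))  # coordinates of d a
```

The printed triples can be compared with the coordinates stated in Example 1.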


Thus, there is a bijective (one-to-one and onto) map of $V_3$ onto $R^3$, which preserves addition and scalar multiplication. The vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are taken to the triples (1, 0, 0), (0, 1, 0) and (0, 0, 1), and the arbitrary vector $\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}$ goes to $(a_x, a_y, a_z)$. Although the space $V_3$ is more intuitively defined, calculations are more convenient in $R^3$, because they give a much clearer picture of the operations in $V_3$. $V_3$ and $R^3$ have the same structure as vector spaces, but different sets of objects. In what follows they will not be regarded as the same space (see Linear maps).


The inner and vector product in $V_3$. In $V_3$ there are two further natural operations. One of them associates a scalar with any pair of vectors and is called the inner or dot product, because the product of $\mathbf{a}$ and $\mathbf{b}$ is written as $\mathbf{a} \cdot \mathbf{b}$. The other produces a vector and is called the vector or cross product, because this product of $\mathbf{a}$ and $\mathbf{b}$ is written as $\mathbf{a} \times \mathbf{b}$. The inner product can be generalized to other vector spaces, but this is not immediately possible for the vector product.

Both products have physical applications. For instance, the work done by a force $\mathbf{F}$ moving along a straight path $\mathbf{s}$ is calculated by the dot product $\mathbf{F} \cdot \mathbf{s}$, and the velocity $\mathbf{v}$ of a point P on a body rotating about an axis is calculated as the vector product of the radius from the axis to P and the angular velocity.


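The inner product. The representatives with the same initial point P of two non-zero vectors $\mathbf{a}$ and $\mathbf{b}$ enclose an angle $\alpha$ between 0° and 180° and an angle $\beta$ between 180° and 360° such that $\alpha + \beta = 360^\circ$. The angle $\angle(\mathbf{a}, \mathbf{b})$ between the vectors $\mathbf{a}$ and $\mathbf{b}$ is defined as the smaller angle $\alpha$.

The inner or dot product $\mathbf{a} \cdot \mathbf{b}$ (read 'a dot b') of two non-zero vectors $\mathbf{a}$ and $\mathbf{b}$ is defined as the real number $|\mathbf{a}|\,|\mathbf{b}| \cos \angle(\mathbf{a}, \mathbf{b})$.

The commutative law $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}$ holds; one says the inner product is symmetric. The associative law does not hold, in general, for the product of three vectors, because $(\mathbf{a} \cdot \mathbf{b})\,\mathbf{c}$ is a multiple of $\mathbf{c}$, whereas $\mathbf{a}\,(\mathbf{b} \cdot \mathbf{c})$ is, if anything, a multiple of $\mathbf{a}$. On the other hand, there is a composition law for the multiplication by scalars and the dot product. Using the commutative law as well one obtains the following equation for a scalar $a$:

$$(a\mathbf{b} \cdot \mathbf{c}) = a(\mathbf{b} \cdot \mathbf{c}) = a(\mathbf{c} \cdot \mathbf{b}) = (a\mathbf{c} \cdot \mathbf{b}) = (\mathbf{b} \cdot a\mathbf{c}).$$

An operation on vectors satisfying this law and the distributive laws is called bilinear. The distributive law always holds: $\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$.

Two vectors $\mathbf{a}$ and $\mathbf{b}$ are called orthogonal if $\mathbf{a} \cdot \mathbf{b} = 0$. If both $\mathbf{a}$ and $\mathbf{b}$ are non-zero, this means that $\cos \angle(\mathbf{a}, \mathbf{b}) = 0$ or $\alpha = 90^\circ$, in other words, the representatives of $\mathbf{a}$ and $\mathbf{b}$ are perpendicular to one another.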


366 17. Linear algebra 


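The dot product cannot be inverted, for one cannot define a unique vector $\mathbf{a}$ in a meaningful way such that $\mathbf{a} \cdot \mathbf{b} = c$ (or such that '$\mathbf{a}$ is the quotient of the real number $c$ and the vector $\mathbf{b}$'). For given $\mathbf{b}$ and $c$ there are always infinitely many vectors $\mathbf{a}$ satisfying the equation $\mathbf{a} \cdot \mathbf{b} = c$; for example, if $c = 0$, then any multiple of a vector $\mathbf{a} \neq \mathbf{o}$ orthogonal to $\mathbf{b}$ will do. Division by vectors is not permissible.

In the dot product $\mathbf{a} \cdot \mathbf{b}$, the vector $\mathbf{a}$ can be replaced by the vector $\mathbf{a}_b$ of length $|\mathbf{a}|\,|\cos \angle(\mathbf{a}, \mathbf{b})|$ and in the same or the opposite direction to $\mathbf{b}$ according as $\angle(\mathbf{a}, \mathbf{b})$ is less than or greater than 90°. A representative of $\mathbf{a}_b$ can be obtained by taking the perpendicular projection of a representative of $\mathbf{a}$ onto the line through a representative of $\mathbf{b}$ with the same initial point (Fig.). The same can be done for $\mathbf{b}$. Thus, $\mathbf{a} \cdot \mathbf{b} = \mathbf{a}_b \cdot \mathbf{b} = \mathbf{b}_a \cdot \mathbf{a}$. The product $\mathbf{a}_b \cdot \mathbf{b}$ has the desirable property that $\mathbf{a}_b \cdot \mathbf{b} = \pm|\mathbf{a}_b|\,|\mathbf{b}|$, whereas $\mathbf{a} \cdot \mathbf{b} = |\mathbf{a}|\,|\mathbf{b}| \cos \angle(\mathbf{a}, \mathbf{b})$.

17.3-9 Projection of a vector onto another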


The dot product in coordinates. To calculate the dot product in coordinates all one needs are the values of the dot products $\mathbf{i} \cdot \mathbf{i}$, $\mathbf{i} \cdot \mathbf{j}$, ..., $\mathbf{k} \cdot \mathbf{k}$ (and the distributive and associative laws). From the definition these are easily seen to be 1 if the two factors are equal, and 0 otherwise. One can thus compute the dot product of $\mathbf{a}$ and $\mathbf{b}$ from their coordinates without knowing the angle between them, and indeed one can use this to obtain the angle between two vectors from their dot product.
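Expanding $\mathbf{a} \cdot \mathbf{b}$ with the distributive law and the values of the products of the basis vectors gives $\mathbf{a} \cdot \mathbf{b} = a_xb_x + a_yb_y + a_zb_z$, and the defining equation then yields $\cos \angle(\mathbf{a}, \mathbf{b}) = \mathbf{a} \cdot \mathbf{b} / (|\mathbf{a}|\,|\mathbf{b}|)$. A minimal sketch in Python follows; the function names and the sample vectors are illustrative assumptions, not taken from the text.

```python
import math

def dot(u, v):
    """Dot product from coordinates: sum of products of matching coordinates."""
    return sum(x * y for x, y in zip(u, v))

def norm(u):
    """Length of u, obtained from the dot product of u with itself."""
    return math.sqrt(dot(u, u))

def angle_deg(u, v):
    """Angle between two non-zero vectors, in degrees between 0 and 180."""
    c = dot(u, v) / (norm(u) * norm(v))
    c = max(-1.0, min(1.0, c))  # guard against rounding slightly outside [-1, 1]
    return math.degrees(math.acos(c))

# Illustrative vectors (assumed for this sketch).
a = (1.0, 0.0, 0.0)
b = (1.0, 1.0, 0.0)
print(dot(a, b), angle_deg(a, b))
```

The clamp before `math.acos` is a design choice for floating-point safety; with exact arithmetic it would be unnecessary.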


y= MA) and b= 14-2) —28 on obsine coe (6,8) = 19 and 


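The vector product. The vector product $\mathbf{a} \times \mathbf{b}$ (read 'a cross b') of two non-zero vectors is defined to be the vector $\mathbf{c}$ that has the following properties: 1. $\mathbf{a} \cdot \mathbf{c} = \mathbf{b} \cdot \mathbf{c} = 0$, that is, $\mathbf{c}$ is orthogonal to both $\mathbf{a}$ and $\mathbf{b}$; 2. $|\mathbf{c}| = |\mathbf{a}|\,|\mathbf{b}| \sin \angle(\mathbf{a}, \mathbf{b})$; and 3. the determinant

$$\begin{vmatrix} a_x & a_y & a_z \\ b_x & b_y & b_z \\ c_x & c_y & c_z \end{vmatrix},$$

formed from the coordinates of $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$ in the given way, is non-negative. If $\mathbf{a}$ or $\mathbf{b}$ is the null vector, then $\mathbf{a} \times \mathbf{b}$ is also the null vector.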

To find a geometrical interpretation of the vector product of two non-zero vectors that are not scalar multiples of one another, consider the plane spanned by two of their respective representatives, PQ and PQ' (Fig.). Then the properties given above have the following meaning for the representative PR of $\mathbf{c}$:

17.3-10 The vector product

1. PR is perpendicular to the plane spanned by PQ and PQ'.

2. The length of PR is $|\mathbf{a}|\,|\mathbf{b}| \sin \angle(\mathbf{a}, \mathbf{b})$, which is the area of the parallelogram with the sides PQ and PQ'.

3. PQ, PQ', PR form a right-handed system. This means: viewed from R, the shorter of the two possible rotations taking PQ to PQ' is anti-clockwise (if the thumb of the right hand points in the direction of PQ and the first finger in the direction of PQ', then the palm faces the direction of PR).

If PQ, PQ', PR, PR' are representatives of the vectors $\mathbf{a}$, $\mathbf{b}$, $\mathbf{a} \times \mathbf{b}$, and $\mathbf{b} \times \mathbf{a}$, then PR and PR' are both perpendicular to the plane spanned by PQ and PQ', and they both have the same length. But since both PQ, PQ', PR and PQ', PQ, PR' form right-handed systems in the given orders, PR and PR' must point in opposite directions. Thus, $\mathbf{a} \times \mathbf{b} = -\mathbf{b} \times \mathbf{a}$. This law is known as anti-commutativity. The vector product is not commutative or associative, but it is bilinear, that is, if $a$ is a scalar and $\mathbf{x}$, $\mathbf{y}$, and $\mathbf{z}$ are vectors, then $a(\mathbf{x} \times \mathbf{y}) = a\mathbf{x} \times \mathbf{y} = \mathbf{x} \times a\mathbf{y}$ and $\mathbf{x} \times (\mathbf{y} + \mathbf{z}) = (\mathbf{x} \times \mathbf{y}) + (\mathbf{x} \times \mathbf{z})$ and $(\mathbf{x} + \mathbf{y}) \times \mathbf{z} = (\mathbf{x} \times \mathbf{z}) + (\mathbf{y} \times \mathbf{z})$.




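The vector product in coordinates. The definition immediately gives the values for the vector products of the basis vectors $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$. The bilinearity of the product then makes it possible to calculate the coordinates of $\mathbf{a} \times \mathbf{b}$. They are the determinants

$$\begin{vmatrix} a_y & a_z \\ b_y & b_z \end{vmatrix}, \quad -\begin{vmatrix} a_x & a_z \\ b_x & b_z \end{vmatrix}, \quad \text{and} \quad \begin{vmatrix} a_x & a_y \\ b_x & b_y \end{vmatrix}.$$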

This can be easily remembered by using the following mnemonic device. Write down a 3 × 3 determinant in which the first row is $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$, and the second and third rows are the coordinates of $\mathbf{a}$ and $\mathbf{b}$. If the value is calculated and the terms containing $\mathbf{i}$, $\mathbf{j}$, and $\mathbf{k}$ are sorted, then these are the components of $\mathbf{a} \times \mathbf{b}$.
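The three 2 × 2 determinants translate directly into code. The following Python sketch assumes vectors are given as coordinate triples; the function names and the sample vectors are chosen here for illustration. It also checks two properties stated earlier: orthogonality of the product to both factors, and anti-commutativity.

```python
def cross(a, b):
    """Coordinates of a x b, written out from the three 2 x 2 determinants."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,      #  |ay az; by bz|
            -(ax * bz - az * bx),   # -|ax az; bx bz|
            ax * by - ay * bx)      #  |ax ay; bx by|

def dot(u, v):
    """Dot product from coordinates."""
    return sum(x * y for x, y in zip(u, v))

# Illustrative vectors (assumed for this sketch).
a = (5, -3, 1)
b = (-1, -1, 2)
c = cross(a, b)

print(c)                       # coordinates of a x b
print(dot(a, c), dot(b, c))    # both vanish: c is orthogonal to a and b
print(cross(b, a))             # the negative of c
```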


Example 3: The vector product of $\mathbf{a} = 5\mathbf{i} - 3\mathbf{j} + \mathbf{k}$ and $\mathbf{b} = -\mathbf{i} - \mathbf{j} + 2\mathbf{k}$ is

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 5 & -3 & 1 \\ -1 & -1 & 2 \end{vmatrix} = -5\mathbf{i} - 11\mathbf{j} - 8\mathbf{k}.$$

Basis and dimension. Several concepts that played a part in the discussion of $V_3$ can also be useful in the analysis of other vector spaces. Examples are the introduction of coordinates and representation of operations in coordinates. However, some ideas, such as the vector product, cannot be generalized.


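Linearly dependent and independent vectors. If $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ are vectors of a vector space V and $\mathbf{x}$ is a vector in V, $\mathbf{x}$ is said to be linearly dependent on $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ if there are numbers $a_1, a_2, \ldots, a_n$ such that $\mathbf{x} = a_1\mathbf{x}_1 + a_2\mathbf{x}_2 + \cdots + a_n\mathbf{x}_n$. One also says that $\mathbf{x}$ depends on $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ or that $\mathbf{x}$ is a linear combination of $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$. For instance, every vector $\mathbf{a}$ of $V_3$ depends on the system $\mathbf{x}_1 = \mathbf{i}$, $\mathbf{x}_2 = \mathbf{j}$, and $\mathbf{x}_3 = \mathbf{k}$: $\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}$.

Obviously the zero vector $\mathbf{o}$ depends on any system of vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$. One need only choose $a_1 = a_2 = \cdots = a_n = 0$.

If $\mathbf{x}$ is linearly dependent on $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$, then there are numbers $a_1, a_2, \ldots, a_n$, and $a = -1$ such that $\mathbf{o} = a_1\mathbf{x}_1 + a_2\mathbf{x}_2 + \cdots + a_n\mathbf{x}_n + a\mathbf{x}$. On the other hand, this equation does not state that $\mathbf{x}$ is dependent on $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$. What it means is that if at least one of the coefficients $a_1, a_2, \ldots, a_n$, or $a$ is not 0, then the corresponding vector is dependent on the rest. This leads to the following definition.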

The vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ are called linearly dependent if there are numbers $a_1, a_2, \ldots, a_n$, not all 0, such that $\mathbf{o} = a_1\mathbf{x}_1 + a_2\mathbf{x}_2 + \cdots + a_n\mathbf{x}_n$; otherwise they are called linearly independent.


The concept of linear independence is particularly important, because it is a necessary and sufficient condition for the solution of the equation

$$\mathbf{x} = a_1\mathbf{x}_1 + a_2\mathbf{x}_2 + \cdots + a_n\mathbf{x}_n$$

to be unique for all $\mathbf{x}$ that depend on $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$. In other words: $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ are linearly independent if and only if every vector $\mathbf{x}$ can be written in one and only one way as a linear combination of $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$, or not at all.
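For vectors given by coordinates, linear independence can be tested numerically. The sketch below is one possible approach and not the method of the text: it uses Gaussian elimination with exact fractions, relying on the standard fact that the vectors are linearly independent exactly when elimination leaves as many non-zero rows as there are vectors. The function names are illustrative.

```python
from fractions import Fraction

def rank(rows):
    """Number of non-zero rows left after Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a row at or below r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate the column entry in every row below the pivot row.
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """True if only the trivial combination of the vectors gives the zero vector."""
    return rank(vectors) == len(vectors)

print(linearly_independent([(1, 0, 0), (0, 1, 0), (0, 0, 1)]))
print(linearly_independent([(1, 2, 3), (2, 4, 6)]))
```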


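Example 4: The vectors $\mathbf{i}$, $\mathbf{j}$ of $V_3$ form a linearly independent system. For if $\mathbf{o} = a_1\mathbf{i} + a_2\mathbf{j}$, then $0 = \mathbf{o} \cdot \mathbf{i} = (a_1\mathbf{i} + a_2\mathbf{j}) \cdot \mathbf{i} = a_1$ and similarly $0 = a_2$.
The three vectors $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ are also linearly independent, as can be proved in the same way. They have the further property that every vector is dependent on them: $\mathbf{a} = a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}$. This leads to the following definition.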

A linearly independent system of vectors of a vector space V is called a basis of V if every vector of V is linearly dependent on it.




From the above it follows that this is equivalent to the following definition: 


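A system of vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ is a basis of V if every vector of V can be written in exactly one way as a linear combination of $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$.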


Thus, the system i, j, k forms a basis of V3. 
If there is a finite basis in a vector space V, it is called finite-dimensional, otherwise infinite-dimen- 
sional. For finite-dimensional vector spaces the following theorem holds: 


If V is finite-dimensional, then any two bases have the same number of elements. This number 
is called the dimension of V. 


The dimension of $V_3$ is 3, since $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$ is a basis; thus, every basis of $V_3$ contains exactly 3 elements.


Subspaces. A non-empty subset S of a vector space V is called a subspace if with the same addition 
and multiplication by scalars it satisfies the vector space axioms. This means, in particular, that the 
sum of two elements of S lies in S and any scalar multiple of an element of S lies in S. In fact, 
these two properties are the only ones that need to be verified, as the others all hold automatically. 


A subset V' of V containing at least one element is a subspace if and only if the sum $\mathbf{x} + \mathbf{y}$ of any two elements $\mathbf{x}$ and $\mathbf{y}$ of V' lies in V' and the scalar multiples $a\mathbf{x}$ of an element $\mathbf{x}$ of V' all lie in V'.

Example 5: The set consisting only of $\mathbf{o}$ is a subspace of every vector space. Every vector space is a subspace of itself. These are the trivial subspaces. The first one has dimension 0.

Example 6: If $\mathbf{x} \neq \mathbf{o}$ is a vector of V, the set V' of all scalar multiples of $\mathbf{x}$ is a subspace of V, the subspace spanned (or generated) by $\mathbf{x}$. It has $\mathbf{x}$ as a basis and its dimension is 1.


Coordinates. If $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ is a basis of V, then by definition every vector $\mathbf{x}$ can be written in just one way in the form $\mathbf{x} = a_1\mathbf{x}_1 + a_2\mathbf{x}_2 + \cdots + a_n\mathbf{x}_n$. The real numbers $a_1, a_2, \ldots, a_n$ are called the coordinates of $\mathbf{x}$ with respect to the given basis. The coordinates of $\mathbf{x}$ change if the basis is changed, but their number is always equal to the dimension of V.

If two vectors x and y are given by their coordinates with respect to the same basis, then it follows 
from the vector space axioms that they can be added coordinatewise. Similarly x can be multiplied 
coordinatewise by a scalar c. 


Thus, given a basis in an n-dimensional vector space V, one can associate uniquely with every vector $\mathbf{x}$ an n-tuple $(a_1, a_2, \ldots, a_n)$ in $R^n$ and vice versa; what is more, this association preserves addition and multiplication by scalars. Such a mapping (association) is an isomorphism (see Linear maps) of V onto $R^n$. Nevertheless, in the investigation of arbitrary vector spaces it is not advisable to exploit this isomorphism with $R^n$, for it depends on the choice of a basis in V; this introduces an element of arbitrariness, and many investigations can be considerably complicated if the chosen basis is inconvenient.

$R^n$ has a standard basis, namely $\mathbf{e}_1 = (1, 0, \ldots, 0)$, $\mathbf{e}_2 = (0, 1, 0, \ldots, 0)$, ..., $\mathbf{e}_n = (0, \ldots, 0, 1)$. For the subsequent discussion this is always taken to be chosen once and for all. In this basis the vector $\mathbf{x} = (a_1, a_2, \ldots, a_n)$ has the representation $\mathbf{x} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + \cdots + a_n\mathbf{e}_n$.


The inner product. Generalizing the inner product of $V_3$ one defines the inner or dot product in $R^n$ to be

$$\mathbf{x} \cdot \mathbf{y} = (a_1, a_2, \ldots, a_n) \cdot (b_1, b_2, \ldots, b_n) = a_1b_1 + a_2b_2 + \cdots + a_nb_n.$$

The dot product of $\mathbf{x}$ with itself is written $\mathbf{x}^2$ for convenience. Just as in $V_3$, the dot product is symmetric and bilinear.

The length, norm, or modulus $|\mathbf{x}|$ of a vector $\mathbf{x} = (a_1, a_2, \ldots, a_n)$ is also defined by generalizing from $V_3$: $|\mathbf{x}| = \sqrt{\mathbf{x}^2} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$.

In $V_3$, if PQ represents $\mathbf{a}$ and QR represents $\mathbf{b}$, then the third side PR of the triangle PQR represents $\mathbf{a} + \mathbf{b}$. It is an axiom that in a triangle no side is longer than the sum of the other two, so that $|PR| \le |PQ| + |QR|$ or $|\mathbf{a} + \mathbf{b}| \le |\mathbf{a}| + |\mathbf{b}|$.

This is therefore called the triangle inequality; it can be proved (by induction) for $R^n$ as well. It can also be extended (by a very easy induction) to the form

$$|\mathbf{x}_1 + \mathbf{x}_2 + \cdots + \mathbf{x}_m| \le |\mathbf{x}_1| + |\mathbf{x}_2| + \cdots + |\mathbf{x}_m|.$$

The triangle inequality implies also that $\bigl|\,|\mathbf{x}| - |\mathbf{y}|\,\bigr| \le |\mathbf{x} - \mathbf{y}|$ and, for the dot product, that $|\mathbf{x} \cdot \mathbf{y}| \le |\mathbf{x}|\,|\mathbf{y}|$. If this is written in coordinates, it becomes the Cauchy-Schwarz inequality, which is occasionally called the Bunyakovskii inequality.
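These inequalities are easy to spot-check numerically. The Python sketch below evaluates both sides for one pair of vectors in $R^4$; the vectors and helper names are illustrative assumptions, and a single numerical check is of course no substitute for the proofs.

```python
import math

def dot(x, y):
    """Dot product in R^n."""
    return sum(a * b for a, b in zip(x, y))

def norm(x):
    """Length of x: square root of the dot product of x with itself."""
    return math.sqrt(dot(x, x))

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))

def sub(x, y):
    return tuple(a - b for a, b in zip(x, y))

# Illustrative vectors in R^4 (assumed for this sketch).
x = (1.0, -2.0, 3.0, 0.5)
y = (4.0, 0.0, -1.0, 2.0)

triangle_ok = norm(add(x, y)) <= norm(x) + norm(y)
reverse_ok = abs(norm(x) - norm(y)) <= norm(sub(x, y))
cauchy_schwarz_ok = abs(dot(x, y)) <= norm(x) * norm(y)
print(triangle_ok, reverse_ok, cauchy_schwarz_ok)
```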




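In the proof of the Cauchy-Schwarz inequality one uses the bilinearity and symmetry of the dot product to prove $(\mathbf{x} + \mathbf{y})^2 = \mathbf{x}^2 + 2\,\mathbf{x} \cdot \mathbf{y} + \mathbf{y}^2$.
Similarly one proves $(\mathbf{x} + \mathbf{y}) \cdot (\mathbf{x} - \mathbf{y}) = \mathbf{x}^2 - \mathbf{y}^2$.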


Angles. In the section on $V_3$ a formula was found for the cosine of the angle between two vectors in terms of their inner product. In $R^n$ this same formula is used to give an analytical definition of angles. The angle $\angle(\mathbf{x}, \mathbf{y})$ between two non-zero vectors $\mathbf{x}$ and $\mathbf{y}$ is that angle between 0 and $\pi$ whose cosine satisfies the formula

$$\cos \angle(\mathbf{x}, \mathbf{y}) = \frac{\mathbf{x} \cdot \mathbf{y}}{|\mathbf{x}|\,|\mathbf{y}|}.$$

This condition determines $\angle(\mathbf{x}, \mathbf{y})$ uniquely. This definition of an angle makes it plausible to call $\mathbf{x}$ and $\mathbf{y}$ orthogonal if $\mathbf{x} \cdot \mathbf{y} = 0$.

In the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ of $R^n$ all the vectors have length 1 and any two distinct vectors are orthogonal. This is a fundamental property of this basis.


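Inner product                     $\mathbf{x} \cdot \mathbf{y} = (a_1, \ldots, a_n) \cdot (b_1, \ldots, b_n) = a_1b_1 + \cdots + a_nb_n = \mathbf{y} \cdot \mathbf{x}$
Rules                             $\mathbf{x} \cdot (\mathbf{y} + \mathbf{z}) = \mathbf{x} \cdot \mathbf{y} + \mathbf{x} \cdot \mathbf{z}$
                                  $d(\mathbf{x} \cdot \mathbf{y}) = (d\mathbf{x}) \cdot \mathbf{y} = \mathbf{x} \cdot (d\mathbf{y})$
Norm of x                         $|\mathbf{x}| = \sqrt{a_1^2 + \cdots + a_n^2}$
Generalized triangle inequality   $|\mathbf{x}_1 + \cdots + \mathbf{x}_m| \le |\mathbf{x}_1| + \cdots + |\mathbf{x}_m|$
Cauchy-Schwarz inequality         $|\mathbf{x} \cdot \mathbf{y}| \le |\mathbf{x}|\,|\mathbf{y}|$, that is, $\bigl(\sum a_ib_i\bigr)^2 \le \sum a_i^2 \sum b_i^2$
Angle between x and y             $\cos \angle(\mathbf{x}, \mathbf{y}) = \dfrac{\mathbf{x} \cdot \mathbf{y}}{|\mathbf{x}|\,|\mathbf{y}|}$, $0 \le \angle(\mathbf{x}, \mathbf{y}) \le \pi$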

Euclidean vector spaces. The introduction of an inner product gives R" a structure additional 
to that of an ordinary vector space. The inner product associates with each pair of vectors x and y 
a real number x - y and can therefore be regarded as a function of the variables x and y satisfying 
certain properties. 

In R" coordinates were used in the definition. But the concept of an inner product can be generalized 
to arbitrary vector spaces as follows: 


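If V is a vector space and q a function that associates with every pair of vectors $\mathbf{x}$ and $\mathbf{y}$ in V a real number, then q is called an inner product on V if the following rules hold:

1. $q(\mathbf{x}, \mathbf{y}) = q(\mathbf{y}, \mathbf{x})$,
2. $q(\mathbf{x} + \mathbf{x}', \mathbf{y}) = q(\mathbf{x}, \mathbf{y}) + q(\mathbf{x}', \mathbf{y})$,
3. $q(a\mathbf{x}, \mathbf{y}) = a\,q(\mathbf{x}, \mathbf{y})$,
4. $q(\mathbf{x}, \mathbf{x}) \ge 0$, and $q(\mathbf{x}, \mathbf{x}) = 0$ if and only if $\mathbf{x} = \mathbf{o}$.

A vector space equipped with such an inner product is called a Euclidean vector space. If no confusion is to be feared, the function $q(\mathbf{x}, \mathbf{y})$ is written as $(\mathbf{x}, \mathbf{y})$.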


The inner product on $R^n$ has these properties, hence $R^n$ is a Euclidean vector space. A function satisfying 1. is said to be symmetric; if it satisfies 2. and 3. and the corresponding laws in the second term $\mathbf{y}$ (which follow automatically for symmetric functions), it is called bilinear. The last property 4. is called non-singularity. It indicates that $q(\mathbf{x}, \mathbf{y}) = \mathbf{x} \cdot \mathbf{y}$ is positive definite, that is, $\mathbf{x} \cdot \mathbf{y} = 0$ for all $\mathbf{y}$ is only possible when $\mathbf{x} = \mathbf{o}$. This is equivalent to the statement that the matrix of the $\mathbf{e}_i \cdot \mathbf{e}_j$ is non-singular, where the $\mathbf{e}_i$ form an arbitrary basis.

If V is a Euclidean vector space with a given inner product q, one can define length and angle by generalizing the definitions in $R^n$. The length, norm, or modulus of a vector $\mathbf{x}$ is $|\mathbf{x}| = \sqrt{q(\mathbf{x}, \mathbf{x})}$. Property 4. of q states that every non-zero vector has positive length. If $\mathbf{x}$ and $\mathbf{y}$ are non-zero vectors, then the angle $\varphi$ between 0 and $\pi$ satisfying $\cos \varphi = \dfrac{q(\mathbf{x}, \mathbf{y})}{|\mathbf{x}|\,|\mathbf{y}|}$ is called the angle between $\mathbf{x}$ and $\mathbf{y}$.

Vectors of length 1 are called unit vectors. If $q(\mathbf{x}, \mathbf{y}) = 0$, then $\mathbf{x}$ and $\mathbf{y}$ are called orthogonal (with respect to q).

The properties of the basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$ suggest the following definition:

A basis of a Euclidean vector space is called orthonormal if all its vectors are unit vectors and if any two distinct vectors of the basis are orthogonal.

In the investigation of Euclidean vector spaces one usually tries to find an orthonormal basis, because its properties simplify many computations (in particular, the inner product can be computed by the ordinary formula for the dot product using the coordinates with respect to an orthonormal basis).
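
The parenthetical remark can be made explicit with a short derivation. Assume, as notation introduced here, that $\mathbf{e}_1, \ldots, \mathbf{e}_n$ is an orthonormal basis and that $\mathbf{x} = \sum_i a_i\mathbf{e}_i$ and $\mathbf{y} = \sum_j b_j\mathbf{e}_j$. Bilinearity of q and the relations $q(\mathbf{e}_i, \mathbf{e}_j) = 1$ for $i = j$ and $0$ otherwise then give:

```latex
\begin{aligned}
q(\mathbf{x}, \mathbf{y})
  &= q\Bigl(\sum_{i=1}^{n} a_i \mathbf{e}_i,\ \sum_{j=1}^{n} b_j \mathbf{e}_j\Bigr) \\
  &= \sum_{i=1}^{n} \sum_{j=1}^{n} a_i b_j\, q(\mathbf{e}_i, \mathbf{e}_j) \\
  &= \sum_{i=1}^{n} a_i b_i .
\end{aligned}
```

Only the terms with $i = j$ survive in the double sum, which is why the result has the same form as the dot product in $R^n$.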


17.4. Linear maps 


Properties of linear maps. A map A from a vector space V to a vector space V’ is called linear 
if for any vectors x and y of V and any real number a the equations A(x + y) = A(x) + A(y) and 
A(ax) = aA(x) hold. This means that it does not matter whether one performs calculations on 
vectors in V and then applies the map to the result or first applies the map to the vectors and then 
does the corresponding calculations with their images in V’. The final result will be the same vector 
in both cases. In other words, the equations express the compatibility of the map with the fundamental 
vector space operations in V and V'. The map A is sometimes expressed by an arrow $A: V \to V'$ or $\mathbf{x} \mapsto A(\mathbf{x})$. Here the uniquely determined vector $A(\mathbf{x})$ is called the image of $\mathbf{x}$, and $\mathbf{x}$ is called a pre-image or inverse image of $A(\mathbf{x})$. A vector in V' may have no pre-images, one pre-image, or more than one (see Chapter 5.).


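17.4-1 Rotation of a plane about a fixed point O through the angle $\varphi$

17.4-2 Parallel projection of the three-dimensional space onto a plane, and linearity of the parallel projection; $\overrightarrow{PQ} = \mathbf{x}$, $\overrightarrow{PR} = \mathbf{y}$, $\overrightarrow{PS} = \mathbf{x} + \mathbf{y}$, $\overrightarrow{PT} = a\mathbf{x}$, $\overrightarrow{P'Q'} = A(\mathbf{x})$, $\overrightarrow{P'R'} = A(\mathbf{y})$, $\overrightarrow{P'S'} = A(\mathbf{x}) + A(\mathbf{y})$, $\overrightarrow{P'T'} = aA(\mathbf{x})$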


Example 1: If a plane is rotated through an angle $\varphi$ about a point O (Fig.), then every class of directly parallel oriented segments of the same length is taken to another such class. The rotation induces a map A of vectors by defining the image of $\mathbf{x}$ to be the vector associated with the class obtained after rotating the class of representatives of $\mathbf{x}$. The adjacent drawing in which the rotation is shown for vectors $\mathbf{x}$, $\mathbf{y}$ and $\mathbf{x} + \mathbf{y}$ shows that A is linear, because the parallelogram given by $\mathbf{x}$ and $\mathbf{y}$ is rotated as a whole. The equation $A(\mathbf{x} + \mathbf{y}) = A(\mathbf{x}) + A(\mathbf{y})$ holds. The relation $A(a\mathbf{x}) = aA(\mathbf{x})$ follows immediately from the fact that rotations preserve length.
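The two linearity equations can be checked numerically for a rotation. The sketch below assumes the standard coordinate formula for a rotation of the plane about the origin through the angle $\varphi$, namely $(x, y) \mapsto (x\cos\varphi - y\sin\varphi,\ x\sin\varphi + y\cos\varphi)$, which is not derived in the text; the helper names and sample values are illustrative.

```python
import math

def rotate(v, phi):
    """Image of the plane vector v under rotation about the origin by phi radians."""
    x, y = v
    c, s = math.cos(phi), math.sin(phi)
    return (c * x - s * y, s * x + c * y)

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(a, v):
    return (a * v[0], a * v[1])

def close(u, v, tol=1e-12):
    """Compare two vectors up to floating-point rounding."""
    return all(abs(p - q) <= tol for p, q in zip(u, v))

# Illustrative data (assumed for this sketch).
phi = math.radians(30)
x, y, a = (2.0, 1.0), (-1.0, 3.0), 2.5

additive = close(rotate(add(x, y), phi), add(rotate(x, phi), rotate(y, phi)))
homogeneous = close(rotate(scale(a, x), phi), scale(a, rotate(x, phi)))
print(additive, homogeneous)
```

A tolerance is used because the trigonometric values are only floating-point approximations.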

Example 2: Similarly any parallel projection of three-dimensional space onto a plane also gives a linear map of the corresponding vector spaces (Fig.). The equation $A(\mathbf{x} + \mathbf{y}) = A(\mathbf{x}) + A(\mathbf{y})$ holds, because the parallelogram defining the sum $\mathbf{x} + \mathbf{y}$ is mapped to the parallelogram defining $A(\mathbf{x}) + A(\mathbf{y})$. The equation $A(a\mathbf{x}) = aA(\mathbf{x})$ follows from the proportion $|PQ| : |PT| = |P'Q'| : |P'T'|$.


Kernel and image of a linear map. With each linear map A from V to V', there are associated two distinguished subspaces of V and V', respectively, its kernel and its image. The kernel of A is the subspace of V consisting of all those vectors that are mapped by A to the null vector of V'. The image of A is the subspace of V' consisting of all those vectors of V' that are images of vectors in V. In Example 2 the image of A is the plane, onto which the space is projected, and the kernel is the set of those vectors in three-dimensional space that are parallel to the direction of projection (and of course the null vector). If V is finite-dimensional and A is a linear map from V to V', then both the kernel and the image of A are also finite-dimensional. The dimension of the kernel is called the nullity of A, and the dimension of the image is called the rank of A. An important theorem on linear maps then states that

rank of A + nullity of A = dimension of V.

From this theorem it follows that the dimension of the image of A is at most equal to the dimension of V. The nullity measures the degree by which A differs from a one-to-one map. If the nullity is 0, then A is one-to-one.




Example 3: The left-hand side of a system of linear equations

$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\ &\;\;\vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m \end{aligned}$$

defines a linear map A from the vector space $R^n$ to the vector space $R^m$, by associating with each n-tuple $\mathbf{x} = (x_1, \ldots, x_n)$ as its image the m-tuple $A(\mathbf{x}) = (a_{11}x_1 + \cdots + a_{1n}x_n, \ldots, a_{m1}x_1 + \cdots + a_{mn}x_n)$. It is easy to check that this map is indeed linear. The problem of solving the system of equations can now be interpreted in the following manner: to solve the system is to find all the vectors (n-tuples) in $R^n$ that are mapped onto $(b_1, \ldots, b_m)$ by A. The associated homogeneous system asks for the vectors that are mapped onto the null vector; the vector space of solutions of the homogeneous system is the kernel of A. If the nullity of A is 0, then the homogeneous system has only the trivial solution, and the inhomogeneous system has at most one solution, because A is one-to-one. The image of A is the set of vectors $(b_1, \ldots, b_m)$ for which solutions of the system exist. The rank of A can be computed from the coefficients of the equations (see Rank of a matrix).
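As a rough illustration of how rank and nullity can be read off from the coefficients, the Python sketch below computes the rank by Gaussian elimination with exact fractions and obtains the nullity from the relation rank + nullity = n for a map from $R^n$. The elimination routine, the function names and the sample coefficient arrays are assumptions made for this sketch, not the procedure of the text.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a coefficient array, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def rank_and_nullity(coeffs):
    """Rank and nullity of the map x -> A(x) from R^n defined by the coefficients."""
    n = len(coeffs[0])  # number of unknowns = dimension of R^n
    r = rank(coeffs)
    return r, n - r

# Two equations in three unknowns; the second is twice the first.
print(rank_and_nullity([[1, 2, 3], [2, 4, 6]]))
# Three equations in two unknowns with nullity 0: at most one solution.
print(rank_and_nullity([[1, 0], [0, 1], [1, 1]]))
```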


One-to-one linear maps from a vector space V onto a vector space V’ play an important role 
in linear algebra. Such maps are called isomorphisms. If A is an isomorphism from V to V’, then 
the inverse map is an isomorphism from V' to V. The spaces V and V' are called isomorphic, in symbols $V \cong V'$. Isomorphic vector spaces have identical algebraic properties. Isomorphism is an equivalence relation on vector spaces, that is, it is reflexive, symmetric, and transitive.


Examples of linear maps: 4. The identity map I on a vector space V, taking every vector to itself, is linear and indeed an isomorphism:

$$I(\mathbf{x} + \mathbf{y}) = \mathbf{x} + \mathbf{y} = I(\mathbf{x}) + I(\mathbf{y}) \quad \text{and} \quad I(a\mathbf{x}) = a\mathbf{x} = a\,I(\mathbf{x}).$$

5: If V is an n-dimensional space with a basis $\mathbf{e}_1, \ldots, \mathbf{e}_n$, then the coordinate map $\Phi$ from V to $R^n$ takes the vector $\mathbf{x} = a_1\mathbf{e}_1 + \cdots + a_n\mathbf{e}_n$ to the n-tuple $(a_1, \ldots, a_n)$ of its coordinates with respect to the given basis. The map $\Phi$ is linear, one-to-one, and every n-tuple of real numbers is the image of a vector in V. Hence $\Phi$ is an isomorphism, and hence: Every n-dimensional vector space is isomorphic to $R^n$.


The significance of linear maps. Since linear maps are compatible with the operations in a vector 
space, they make it possible to transfer an algebraic situation or problem from one space to another. 
Of particular importance are the isomorphisms, because the algebraic properties of subsets of vector spaces, such as linear dependence and independence or dimension, are invariant under these maps. Any theorem involving only these concepts is true for all spaces isomorphic to V, once it has been proved for V. In particular, one can exploit the isomorphism of n-dimensional spaces with $R^n$.

The coordinate map translates relationships between vectors into equations involving real num- 
bers, namely the coefficients of these vectors. However, the coordinate map depends essentially 
on the choice of a basis, hence different sets of equations may reflect the same set of relationships. 
From this point of view linear maps are important, because they describe the relationship between 
one set of coordinates and another. 


Operations on linear maps. It is remarkable that the set of linear maps from a vector space V to 
another V’ can itself be made into a vector space in a natural way. If A and B are linear maps from V 
to V’, their sum A + B is defined by (A + B) (x) = A(x) + B(x) for all vectors x in V. Similarly, 
the scalar multiple $a \cdot A$ is defined by $(a \cdot A)(\mathbf{x}) = a \cdot A(\mathbf{x})$. It is easy to check that $A + B$ and $a \cdot A$ are again linear and that the characteristic properties of vector spaces hold. If the dimensions of V and V' are m and n, respectively, the dimension of the vector space of linear maps from V to V' is $m \cdot n$.

The product of two maps is defined as the result of carrying out the maps in succession. For this to have a meaning, the first map must be defined on the image of the second; in other words, if B is a linear map from V to V' and A one from V' to a vector space V'', then one obtains the map $A \cdot B$ by applying B to the vectors of V and then A to the result. Thus, $A \cdot B$ is a linear map from V to V'' (Fig.).

17.4-3 The product $A \cdot B$ of two linear maps

The fact that $A \cdot B$ is defined does not mean, in general, that $B \cdot A$ is also defined. And even if they are both defined, they need not necessarily be equal (Fig. 17.4-4). Thus, multiplication of maps is not commutative. It is, however, associative and the distributive laws also hold:

$$A \cdot (B + C) = A \cdot B + A \cdot C \quad \text{and} \quad (A + B) \cdot C = A \cdot C + B \cdot C.$$
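
The failure of commutativity can be seen in a small computation. The sketch below represents two rotations of three-dimensional space by arrays of coefficients, an assumption that anticipates the later section on matrices: `D_x` is taken to be the rotation through 90° about the x-axis and `D_z` the rotation through 90° about the z-axis. The names, the angle and the point P are illustrative choices.

```python
def matmul(A, B):
    """Coefficients of the product map A . B (apply B first, then A)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def apply(A, v):
    """Image of the vector v under the map with coefficient array A."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

# Rotations through 90 degrees about the x-axis and about the z-axis.
D_x = [[1, 0, 0], [0, 0, -1], [0, 1, 0]]
D_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]

P = (1, 0, 0)
P1 = apply(matmul(D_x, D_z), P)  # rotate about z first, then about x
P2 = apply(matmul(D_z, D_x), P)  # rotate about x first, then about z
print(P1, P2)
```

The two images differ, so the two products are different maps.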
Special linear maps. Of particular interest in the investigation of the structure of a vector space V 


are linear maps of V into itself. These are called linear operators or linear transformations on V. The linear transformations of an n-dimensional vector space V form a vector space of dimension $n^2$.




17.4-4 Rotations of the sphere about the x-axis (D(x)) and about the z-axis (D(z)); $D(x) \cdot D(z)$ takes the point P to the point $P_1$; on the other hand, $D(z) \cdot D(x)$ takes P to $P_2$, which is different from $P_1$


Furthermore, for two linear transformations A and B of the same space V the condition for the products $A \cdot B$ and $B \cdot A$ to exist is always satisfied. Therefore a (non-commutative) multiplication is also defined on the set of linear transformations of V. An example of a transformation is the identity map I on a space V, which is called the identity transformation. This transformation is an isomorphism of V onto itself. Linear transformations that are isomorphisms of V onto itself are called regular (or non-singular); if a linear transformation is not an isomorphism, it is called singular. Regular transformations have the property that the image of a basis is again a basis. They can also be characterized in the following way: A linear transformation A is regular if and only if there exists a linear transformation B such that $A \cdot B = B \cdot A = I$. In this case B is uniquely determined; it is called the inverse transformation to A and is denoted by $A^{-1}$. Thus, $A^{-1}$ is the transformation that reverses the effect of A. For instance, $I^{-1} = I$. Examples of regular transformations are rotations of the plane about the origin. The inverse transformation is then the rotation through the same angle but in the opposite direction. In the same way rotations about an axis through the origin are regular linear transformations of the three-dimensional space $V_3$.

The transformation $A + B$ need not, in general, be regular even if A and B are. On the other hand, the product $A \cdot B$ or $B \cdot A$ of two regular transformations is always regular. The inverse transformation to $A \cdot B$ is $B^{-1} \cdot A^{-1}$. The set of regular transformations therefore has under multiplication properties similar to those of the non-zero real numbers (except that the commutative law $A \cdot B = B \cdot A$ is not satisfied). The identity transformation plays the same role as the number 1, since $A \cdot I = I \cdot A = A$ for any transformation A. In abstract algebra a set with these properties is called a group (see Chapter 16.). The set of regular transformations is called the general linear group on V and is denoted by GL(V).

If the vector space is Euclidean, one can associate with every linear transformation A a second 
transformation $A^*$, which is uniquely determined by the condition that $A(\mathbf{x}) \cdot \mathbf{y} = \mathbf{x} \cdot A^*(\mathbf{y})$ for all vectors $\mathbf{x}$ and $\mathbf{y}$ in V. This $A^*$ is called the adjoint transformation to A. The following rules hold for adjoint transformations:

$$(A + B)^* = A^* + B^*, \quad (A \cdot B)^* = B^* \cdot A^*, \quad (a \cdot A)^* = a \cdot A^*, \quad (A^*)^* = A.$$
Of particular importance are self-adjoint or symmetric transformations. They are characterized by 
the property that A* = A. These transformations occur frequently in physical problems and have 
in certain respects a very simple structure. The analogous transformations for infinite-dimensional 
complex vector spaces, the so-called Hermitian transformations, play an important part in quantum 
mechanics. Trivial examples of symmetric transformations are the multiples $a \cdot I$ of the identity transformation.
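
In $R^n$ with the standard basis and the ordinary dot product, the adjoint of the transformation given by a square array of coefficients is given by the transposed array. This is a standard fact stated here as an assumption, since the text has not yet introduced matrices. The Python sketch below checks the defining condition $A(\mathbf{x}) \cdot \mathbf{y} = \mathbf{x} \cdot A^*(\mathbf{y})$ and the rule $(A \cdot B)^* = B^* \cdot A^*$ for sample data; all names and values are illustrative.

```python
def apply(A, v):
    """Image of v under the transformation with coefficient array A."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

def transpose(A):
    """Array with rows and columns interchanged."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Coefficients of the product A . B."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

# Illustrative data (assumed for this sketch).
A = [[1, 2], [3, 4]]
B = [[0, 1], [-1, 5]]
x, y = (1, -1), (2, 5)

lhs = dot(apply(A, x), y)              # A(x) . y
rhs = dot(x, apply(transpose(A), y))   # x . A*(y)
print(lhs, rhs)
print(transpose(matmul(A, B)) == matmul(transpose(B), transpose(A)))
```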

In a Euclidean vector space V the inner product can be used to define the lengths of vectors and 
the angles between them. Thus, in the investigation of such a space those linear transformations 
are particularly useful that are compatible with these additional properties of V. Such transformations 
are called orthogonal transformations. A linear transformation A on V is orthogonal if it leaves the inner product invariant, that is, if $\mathbf{x} \cdot \mathbf{y} = A(\mathbf{x}) \cdot A(\mathbf{y})$ for all vectors $\mathbf{x}$ and $\mathbf{y}$ in V. An orthogonal transformation preserves the lengths of vectors and the angles between them. The rotations of the plane and of three-dimensional space are again examples of orthogonal transformations; they obviously preserve lengths and angles. Orthogonal transformations can be characterized in the following way: A linear transformation is orthogonal if and only if the image of an orthonormal basis is again an orthonormal basis. Shorter and algebraically more succinct is the following description: A linear transformation A is orthogonal if and only if $A^* = A^{-1}$. Since according to this definition
every orthogonal transformation has an inverse, orthogonal transformations are regular. The 
inverse of an orthogonal transformation is orthogonal, and the product of two orthogonal trans- 
formations is also orthogonal: the set of orthogonal transformations forms a group. This is called 




the orthogonal group on V. All the orthogonal transformations of V3 can be obtained as rotations 
or as products of rotations and reflections in planes. A rotation is completely determined by the 
rotation induced in a plane perpendicular to the axis. This fact can be used to find special matrices 
representing orthogonal transformations on V3. 
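
A small numerical check of these statements, under the assumption that transformations of three-dimensional space are represented by 3 × 3 coefficient arrays with respect to an orthonormal basis and that the adjoint corresponds to the transposed array: `R_z` below is a rotation through 90° about the z-axis and `S` a reflection in the xy-plane. The names and the sample vectors are illustrative.

```python
def apply(A, v):
    """Image of v under the transformation with coefficient array A."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def is_orthogonal(A):
    """Check A* . A = I, with the adjoint taken to be the transpose."""
    n = len(A)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(transpose(A), A) == identity

R_z = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]   # rotation by 90 degrees about the z-axis
S = [[1, 0, 0], [0, 1, 0], [0, 0, -1]]     # reflection in the xy-plane

x, y = (1, 2, 3), (4, -5, 6)
print(is_orthogonal(R_z), is_orthogonal(S), is_orthogonal(matmul(R_z, S)))
print(dot(x, y), dot(apply(R_z, x), apply(R_z, y)))
```

The last line prints the same number twice, reflecting the invariance of the inner product.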


17.5. Matrices 


The properties of the solution set of a system of linear equations

$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\ &\;\;\vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m \end{aligned}$$

depend essentially on the coefficients $a_{ij}$ of the system. The rectangular array of these coefficients in their m rows and n columns is called an m × n-matrix A (read 'm-by-n matrix'). The numbers in such an array are called the entries or elements of the matrix A.

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$

The n-tuple of entries with the same first subscript i is called the ith row and the m-tuple of entries with the same second subscript j is called the jth column of the matrix.

An m × n-matrix has m rows and n columns. If m = n, the matrix is called square. The notation for the matrix A is often abbreviated as $A = (a_{ij})$ and sometimes as $A = (a_{ij})_{m,n}$ to indicate the number of rows and columns.


Operations on matrices. Matrices of the same shape (that is, the same number of rows and the 
same number of columns) can be added. The sum of two such matrices is defined as the matrix 
whose entries are just the sums of the corresponding entries of the original matrices. A matrix can
be multiplied by a real number by multiplying every entry by that number.


Example I: 
i=3°6 »(2 1 -2 i—2. 6 ee 
Bs 1 5) +2 (3 =) a ie 1 )+(=% 4 7 eS ee 


The usual rules hold for addition of m x n matrices and multiplication by scalars. 


The set of m × n-matrices forms a vector space of dimension mn.


The null vector of this space is the null matrix or zero matrix, whose entries are all zero. 

Matrices cannot always be multiplied. The product AB of an m × n-matrix A by an r × s-matrix B is defined only if n = r. In that case the product is an m × s-matrix $C = (c_{ij})$, with entries defined in the following manner:

$$c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k=1}^{n} a_{ik}b_{kj}.$$

The entry $c_{ij}$ can be interpreted as the inner product of the ith row of A with the jth column of B.


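Example 2:

$$\begin{pmatrix} 1 & -1 \\ 2 & 0 \end{pmatrix} \cdot \begin{pmatrix} 2 & 1 & 0 \\ 2 & 0 & -1 \end{pmatrix} = \begin{pmatrix} 1\cdot 2 + (-1)\cdot 2 & 1\cdot 1 + (-1)\cdot 0 & 1\cdot 0 + (-1)\cdot(-1) \\ 2\cdot 2 + 0\cdot 2 & 2\cdot 1 + 0\cdot 0 & 2\cdot 0 + 0\cdot(-1) \end{pmatrix} = \begin{pmatrix} 0 & 1 & 1 \\ 4 & 2 & 0 \end{pmatrix}$$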


In general, the existence of the product A - B does not imply that B- A is also defined, and even 
if both are defined, they need not be equal, as is shown by the following calculation: 


(6 3)°(¢ 3} 4 2\ ie do lel 2 6 
oi) (s Gr a ™ (iy o)'(0 1) 71 3) 
Like the multiplication of linear maps, the multiplication of matrices is not commutative. Apart from this exception the usual rules (such as the associative and distributive laws) hold for the multiplication of matrices.
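The product rule and the failure of commutativity can be sketched in Python. This is only an illustration under the assumption that a matrix is stored as a list of rows; the helper name `mat_mul` and the sample matrices are choices made here, not taken from the text.

```python
def mat_mul(a, b):
    """Product of an m x n matrix a with an n x s matrix b (lists of rows)."""
    if len(a[0]) != len(b):
        raise ValueError("product undefined: columns of a must equal rows of b")
    # Each entry is the inner product of a row of a with a column of b.
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

A = [[1, -1], [2, 0]]          # 2 x 2
B = [[2, 1, 0], [2, 0, -1]]    # 2 x 3
AB = mat_mul(A, B)             # defined: 2 x 3 result

# B * A is not defined here, because B has 3 columns and A has 2 rows.
try:
    mat_mul(B, A)
    ba_defined = True
except ValueError:
    ba_defined = False

# Two square matrices for which both products exist but differ.
P = [[0, 1], [0, 0]]
Q = [[1, 0], [0, 0]]
PQ = mat_mul(P, Q)
QP = mat_mul(Q, P)
print(AB, ba_defined, PQ, QP)
```

Exact integer arithmetic is used, so the comparison of `PQ` with `QP` involves no rounding.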




Square matrices of the same size can always be multiplied. There is a special n × n-matrix, the unit matrix or identity matrix I, which leaves any n × n-matrix fixed under multiplication on the left or on the right: $I \cdot A = A \cdot I = A$.

$$I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

The set of n × n-matrices thus has properties similar to the set of transformations of a vector space. By analogy to the definition of a regular transformation, a square matrix A is called regular (or non-singular) if there is a (necessarily square) matrix B such that AB = BA = I; otherwise A is singular. The matrix B is then uniquely determined; it is called the inverse matrix of A and is denoted by $A^{-1}$. Rules for calculating the inverse are discussed below.


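Example 3:

$$A = \begin{pmatrix} 2 & 1 \\ -1 & 1 \end{pmatrix}, \quad A^{-1} = \begin{pmatrix} 1/3 & -1/3 \\ 1/3 & 2/3 \end{pmatrix}, \quad\text{since}\quad \begin{pmatrix} 2 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} 1/3 & -1/3 \\ 1/3 & 2/3 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$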


Just as with linear transformations, inverses and products of regular matrices are regular, and the equations $(AB)^{-1} = B^{-1}A^{-1}$ and $(A^{-1})^{-1} = A$ both hold.


The set of all regular n × n-matrices is a group under matrix multiplication. It is called the general linear group of degree n and is denoted by GL(n). It is isomorphic to, but conceptually different from, the group $GL(V^n)$.

With every m × n-matrix A one can associate an n × m-matrix $A^T$, the transposed of A. It is obtained from A by interchanging the rows and the columns:

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}, \qquad A^T = \begin{pmatrix} a_{11} & \cdots & a_{m1} \\ \vdots & & \vdots \\ a_{1n} & \cdots & a_{mn} \end{pmatrix}$$

For a square matrix this process can be easily visualized as reflection in the main diagonal, from the top left-hand to the bottom right-hand corner.


Examples: 4. 1 brow 5; | Me Nk 
«.-(2>g) Af=!0 0O ) a-( as a5=() ° i) 


: 
1-1 2 1 22 
The rules governing the transposition of a matrix are similar to those for taking the adjoint of a linear transformation: $(A + B)^T = A^T + B^T$; $(aA)^T = a \cdot A^T$; $(A \cdot B)^T = B^T \cdot A^T$; $(A^T)^T = A$. If A is a regular matrix, then so is $A^T$, and the inverse of $A^T$ is the transposed of $A^{-1}$: $(A^{-1})^T = (A^T)^{-1}$. The matrix $(A^T)^{-1}$ is called the contragredient matrix to A.
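The product rule for transposition reverses the order of the factors, which is easy to check numerically. The following Python sketch assumes matrices stored as lists of rows; the helper names `transpose` and `mat_mul` and the two sample matrices are illustrative only.

```python
def transpose(a):
    """Interchange rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]

def mat_mul(a, b):
    """Matrix product; assumes the shapes are compatible."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

A = [[1, 2, 0], [3, -1, 4]]      # 2 x 3
B = [[2, 1], [0, 5], [-1, 3]]    # 3 x 2

lhs = transpose(mat_mul(A, B))               # (A B)^T
rhs = mat_mul(transpose(B), transpose(A))    # B^T A^T
print(lhs, rhs)
```

Note that $A^T \cdot B^T$ would be a 3 × 3 matrix here, so the reversed order is forced already by the shapes.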


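The determinant of a matrix. Computation of the inverse matrix. With every square matrix A there is associated a real number, the determinant of A (see Determinants):

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}, \qquad \det A = \begin{vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{vmatrix}$$

There is a remarkable connection between the multiplication of matrices and the multiplication of determinants, the product theorem: $\det (A \cdot B) = \det A \cdot \det B$.

Since the rules for calculating determinants immediately give det I = 1 when I is the identity matrix, it follows that if A is regular, then $(\det A)(\det A^{-1}) = 1$. Therefore the determinant of a regular matrix is non-zero. The converse is also true: if the determinant of a matrix A is not zero, then A is regular. This is made explicit by a formula for computing $A^{-1}$, namely $A^{-1} = \frac{1}{\det A}(A_{ji})$, where $A_{ij}$ denotes the cofactor of $a_{ij}$ and the interchanged subscripts indicate transposition.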


Wa ee RL &2 
A=( 2-1 1}; at=—-(-4 5 ) es a i} =—4, 
—2 01 —2 0-1 


The number $-4 = A_{12}$, say, in the right-hand matrix is the cofactor of $a_{12} = 0$ in A.




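Example 7: Computation of the inverse of a regular 2 × 2 matrix

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}^{-1} = \frac{1}{a_{11}a_{22} - a_{12}a_{21}} \begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix}$$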
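The 2 × 2 case of the cofactor formula translates directly into a few lines of Python. This is a sketch with exact rational arithmetic; the function name `inverse_2x2` and the sample matrix are assumptions made for illustration.

```python
from fractions import Fraction

def inverse_2x2(a):
    """Inverse of a regular 2 x 2 matrix by the cofactor formula."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("matrix is singular")
    d = Fraction(det)
    # Swap the diagonal entries, negate the others, divide by the determinant.
    return [[a22 / d, -a12 / d], [-a21 / d, a11 / d]]

A = [[2, 1], [-1, 1]]
A_inv = inverse_2x2(A)
print(A_inv)
```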


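Apart from this method of finding the inverse matrix by the cofactors, it can also be done by solving a suitable set of equations. The equations are obtained by considering the entries of $A^{-1}$ as unknowns in the matrix equation $AA^{-1} = I$:

$$\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix} \cdot \begin{pmatrix} x_{11} & \cdots & x_{1n} \\ \vdots & & \vdots \\ x_{n1} & \cdots & x_{nn} \end{pmatrix} = \begin{pmatrix} 1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & 1 \end{pmatrix}$$

After multiplication of the two matrices on the left one obtains a system of $n^2$ linear equations to determine the $n^2$ unknowns $x_{ij}$. The solution of these equations by Cramer's rule gives the cofactor formula above for $A^{-1}$.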


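It is also possible to compute the inverse $A^{-1}$ by considering the system of n equations in the 2n variables $x_1, \ldots, x_n, y_1, \ldots, y_n$:

$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= y_1 \\ &\ \ \vdots \\ a_{n1}x_1 + \cdots + a_{nn}x_n &= y_n \end{aligned}$$

This system can be solved for $x_1, \ldots, x_n$ by Cramer's rule or by Gauss's algorithm:

$$\begin{aligned} x_1 &= b_{11}y_1 + \cdots + b_{1n}y_n \\ &\ \ \vdots \\ x_n &= b_{n1}y_1 + \cdots + b_{nn}y_n \end{aligned}$$

The matrix $B = (b_{ij})$ of the coefficients on the right-hand side is then the inverse of A. This method requires fewer equations.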
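One common way to organize this elimination is the Gauss-Jordan scheme on the augmented array [A | I], in which the identity columns keep track of the coefficients of $y_1, \ldots, y_n$. The Python sketch below is one possible arrangement, not necessarily the one intended by the text; the function name `inverse` and the test matrix are illustrative.

```python
from fractions import Fraction

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
    n = len(a)
    # Augment each row of A with the matching row of the identity matrix.
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]          # scale the pivot to 1
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                # Subtract a multiple of the pivot row to clear the column.
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    # The right-hand half now holds the coefficients b_ij of the inverse.
    return [row[n:] for row in m]

A = [[2, 1, 0], [1, 1, 0], [0, 0, 2]]
B = inverse(A)
print(B)
```

Exact fractions avoid rounding, so a zero pivot really does indicate a singular matrix in this sketch.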


Representation of linear maps by matrices 


The operations on matrices show remarkable similarities to those on linear maps. This is true 
not only of the relatively simple operations of addition and multiplication by scalars, but also of 
the multiplication of the linear maps and matrices themselves, in particular, the conditions for the 
existence of the product, inverse etc. These similarities, already emphasized by a similar terminology, 
are not accidental. Indeed, the importance of matrices lies to a large extent in the fact that they 
can be used to describe linear maps numerically. This aspect subsumes the use of matrices in describing systems of linear equations. A linear map A from an n-dimensional vector space V to an m-dimensional vector space V' can be represented by an m × n-matrix in the following way: If $x_1, \ldots, x_n$ and $y_1, \ldots, y_m$ are bases of V and V', respectively, then the images of $x_1, \ldots, x_n$ can be expressed in terms of the basis $y_1, \ldots, y_m$:

$$\begin{aligned} A(x_1) &= a_{11}y_1 + \cdots + a_{m1}y_m \\ &\ \ \vdots \\ A(x_n) &= a_{1n}y_1 + \cdots + a_{mn}y_m \end{aligned} \qquad\text{or}\qquad A(x_j) = \sum_{i=1}^{m} a_{ij}y_i \quad\text{for } j = 1, \ldots, n.$$

Now the linear map A is completely determined by the images $A(x_i)$ of the basis vectors $x_1, \ldots, x_n$; for an arbitrary vector $x = a_1x_1 + \cdots + a_nx_n$ of V then has the image $A(x) = A(a_1x_1 + \cdots + a_nx_n) = a_1A(x_1) + \cdots + a_nA(x_n)$. Thus, the linear map is completely characterized by the m · n numbers $a_{ij}$.

It turns out to be more convenient to use as the matrix representing A the transposed of the coefficient matrix above (the matrix is set in bold here to distinguish it from the map):

$$A \to \mathbf{A} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$

The jth column of the matrix is simply the set of coordinates of $A(x_j)$ with respect to the basis $y_1, \ldots, y_m$. It is important to remember that the choice of the matrix representing A depends on the choice of bases in V and V'.

If the bases in V and V' are fixed, then the correspondence between linear maps and matrices has the following properties:

If $A \to \mathbf{A}$ and $B \to \mathbf{B}$, then $A + B \to \mathbf{A} + \mathbf{B}$ and $a \cdot A \to a \cdot \mathbf{A}$.

Under the same condition there is a unique m × n-matrix associated with every linear map, and vice versa. These statements are summarized in the following theorem:


The vector space of linear maps from V to V' is isomorphic to the vector space of m × n-matrices.


A similar fact emerges for multiplication. If A is a linear map from V' to V'' and B a linear map from V to V', then by choosing bases in V, V', and V'' matrices are associated with A and B, and it can be shown that:

If $A \to \mathbf{A}$ and $B \to \mathbf{B}$, then $A \cdot B \to \mathbf{A} \cdot \mathbf{B}$.

The representation of linear maps by matrices is completely analogous to the representation of 
vectors by n-tuples with respect to a basis. The coordinates of a linear map are only arranged in a 
special way to make up a matrix. The analogies in the operations now appear as consequences of 
the fact that the operations for matrices are defined to correspond to the operations on the linear 
maps they represent. 




If a linear map from V to V' is represented by the m × n-matrix $A = (a_{ij})$ with respect to fixed bases in V and V', then the vector equation $A(x) = x_0$ can be solved. Here it is required to find all vectors x in V that are mapped by A to a given vector $x_0$ of V'.

If $b_1, \ldots, b_m$ are the coefficients of $x_0$ in V', then the problem of finding the coordinates $x_1, \ldots, x_n$ of such a vector x is simply that of solving the system of equations

$$\begin{aligned} a_{11}x_1 + \cdots + a_{1n}x_n &= b_1 \\ &\ \ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n &= b_m \end{aligned}$$

This shows the connection between the equations given by linear maps and systems of linear equations. If the system is written in matrix form,

$$\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix} \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix},$$

it becomes evident that it is merely the vector equation $A(x) = x_0$ in coordinates. Here the coordinates of x and $x_0$ are written as matrices with a single column.


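Representations of linear transformations. To associate a matrix with a linear transformation of an n-dimensional vector space V, it is sufficient to choose a basis $x_1, \ldots, x_n$. From the equations

$$\begin{aligned} A(x_1) &= a_{11}x_1 + \cdots + a_{n1}x_n \\ &\ \ \vdots \\ A(x_n) &= a_{1n}x_1 + \cdots + a_{nn}x_n \end{aligned} \qquad\text{or}\qquad A(x_j) = \sum_{i=1}^{n} a_{ij}x_i \quad\text{for } j = 1, \ldots, n$$

one obtains the matrix representing the transformation:

$$A \to \mathbf{A} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}$$

Linear transformations are always represented by square matrices.


If a linear transformation is regular, then so is the matrix representing it, and vice versa. The inverse transformation is represented by the inverse matrix.


If $A \to \mathbf{A}$, then $A^{-1} \to \mathbf{A}^{-1}$.
If $A \to \mathbf{A}$ and $B \to \mathbf{B}$, then $A + B \to \mathbf{A} + \mathbf{B}$; $a \cdot A \to a \cdot \mathbf{A}$; and $A \cdot B \to \mathbf{A} \cdot \mathbf{B}$.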


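Example 8: If A is the transformation already mentioned repeatedly that is obtained by rotating the plane about the origin O through the angle $\varphi$, and if $x_1$ and $x_2$ are perpendicular vectors of length 1, then their representatives in the point O are mapped to representatives of their images $A(x_1)$ and $A(x_2)$ (Fig.). Obviously the following equations hold for $A(x_1)$ and $A(x_2)$:

$$\begin{aligned} A(x_1) &= \cos\varphi\, x_1 + \sin\varphi\, x_2 \\ A(x_2) &= -\sin\varphi\, x_1 + \cos\varphi\, x_2 \end{aligned} \qquad A \to \mathbf{A} = \begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix}$$

The operator $A^{-1}$ is simply the rotation through $\varphi$ in the opposite direction, that is, a rotation through $-\varphi$:

$$A^{-1} \to \mathbf{A}^{-1} = \begin{pmatrix} \cos(-\varphi) & -\sin(-\varphi) \\ \sin(-\varphi) & \cos(-\varphi) \end{pmatrix} = \begin{pmatrix} \cos\varphi & \sin\varphi \\ -\sin\varphi & \cos\varphi \end{pmatrix}$$

Fig. 17.5-1 Rotation of a plane about the fixed point O through the angle $\varphi$

Example 9: If I is the identity transformation on V and $x_1, \ldots, x_n$ is any basis of V, then

$$\begin{aligned} I(x_1) &= x_1 = 1 \cdot x_1 + 0 \cdot x_2 + \cdots + 0 \cdot x_n \\ &\ \ \vdots \\ I(x_n) &= x_n = 0 \cdot x_1 + 0 \cdot x_2 + \cdots + 1 \cdot x_n \end{aligned} \qquad I \to \mathbf{I} = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$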


For any basis the identity transformation is represented by the identity matrix. 
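The rule that the jth column of the representing matrix holds the coordinates of the image of the jth basis vector can be tried out on the rotation of the plane. The Python sketch below is an illustration only; the names `rotation_matrix` and `mat_mul` and the angle $\pi/6$ are assumptions made here.

```python
import math

def rotation_matrix(phi):
    """Matrix of the rotation through phi, built column by column."""
    image_x1 = (math.cos(phi), math.sin(phi))     # coordinates of A(x1)
    image_x2 = (-math.sin(phi), math.cos(phi))    # coordinates of A(x2)
    # Column j of the matrix is the coordinate tuple of A(x_j).
    return [[image_x1[0], image_x2[0]],
            [image_x1[1], image_x2[1]]]

def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

phi = math.pi / 6
R = rotation_matrix(phi)
R_back = rotation_matrix(-phi)      # rotation in the opposite direction
product = mat_mul(R, R_back)        # should be the identity up to rounding
print(product)
```

Because floating-point numbers are used, the product agrees with the identity matrix only up to rounding error.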


In general the matrix A representing a linear transformation A depends on the choice of basis.

If $x_1, \ldots, x_n$ and $x'_1, \ldots, x'_n$ are two bases of V, then, say,
$A \to \mathbf{A}$ with respect to the basis $x_1, \ldots, x_n$, and
$A \to \mathbf{A}'$ with respect to the basis $x'_1, \ldots, x'_n$.

A new linear transformation C can now be defined by means of the two bases: $C(x_1) = x'_1, \ldots, C(x_n) = x'_n$, that is, C is the transformation taking one basis to the other. If the transformation C
is represented by the matrix C with respect to the basis $x_1, \ldots, x_n$, then the relation $A' = C^{-1}AC$
holds. This is the rule of transformation for matrices representing the same operator with respect
to different bases. Matrices for which the above relation holds are called similar. A natural question 
in this context is that of the existence of a basis for which the matrix representing a given trans- 
formation is as simple as possible. This is the problem of finding normal forms for transformations 
and is closely connected with the theory of eigenvalues (see Transformation to principal axes in 
Eigenvalues). 




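Change of coordinates. If in a vector space V two bases $\mathbf{x}_1, \ldots, \mathbf{x}_n$ and $\mathbf{y}_1, \ldots, \mathbf{y}_n$ are given, then a vector $\mathbf{x}$ has coordinates with respect to each of these bases:

$$\mathbf{x} = x_1\mathbf{x}_1 + \cdots + x_n\mathbf{x}_n = y_1\mathbf{y}_1 + \cdots + y_n\mathbf{y}_n.$$

The change from one coordinate system to the other is described by the equations:

$$\begin{aligned} \mathbf{y}_1 &= a_{11}\mathbf{x}_1 + \cdots + a_{n1}\mathbf{x}_n \\ &\ \ \vdots \\ \mathbf{y}_n &= a_{1n}\mathbf{x}_1 + \cdots + a_{nn}\mathbf{x}_n \end{aligned} \qquad\text{or}\qquad \mathbf{y}_j = \sum_{i=1}^{n} a_{ij}\mathbf{x}_i \quad\text{for } j = 1, \ldots, n.$$

The coordinates $x_1, \ldots, x_n$ and $y_1, \ldots, y_n$ now satisfy the relations

$$x_j = \sum_{i=1}^{n} a_{ji}y_i \quad\text{for } j = 1, \ldots, n.$$

The inverse formulae are obtained by going from the matrix $A = (a_{ij})$ to its inverse $A^{-1} = (q_{ij})$.


The difference in the way they are transformed is expressed by calling the transformation of the coordinates contragredient to that of the bases. For if the basis $\mathbf{y}_1, \ldots, \mathbf{y}_n$ is represented by the matrix A with respect to the basis $\mathbf{x}_1, \ldots, \mathbf{x}_n$, then the coordinates of $\mathbf{x}$ with respect to the basis $\mathbf{y}_1, \ldots, \mathbf{y}_n$ are obtained from those with respect to $\mathbf{x}_1, \ldots, \mathbf{x}_n$ by the contragredient matrix $(A^{-1})^T$, as is shown by the equations above.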

For orthonormal bases in a Euclidean vector space the transformation matrix is orthogonal 
and hence equal to its contragredient. In this particular case coordinates are transformed in the 
same way as bases. 


The rank of a matrix. For any m × n-matrix A one can determine the maximal number of linearly independent columns or rows by considering the rows and columns as elements of $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively. These two numbers are always equal and are called the rank of the matrix. If the matrix A represents the linear map A, then the rank of the matrix is the same as the rank of the map. The rank can be computed by using the following facts:


The rank of a matrix remains unchanged if 1. a multiple of one row (column) is added to another 
row (column), or 2. rows (or columns) are interchanged. 


By using these rules a matrix can be brought into a form in which only entries with the same 
row and column index can be different from zero. The rank of A is the number of such non-zero entries. 
This method is very similar to Gauss's algorithm of solving systems of linear equations. For a
square matrix it is sufficient to transform it to triangular form in which all the entries below
(or above) the main diagonal are zero. If this is done so that as many diagonal elements as possible 
are non-zero, then again the number of such elements is the rank. 
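The two rank-preserving operations lead directly to a small elimination routine. The Python sketch below uses row operations only and exact fractions; the function name `rank` and the sample matrices are illustrative assumptions, and other arrangements of the elimination are equally possible.

```python
from fractions import Fraction

def rank(a):
    """Rank of a matrix (list of rows) by row elimination."""
    m = [[Fraction(x) for x in row] for row in a]
    rows, cols = len(m), len(m[0])
    r = 0                                   # number of pivots found so far
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]     # interchange two rows
        for i in range(r + 1, rows):
            f = m[i][c] / m[r][c]
            # Add a multiple of the pivot row to a lower row.
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == rows:
            break
    return r

full = rank([[1, 1, 0], [-3, -3, 1], [-2, 1, 0]])
deficient = rank([[1, 2], [2, 4]])
wide = rank([[1, 2, 3], [4, 5, 6]])
print(full, deficient, wide)
```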


Example 10: The matrix A is transformed into $A_1$ by adding the second column to the first
and third. By subtracting three times the first from the third one obtains $A_2$. Interchanging the first
two columns gives $A_3$. The rank of A is 2.


Sat has eae 0 1 0) 010 100 
i = eee sy ea oe ye pe | 
‘ Gris ) As hers : kal : ae) 


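Example 11: In the initial matrix A, three times the first row is added to the second, and twice the first row to the third. In the transformed matrix the second and third column are interchanged. The rank of A is 3.

$$A = \begin{pmatrix} 1 & 1 & 0 \\ -3 & -3 & 1 \\ -2 & 1 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 3 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{pmatrix}$$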


Special types of matrices. Corresponding to the special types of linear transformations there are special types of matrices. If V is a Euclidean vector space and if a transformation A is represented by the matrix A with respect to an orthonormal basis, then the adjoint transformation A* is represented by the transposed matrix $A^T$. Hence symmetric transformations, for which A = A*, are represented by symmetric matrices, for which $A = A^T$.




Of particular importance are orthogonal matrices, because they transform orthonormal bases into one another. Expressed in terms of coordinates this says that: The coordinates with respect to one rectangular coordinate system are transformed into those with respect to another by means of an orthogonal matrix. A matrix A is orthogonal if $A^T = A^{-1}$. This equation can also be written in the form $A \cdot A^T = I$ and interpreted thus: In an orthogonal matrix the inner product of different rows is zero, the inner product of a row with itself is one. The same statements are true for the columns of A, and either set is a sufficient condition for the matrix to be orthogonal. For instance, every 2 × 2 orthogonal matrix can be written in the form:

$$\begin{pmatrix} \cos\varphi & -\sin\varphi \\ \sin\varphi & \cos\varphi \end{pmatrix} \qquad\text{or}\qquad \begin{pmatrix} \cos\varphi & \sin\varphi \\ \sin\varphi & -\cos\varphi \end{pmatrix}.$$


In the first case the matrix represents a rotation of the plane through the angle $\varphi$, in the second case there is an additional reflection in a line. Matrices of the second type can be distinguished from those of the first by the fact that their determinant is −1, whereas the determinant of a rotation is always +1. In general, the determinant of an orthogonal matrix is always +1 or −1. If an orthogonal matrix has determinant +1, it is sometimes called proper; proper orthogonal matrices correspond to orientation-preserving orthogonal transformations of a Euclidean vector space. The following matrices are of this type:

$$A_{12}(\varphi) = \begin{pmatrix} \cos\varphi & -\sin\varphi & 0 \\ \sin\varphi & \cos\varphi & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad A_{13}(\psi) = \begin{pmatrix} \cos\psi & 0 & -\sin\psi \\ 0 & 1 & 0 \\ \sin\psi & 0 & \cos\psi \end{pmatrix}, \quad A_{23}(\vartheta) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\vartheta & -\sin\vartheta \\ 0 & \sin\vartheta & \cos\vartheta \end{pmatrix}$$

Here $\varphi$, $\psi$, and $\vartheta$ are arbitrary angles. If a fixed order of the basis vectors $e_1, e_2, e_3$ is chosen, then $A_{12}(\varphi)$ represents a rotation of space about the $e_3$-axis. The $e_1, e_2$-plane is rotated through $\varphi$ while $e_3$ is left unchanged. This fact gives rise to the special form of the matrix. Every proper orthogonal 3 × 3 matrix A can be written as a product $A = A_{23}(\vartheta) \cdot A_{13}(\psi) \cdot A_{12}(\varphi)$ for suitable choices of $\varphi$, $\psi$, and $\vartheta$.

Just as for the orthogonal transformations, so the set of orthogonal n × n matrices forms a group. The proper orthogonal matrices form a subgroup of this group.
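That a product of the three elementary rotations is again a proper orthogonal matrix can be checked numerically. The Python sketch below is an illustration under floating-point arithmetic; the function names `a12`, `a13`, `a23`, `det3` and the three angles are arbitrary choices made here.

```python
import math

def a12(phi):
    c, s = math.cos(phi), math.sin(phi)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def a13(psi):
    c, s = math.cos(psi), math.sin(psi)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def a23(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = mat_mul(a23(0.3), mat_mul(a13(-1.1), a12(2.0)))
gram = mat_mul(A, transpose(A))    # equals I for an orthogonal matrix
print(gram, det3(A))
```

The rows of `gram` are the inner products of the rows of A with one another, so the check mirrors the row condition for orthogonality.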


17.6. Eigenvalues 


Eigenvalues and eigenvectors. A number $\lambda$ is called an eigenvalue (or characteristic value) of a linear transformation A if there exists a vector $x \neq o$ such that $A(x) = \lambda \cdot x$. The vector x is then called an eigenvector of the transformation A belonging to $\lambda$. The eigenvectors belonging to $\lambda$ together with the null vector form a subspace, called an eigenspace of A.

If the equation $A(x) = \lambda \cdot x$ is rewritten in the form $(A - \lambda I)(x) = o$, then it can be stated that: a number $\lambda$ is an eigenvalue of A if and only if the transformation $A - \lambda I$ is singular.


In this formulation it is possible to define an eigenvalue in terms of a matrix A representing the transformation A: A number $\lambda$ is an eigenvalue of the matrix A if $A - \lambda I$ is singular.


Example 1: Let A be a singular transformation; then there exists a non-zero vector x such that $A(x) = o = 0 \cdot x$. Hence $\lambda = 0$ is an eigenvalue of A, and the non-zero vectors of the kernel are the eigenvectors belonging to 0.


Example 2: Suppose that the matrix A representing the operator A with respect to a basis $x_1, \ldots, x_n$ is diagonal:

$$A = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{pmatrix}$$

Then the basis vectors are all eigenvectors of A, since $A(x_i) = \lambda_i x_i$. Such transformations are particularly easy to describe, because they change the basis vectors only by multiplying them by scalars. They are called diagonal (or diagonalizable) transformations. Every transformation of an n-dimensional space with n distinct eigenvalues is diagonalizable.


The significance of eigenvalues in physics. Eigenvalue problems are important in many branches 
of physics. They make it possible to find coordinate systems in which the transformations in question 
take on their simplest forms. In mechanics for instance, the principal moments of a rigid body 
are found with the help of the eigenvalues of the symmetric matrix representing its inertia tensor. 
The situation is similar in the mechanics of continua, where the rotations and deformations of a 
body in the principal directions are found with the help of the eigenvalues of a symmetric matrix. 
Eigenvalues are of central importance in quantum mechanics, in which the measured values of
physical ‘observables’ appear as the eigenvalues of certain operators. The term ‘transformation’
is used predominantly in pure mathematical (geometrical) context, whereas ‘operator’ is more 
customary in applications (physics, technology). 


Computation of eigenvalues and eigenvectors. If a basis in a vector space V is chosen, then the equation $(A - \lambda I)(x) = o$ is represented by the following system of equations for the coordinates $x_1, \ldots, x_n$ of x:

$$\begin{aligned} (a_{11} - \lambda)x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0 \\ a_{21}x_1 + (a_{22} - \lambda)x_2 + \cdots + a_{2n}x_n &= 0 \\ &\ \ \vdots \\ a_{n1}x_1 + a_{n2}x_2 + \cdots + (a_{nn} - \lambda)x_n &= 0. \end{aligned}$$

The coefficient matrix is the matrix $A - \lambda I$ representing the transformation $A - \lambda I$. Since only non-zero vectors can be eigenvectors, the problem is to find non-zero solutions of this homogeneous system. A necessary and sufficient condition for the existence of such solutions is that the determinant of the matrix of coefficients should vanish: $\det (A - \lambda I) = 0$. This is the case if and only if $A - \lambda I$ is singular, that is, if $\lambda$ is an eigenvalue of A. The determinant can be seen to be a polynomial of degree n in $\lambda$:

$$\det (A - \lambda I) = a_0 + a_1\lambda + \cdots + a_n\lambda^n.$$

This is called the characteristic polynomial of the matrix A. If A' is another matrix representing A, then $A' = C^{-1}AC$ for some matrix C, and its associated polynomial is the same:

$$\det (A' - \lambda I) = \det (C^{-1}AC - \lambda I) = \det (C^{-1}(A - \lambda I)C) = \det (A - \lambda I).$$

To find an eigenvector x one must therefore first find a root of the characteristic polynomial of A. The coordinates $x_1, x_2, \ldots, x_n$ of x can then be found as a non-trivial solution of the homogeneous system given above.


Example 3: For n = 2 and $A = \begin{pmatrix} 2 & 3 \\ -1 & -2 \end{pmatrix}$ the eigenvalues are roots of the equation:

$$\det (A - \lambda I) = \begin{vmatrix} 2 - \lambda & 3 \\ -1 & -2 - \lambda \end{vmatrix} = \lambda^2 - 1 = 0.$$

Thus, they are +1 and −1. The coordinates $x_1, x_2$ of the eigenvectors belonging to the eigenvalue +1 are the solutions of the system:

$$\begin{aligned} x_1 + 3x_2 &= 0 \\ -x_1 - 3x_2 &= 0 \end{aligned} \quad\Rightarrow\quad (x_1, x_2) = t \cdot (-3, 1),$$

where t is an arbitrary non-zero number. In general, eigenvectors are determined only up to scalar multiples.
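For a 2 × 2 matrix the characteristic polynomial is $\lambda^2 - (a_{11} + a_{22})\lambda + \det A$, so its real roots follow from the quadratic formula. The Python sketch below is a minimal illustration of this special case; the function name `eigenvalues_2x2` is an assumption made here, and the approach does not extend to larger matrices.

```python
import math

def eigenvalues_2x2(a):
    """Real eigenvalues of a 2 x 2 matrix from its characteristic polynomial."""
    (a11, a12), (a21, a22) = a
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = trace * trace - 4 * det      # discriminant of the quadratic
    if disc < 0:
        return []                       # no real eigenvalues
    root = math.sqrt(disc)
    return sorted({(trace - root) / 2, (trace + root) / 2})

A = [[2, 3], [-1, -2]]
values = eigenvalues_2x2(A)

# Check that v = (-3, 1) satisfies A v = 1 * v.
v = (-3, 1)
image = tuple(sum(A[i][k] * v[k] for k in range(2)) for i in range(2))
print(values, image)
```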


The transformation to principal axes 


For symmetric transformations the theory leads to a particularly simple result. All the eigenvalues 
of a symmetric transformation are real and there exists an orthonormal basis of eigenvectors. If 
A is represented by the matrix A, this means that there exists an orthogonal matrix C such that $A' = C^{-1}AC$ is diagonal, with the eigenvalues on the main diagonal. A' is called the normal form of A and the change of basis represented by C is called the transformation to principal axes. The matrix C is the matrix of the coordinates of an orthonormal basis of eigenvectors with respect to the basis under which A is represented by the matrix A.

Example 4: For $A = \begin{pmatrix} 3 & -1 \\ -1 & 3 \end{pmatrix}$ the eigenvalues are +2 and +4. The eigenvectors belonging to +2 are $(x_1, x_2) = \tau_1 \cdot (1, 1)$ and the eigenvectors belonging to +4 are $(x_1, x_2) = \tau_2 \cdot (-1, 1)$. The numbers $\tau_1$ and $\tau_2$ can be chosen to give the vectors the length 1. The eigenvectors $(1/\sqrt{2}, 1/\sqrt{2})$ and $(-1/\sqrt{2}, 1/\sqrt{2})$ form an orthonormal basis, and C is the matrix

$$C = \begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}, \qquad C^{-1}AC = \begin{pmatrix} 2 & 0 \\ 0 & 4 \end{pmatrix}.$$


By means of the transformation to principal axes the equation of centred conics or quadrics 
can be considerably simplified, by changing the Cartesian coordinate system to one consisting of 
symmetry axes of the curve, or surface. These are the principal axes of the figure, which explains 
the name transformation to principal axes. 


Example 5: In the equation $ax^2 + 2bxy + cy^2 = d$ of a curve the coefficients on the left-hand side are arranged in a symmetric matrix $A = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$. Under a coordinate transformation to new rectangular coordinates (x', y') by an orthogonal matrix $C = (c_{ij})$ the matrix A is transformed to $A' = C^TAC = C^{-1}AC$. Thus, by choosing a suitable matrix C, A' can be made diagonal:

$$\begin{aligned} x' &= c_{11}x + c_{21}y \\ y' &= c_{12}x + c_{22}y \end{aligned} \qquad C = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}, \qquad A' = C^{-1}AC = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}.$$

This means that in the new coordinate system the curve is described by the equation $\lambda_1 x'^2 + \lambda_2 y'^2 = d$. For example, let $3x^2 - 2xy + 3y^2 = 2$ be the equation of a curve. The transformation matrix C of the corresponding symmetric matrix A was found in Example 4:

$$\begin{aligned} x &= \tfrac{1}{\sqrt{2}}x' - \tfrac{1}{\sqrt{2}}y' \\ y &= \tfrac{1}{\sqrt{2}}x' + \tfrac{1}{\sqrt{2}}y' \end{aligned} \qquad \begin{aligned} x' &= \tfrac{1}{\sqrt{2}}x + \tfrac{1}{\sqrt{2}}y \\ y' &= -\tfrac{1}{\sqrt{2}}x + \tfrac{1}{\sqrt{2}}y \end{aligned}$$

The coefficients of the last two equations are the entries of $C^{-1} = C^T$. If the expressions for x and y are substituted in the equation, the resulting equation for the curve in the new coordinate system is $2x'^2 + 4y'^2 = 2$.

The matrix C describes a rotation of the plane about the origin through an angle of $\pi/4$, which takes the old coordinate axes into the new ones (Fig.).

Fig. 17.6-1 Transformation to principal axes: $3x^2 - 2xy + 3y^2 = 2x'^2 + 4y'^2 = 2$
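The transformation to principal axes for the curve $3x^2 - 2xy + 3y^2 = 2$ can be verified numerically. The Python sketch below assumes the rotation through $\pi/4$ as the matrix C; the helper names and the sample point (0.7, −0.2) are illustrative choices, and the comparison is made up to floating-point rounding.

```python
import math

def mat_mul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

s = 1 / math.sqrt(2)
A = [[3, -1], [-1, 3]]          # symmetric matrix of the quadratic form
C = [[s, -s], [s, s]]           # columns: unit eigenvectors for 2 and 4
A_new = mat_mul(transpose(C), mat_mul(A, C))   # C^T A C

def old_form(x, y):
    return 3 * x * x - 2 * x * y + 3 * y * y

def new_form(xp, yp):
    return 2 * xp * xp + 4 * yp * yp

# Pick new coordinates, convert to the old ones, and compare both forms.
xp, yp = 0.7, -0.2
x, y = s * (xp - yp), s * (xp + yp)
print(A_new, old_form(x, y), new_form(xp, yp))
```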


17.7. Multilinear algebra 


The principal object of multilinear algebra is the investigation of multilinear forms, which are 
generalizations of linear forms. A multilinear form on a vector space V is a function that associates 
with any r vectors a number and is linear with respect to each variable. This means that if any 
r — 1 vectors are fixed, the mapping so defined is linear in the last vector. 


Bilinear forms. If r = 2 the form is called bilinear. An example of a bilinear form is the inner product of vectors. If a basis of the space V is chosen, the bilinear form can be expressed in coordinates; for example, in the case of a two-dimensional space the general expression for a bilinear form is

$$B(x, y) = a_{11}x_1y_1 + a_{12}x_1y_2 + a_{21}x_2y_1 + a_{22}x_2y_2.$$

If one puts x = y, one obtains a quadratic form $a_{11}x_1^2 + a_{12}x_1x_2 + a_{21}x_2x_1 + a_{22}x_2^2$. The most
important problem in the theory of quadratic forms is to express the given form in the simplest 
possible way, for instance, in a form without mixed terms. This can always be done by means of the 
transformation to principal axes. 
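The coordinate expression of a bilinear form and the quadratic form derived from it can be written out in a few lines of Python. This is a sketch only; the names `bilinear` and `quadratic`, the coefficient matrix and the vectors are assumptions chosen for illustration.

```python
def bilinear(a, x, y):
    """B(x, y) as the sum of a[i][j] * x[i] * y[j] over all i, j."""
    return sum(a[i][j] * x[i] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def quadratic(a, x):
    """Quadratic form obtained by putting y = x."""
    return bilinear(a, x, x)

a = [[3, -1], [-1, 3]]
x, y, z = (1, 2), (4, -1), (0, 5)

# Linearity in the first argument: B(x + z, y) = B(x, y) + B(z, y).
x_plus_z = (x[0] + z[0], x[1] + z[1])
left = bilinear(a, x_plus_z, y)
right = bilinear(a, x, y) + bilinear(a, z, y)
print(left, right, quadratic(a, x))
```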


Tensors. The coefficients of a bilinear form exhibit a regular behaviour under transformations 
of that form, which is characteristic of tensor coordinates. By generalizing the concept of a vector 
space in linear algebra one defines tensor spaces, whose elements are then called tensors. 


Applications. Tensor algebra, the investigation of tensor spaces, has an important application 
in differential geometry. There the curvature of a surface or of a space is described by a tensor, 
the curvature tensor. In the theory of relativity the impossibility of separating the energy and momentum
of a particle is reflected by the existence of a tensor whose components are the energy and the components
of the momentum, the so-called energy-momentum tensor. Tensors are also useful in other areas
of physics, for instance, in crystal optics and elasticity theory. Thus, the deformation or tension
of an elastic medium is described by the deformation or tension tensor.

The theory of bilinear and quadratic forms is used in analytic geometry to arrive at the standard 
classification of conics and quadrics. It is also used in physics, particularly in the description of 
physical systems subject to small vibrations. 




18. Sequences, series, limits 


18.1. Sequences
18.2. Series
18.3. Limit of a function - Continuity
      Limit of a function
      Some important limits
      The rule of Bernoulli and de l'Hospital
      Continuity of a function


18.1. Sequences 


From every non-empty set S of real numbers sequences can be selected by choosing from S in succession a first number $a_1$, a second number $a_2$, a third number $a_3$, and so on, and by considering $a_1$ to be the first term of the sequence, $a_2$ the second, $a_3$ the third, and so on. For example, if from the set of positive integers one selects in their natural order the numbers that are divisible by 2, one obtains the sequence of even numbers, whose first five terms are 2, 4, 6, 8, 10. In the formation of sequences an element of S may be chosen more than once, as for example the number 2 in the sequence 2, 4, 2, 6, 2, 8, 2, 10. If the same number a is always chosen, one obtains a constant sequence a, a, a, ...

A finite sequence consists of finitely many, say N, terms; $a_N$ is then its last term. The sequence 2, 4, 2, 6, 2, 8, 2, 10 defined above is a finite sequence of eight terms; $a_8 = 10$ is its last term. On the other hand, the sequence of even numbers has no last term, because every term is followed by another one. Such sequences are called infinite.


An infinite sequence is given when to every natural number $n \ge 1$ there corresponds exactly one real number $a_n$; $a_n$ is called the nth term of the sequence. If this correspondence exists only for each natural number n between 1 and N ($1 \le n \le N$), then one obtains a finite sequence.


A tabulated representation of this correspondence, for example,

Term number n              | 1 | 2 | 3 | 4 | 5
Term $a_n$ of the sequence | 2 | 4 | 6 | 8 | 10

for the sequence of even numbers, shows that one can regard every sequence as a set of ordered pairs of numbers $(n, a_n)$ whose first component n is a natural number, and whose second component, the term $a_n$, is a real number. Since the correspondence is single-valued, sequences can also be defined as functions.


Sequences are functions whose domain of definition is a set of natural numbers and whose range 
consists of real numbers. 


Of course, the plausible graphical representation of a sequence, for example, by the sequence of discrete points with the coordinates $(n, a_n)$ in a Cartesian coordinate system, or a tabulated representation, is as unsuitable for the complete description of an infinite sequence as is the enumeration of some of the initial terms of the sequence. For instance, the terms $a_1 = 2$, $a_2 = 3$, $a_3 = 5$ can be continued in a sequence in many, even in infinitely many ways. Examples of such continuations are the sequence of prime numbers, the finite sequence of all the factors of 210, or the sequence 2, 3, 5, 8, 13, 21, ..., in which the kth term for k > 2 is the sum of the two preceding terms.

For the complete description of an infinite sequence one tries, therefore, to represent the unique correspondence between the term number n and the corresponding term $a_n$ of the sequence by a defining law. In most cases it is possible to state the defining law by means of an analytical expression $a_n = f(n)$, n = 1, 2, 3, ... One can then denote the sequence $a_1, a_2, a_3, \ldots$ by $\{a_n\} = \{f(n)\}$.


Examples of sequences whose defining law can be stated by means of an analytical expression.

1. The sequence 2, 4, 6, ... of even numbers has the defining law $a_n = 2n$.

2. The sequence 1, 4, 9, ... of perfect squares: $\{a_n\} = \{n^2\}$.

3. The seventh term of the sequence $\{a_n\} = \{n/(n + 1)\}$ is obtained by substituting n = 7 in the analytical expression to give $a_7 = 7/(7 + 1) = 7/8$.

4. The sequence $\{a_n\} = \{2^n\}$ for $1 \le n \le 10$ is finite; its last term is $a_{10} = 2^{10} = 1024$.

5. The defining law $a_n = (-1)^{n+1} n$ leads to the sequence 1, −2, 3, −4, 5, −6, ... This sequence is alternating, that is, neighbouring terms have opposite signs. This example also shows that an infinite sequence does not necessarily have a largest or a smallest term.
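A defining law $a_n = f(n)$ corresponds to an ordinary function of the term number, which makes the examples above easy to reproduce. The Python sketch below is illustrative; the helper name `first_terms` is an assumption made here, and only finitely many terms can of course be listed.

```python
from fractions import Fraction

def first_terms(f, count):
    """List a_1, ..., a_count for the defining law a_n = f(n)."""
    return [f(n) for n in range(1, count + 1)]

even = first_terms(lambda n: 2 * n, 5)
squares = first_terms(lambda n: n ** 2, 3)
seventh = Fraction(7, 7 + 1)                       # a_7 for a_n = n/(n+1)
powers = first_terms(lambda n: 2 ** n, 10)         # finite sequence, 10 terms
alternating = first_terms(lambda n: (-1) ** (n + 1) * n, 6)
print(even, squares, seventh, powers[-1], alternating)
```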


Sometimes the defining law of a sequence can be given by means of a recurrence relation, from which a term $a_n$ can be calculated only when the preceding terms $a_i$ with i < n are already known. For example, the sequence 0, 1, 1, 2, 3, 5, 8, 13, 21, ... of the Fibonacci numbers is defined by $a_1 = 0$, $a_2 = 1$, and for $n \ge 3$ by the recurrence relation $a_n = a_{n-1} + a_{n-2}$.
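A recurrence relation is naturally evaluated by building the terms one after another. The following Python sketch generates the Fibonacci numbers with the starting values 0 and 1; the function name `fibonacci` is an illustrative choice.

```python
def fibonacci(count):
    """First `count` Fibonacci numbers, starting with a_1 = 0 and a_2 = 1."""
    terms = [0, 1][:count]
    while len(terms) < count:
        # Each new term is the sum of the two preceding terms.
        terms.append(terms[-1] + terms[-2])
    return terms

fib = fibonacci(9)
print(fib)
```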




However, there are sequences for which neither an analytical expression nor a recursive law can be given, for example the sequence of prime numbers, or the sequence 3, 1, 4, 1, 5, 9, 2, 6, 5, ... whose nth term is the nth digit of the decimal expansion of the number $\pi$. From the terms of a sequence one can obtain further sequences, for example, from 1, 1/2, 1/3, ..., 1/n, ... the sequence $s_1 = 1$, $s_2 = 1 + 1/2$, $s_3 = 1 + 1/2 + 1/3$, ..., $s_n = 1 + 1/2 + 1/3 + \cdots + 1/n$, ..., whose nth term is the sum of the first n terms of the given sequence. In this case the sequence $s_1, s_2, \ldots, s_n, \ldots$ is given by an indirect rule.
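The sequence of sums of the first n terms can be computed with a running total. The Python sketch below uses `itertools.accumulate` from the standard library and exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import accumulate

# First five terms of the sequence 1, 1/2, 1/3, ...
terms = [Fraction(1, n) for n in range(1, 6)]

# s_n is the sum of the first n terms of the given sequence.
partial_sums = list(accumulate(terms))
print(partial_sums)
```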

The main interest lies in infinite sequences. Of particular interest are properties that follow from 
the relationship between successive terms. 


Monotonic sequences. These are sequences whose terms steadily increase (or steadily decrease) 
with increasing term number (see Chapter 5.). 


Sometimes in this definition equality is also allowed, and a sequence is called monotonic increasing
if a_{n+1} ≥ a_n and monotonic decreasing if a_{n+1} ≤ a_n. To distinguish between them, sequences with
the property a_{n+1} > a_n (a_{n+1} < a_n) are then called strictly monotonic increasing (strictly monotonic
decreasing).

The sequence 1, 1/2, 1/3, ..., 1/n, ... of fractions, for example, is (strictly) monotonic decreasing, 
and the sequence -12, -9, -6, -3, 0, ..., [-12 + 3(n - 1)], ... is (strictly) monotonic increasing.
Most sequences are neither monotonic increasing nor monotonic decreasing, for example, the 
sequence 1, 1/2, 2, 1/3, 3, 1/4, 4, ... 
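The definitions can be tried out on an initial segment of a sequence. The sketch below is only a finite check (a program cannot inspect all terms of an infinite sequence), and the function name `classify_monotonic` is an assumption of this example.

```python
def classify_monotonic(terms):
    """Classify a finite list of terms by comparing each term with its successor."""
    pairs = list(zip(terms, terms[1:]))  # (a_n, a_{n+1}) for consecutive terms
    if all(a < b for a, b in pairs):
        return "strictly increasing"
    if all(a > b for a, b in pairs):
        return "strictly decreasing"
    if all(a <= b for a, b in pairs):
        return "increasing"
    if all(a >= b for a, b in pairs):
        return "decreasing"
    return "not monotonic"


print(classify_monotonic([1, 1 / 2, 1 / 3, 1 / 4]))
print(classify_monotonic([-12 + 3 * (n - 1) for n in range(1, 6)]))
print(classify_monotonic([1, 1 / 2, 2, 1 / 3, 3, 1 / 4, 4]))
```

The strict cases are tested first, so a strictly increasing segment is not reported merely as "increasing".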


Bounded sequences. The sequence -1/2, 0, 1/6, 2/8, ... with the defining law a_n = (n - 2)/(2n) has
the property that none of its terms is greater than 1, and also that none is less than -1/2, so that
the inequality -1/2 ≤ a_n ≤ 1 holds for all n. Such sequences are called bounded.


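A sequence {a_n} is called bounded if there are numbers k and K such that k ≤ a_n ≤ K for all n.
Such a k is called a lower bound and K an upper bound for the sequence. If k ≤ a_n ≤ K for every
term a_n of a sequence, then |a_n| ≤ M = max (|k|, |K|). Conversely, if a bound M exists for the
absolute values |a_n| of the terms, |a_n| ≤ M, then -M ≤ a_n ≤ M, that is, the sequence is bounded.
The definition can therefore also be stated as follows: a sequence {a_n} is bounded if there is a
number M such that |a_n| ≤ M for all n.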


The numbers k, K, M are not uniquely determined. Clearly, if k is a lower bound of the sequence, 
so is every smaller number k’ < k, and if K is an upper bound, so is every greater number K’ > K. 
A finite sequence is always bounded; the smallest term of the sequence can be chosen as a lower 
bound k, and the greatest term as an upper bound K. Infinite sequences can be unbounded, for example, 
the sequence of squares 1, 4, 9, ..., n^2, ... The least upper bound is called the supremum G; every
smaller number G - ε, where ε is arbitrarily small and positive, is exceeded by at least one term a_m
of the sequence {a_n}, that is, a_m > G - ε. Similarly the greatest lower bound is called the infimum g;
every greater number g + ε (ε > 0, arbitrary) exceeds at least one term a_m of the sequence, that is,
a_m < g + ε. It can be shown that every bounded sequence has a uniquely determined supremum
and a uniquely determined infimum. 

These considerations can also be applied generally to number sets, if one replaces ‘sequence’ 
by ‘set’ and ‘term’ by ‘element’. 


Arithmetic sequences. In an arithmetic sequence the difference d between two consecutive terms 
is constant and non-zero: a_n - a_{n-1} = d. For example, the sequence of even numbers 2, 4, 6, 8, ...
has the common difference d = 2. If one chooses d = -3 and the first term a_1 = 25, one obtains
the sequence 25, 22, 19, 16, 13, ... If d is positive, the arithmetic sequence increases monotonically, 
and if d is negative, it decreases monotonically. Every infinite arithmetic sequence is unbounded. 


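The nth term of an arithmetic sequence with first term a_1 and common difference d is
a_n = a_1 + (n - 1) d.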


The name is derived from the fact that every term a_k (k ≥ 2) is the arithmetic mean of its two
neighbouring terms: clearly the mean of a_{k-1} = a_k - d and a_{k+1} = a_k + d is (a_{k-1} + a_{k+1})/2
= (2a_k)/2 = a_k.


Example 1: The arithmetic sequence with the first term a_1 = 33 and the common difference
d = 8 has the 100th term a_100 = a_1 + (n - 1) d = 33 + 99 · 8 = 825.

Example 2: If a_10 = 15 is the 10th term of an arithmetic sequence whose common difference
is 2, then the first term is a_1 = a_n - (n - 1) d = 15 - 9 · 2 = -3.




The process of linear interpolation consists in inserting m further terms between two terms a_k
and a_{k+1} of an arithmetic sequence with the common difference d, so that they again form an
arithmetic sequence. Let d' be the common difference of the required sequence; then

a_{k+1} = a_k + (m + 1) d' = a_k + d, so that d' = d/(m + 1).


Example: To interpolate 6 terms between each pair of terms of the arithmetic sequence [3], [17],
[31], 45, 59, ... Since d = 14, the common difference d' of the new sequence is given by
d' = 14/7 = 2, and this gives the sequence [3], 5, 7, 9, 11, 13, 15, [17], 19, 21, 23, 25, 27, 29,
[31], ... (terms of the original sequence are shown in brackets).
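The rule d' = d/(m + 1) can be sketched as follows. The function name `interpolate_arithmetic` and the use of exact fractions are choices made for this illustration only.

```python
from fractions import Fraction


def interpolate_arithmetic(first, second, m):
    """Insert m terms between two consecutive terms of an arithmetic sequence.

    Returns the m + 2 terms from `first` to `second` inclusive.
    """
    step = (Fraction(second) - Fraction(first)) / (m + 1)  # d' = d/(m + 1)
    return [first + i * step for i in range(m + 2)]


print([int(term) for term in interpolate_arithmetic(3, 17, 6)])
```

For the first pair of terms in the example, 3 and 17 with m = 6, the step is 14/7 = 2 and the interpolated terms are 5, 7, 9, 11, 13, 15.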
From a given sequence the difference sequence is formed by taking the difference between con- 
secutive terms. Thus, an arithmetic sequence can also be described as one whose first difference 
sequence is constant. In practical mathematics and in the calculus of errors and approximations,
arithmetic sequences of higher order, for example, of the nth order, are used. In these the nth
difference sequence Δ^n is the first one that is constant.


Example: The sequence 1, 8, 27, 64, 125, 216, ... is arithmetic of the third order, since its third
difference sequence Δ^3 is constant.

Sequence   1    8    27    64    125    216   ...
Δ^1           7    19    37    61     91   ...
Δ^2             12    18    24    30   ...
Δ^3                 6     6     6   ...
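The difference scheme can be reproduced with a short sketch. The names `differences` and `arithmetic_order` are hypothetical helpers; the order is determined only from the finitely many terms supplied, so the result is an observation about that data rather than a proof for the whole sequence.

```python
def differences(terms):
    """Return the difference sequence formed from consecutive terms."""
    return [b - a for a, b in zip(terms, terms[1:])]


def arithmetic_order(terms):
    """Return the smallest n whose nth difference sequence is constant, or None.

    At least two equal values are required before a row counts as constant.
    """
    order = 0
    current = list(terms)
    while len(current) >= 2:
        if len(set(current)) == 1:
            return order
        current = differences(current)
        order += 1
    return None


cubes = [n ** 3 for n in range(1, 7)]  # 1, 8, 27, 64, 125, 216
print(differences(cubes))
print(arithmetic_order(cubes))
```

For an ordinary arithmetic sequence such as 2, 4, 6, 8 the same function returns 1, in agreement with the statement that its first difference sequence is constant.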


Geometric sequences. In a geometric sequence the ratio q ≠ 1 of two neighbouring terms is
constant: a_n = a_{n-1} q. For example, the sequence 9, 3, 1, 1/3, 1/9, ... has the first term a_1 = 9 and
the common ratio q = 1/3. For a_1 = -1/2, q = -2, one obtains the sequence -1/2, 1, -2, 4, ...
and for a_1 = -24, q = 1/2, the sequence -24, -12, -6, -3, ... If q is positive, all the terms have
the same sign as a_1; if q is negative, the sequence is alternating. Geometric sequences are bounded
if |q| ≤ 1, and are otherwise unbounded. They increase monotonically for a_1 > 0, q > 1, and also
for a_1 < 0, 0 < q < 1. They decrease monotonically for a_1 > 0, 0 < q < 1, and also for a_1 < 0,
q > 1.


The name of the sequence is derived from the fact that every term a_k (k ≥ 2) is numerically equal
to the geometric mean of its two neighbouring terms: for a_{k-1} = a_k/q and a_{k+1} = a_k q have the
geometric mean √[(a_k/q) (a_k q)] = √(a_k^2) = |a_k|.


Example 1: The geometric sequence with the first term a_1 = 2 and the common ratio q = 1/2
has the 10th term a_10 = a_1 q^9 = 2 · (1/2)^9 = 1/256.

Example 2: If the first term of a geometric sequence is a_1 = 2/3 and its 10th term is
a_10 = a_1 q^9 = 13122, then the common ratio is given by q = (a_10/a_1)^(1/9) = (3 · 13122/2)^(1/9) = 3.

Example 3: If a sufficiently large piece of paper of thickness a_0 = 0.1 mm (0.003937 in) is folded
40 times, a layer of paper is obtained of thickness d = a_40 = 0.1 mm × 2^40 = 109951162777.6 mm
≈ 109951 km ≈ 68320 miles.
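The paper-folding figure is easy to check numerically. In the sketch below the function name `folded_thickness_mm` is hypothetical, and the conversion 1 mile = 1.609344 km is an assumption brought in for the check.

```python
MM_PER_KM = 1_000_000
KM_PER_MILE = 1.609344  # international mile, assumed for the conversion


def folded_thickness_mm(sheet_mm, folds):
    """Thickness after `folds` doublings: a geometric sequence with ratio 2."""
    return sheet_mm * 2 ** folds


thickness_mm = folded_thickness_mm(0.1, 40)
thickness_km = thickness_mm / MM_PER_KM
thickness_miles = thickness_km / KM_PER_MILE
print(f"{thickness_km:.0f} km, about {thickness_miles:.0f} miles")
```

Each fold doubles the thickness, so 40 folds multiply it by 2^40, which is slightly more than 10^12.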

Example 4: In passing through a glass plate a light ray loses 1/12 of its intensity L by reflection
at the boundary surfaces and by inhomogeneity of the material. After passing through the first
plate it has the intensity a_1 = L - (1/12) L = (11/12) L; after passing through the second one it has the
intensity a_2 = (11/12) L - (1/12) (11/12) L = (11/12)^2 L; and after passing through the nth plate it
has the intensity a_n = (11/12)^n L. If it is established by measurement that the intensity a_n is only
half the original value, from a_n = (11/12)^n L = (1/2) L the number n of plates can be calculated.
It is found that n = lg 2/(lg 12 - lg 11) ≈ 8. Thus, the light ray has penetrated eight plates.
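The number of plates follows from solving (11/12)^n = 1/2 by logarithms. The sketch below uses natural logarithms, which give the same quotient as the decadic logarithms of the example; the function name `plates_until_fraction` is an illustrative assumption.

```python
import math


def plates_until_fraction(loss_per_plate, remaining_fraction):
    """Solve (1 - loss_per_plate)**n = remaining_fraction for the real number n."""
    return math.log(remaining_fraction) / math.log(1 - loss_per_plate)


n = plates_until_fraction(1 / 12, 1 / 2)
print(round(n, 2), round(n))
print((11 / 12) ** 8)  # intensity fraction left after eight plates
```

The exact solution is slightly below 8, so after eight plates the remaining intensity is just under one half of the original value.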

Between any two terms a_k and a_{k+1} = a_k q of a geometric sequence, m numbers can be
interpolated in such a way that the resulting sequence is again geometric. If q' is the common ratio of
this sequence to be determined, then a_{k+1} = a_k (q')^{m+1} = a_k q. From this it follows that
q' = q^(1/(m+1)), the (m + 1)th root of q.

Example 1: Interpolate four terms between each pair of terms of the sequence [32], [1],
[1/32], 1/1024, ... For the given sequence q = 1/32, and the common ratio of the interpolated
sequence is q' = (1/32)^(1/5) = 1/2. Thus, one obtains the sequence [32], 16, 8, 4, 2, [1], 1/2, 1/4,
1/8, 1/16, [1/32], ...

Example 2: In tuning by equal temperament, 11 intermediate notes are arranged at equal
distances between the notes of an octave. In C major, for example, the notes are C#, D, D#, E,
F, F#, G, G#, A, A#, B. The frequencies of the tones form a geometric sequence between the
tones of an octave with frequency ratio q = 2. The ratio q' of the required tones is obtained from
q' = 2^(1/12) = 1.059463, giving the sequence of frequency ratios: C = 1, C# = 1.05946,
D = 1.12246, D# = 1.18921, E = 1.25992, F = 1.33484, F# = 1.41421, G = 1.49831,
G# = 1.58740, A = 1.68179, A# = 1.78180, B = 1.88775, C' = 2.
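The frequency ratios of equal temperament are the terms of a geometric sequence with ratio 2^(1/12), so they can be tabulated directly. The names `NOTE_NAMES` and `equal_temperament_ratios` are introduced for this sketch only.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B", "C'"]


def equal_temperament_ratios(steps=12, octave_ratio=2.0):
    """Return the ratios q'**0, q'**1, ..., q'**steps with q' = octave_ratio**(1/steps)."""
    q = octave_ratio ** (1.0 / steps)
    return [q ** i for i in range(steps + 1)]


for name, ratio in zip(NOTE_NAMES, equal_temperament_ratios()):
    print(f"{name:2s} {ratio:.5f}")
```

Interpolating 11 notes between C and C' is the case m = 11 of geometric interpolation, which is why the 12th root of the octave ratio appears.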
Basic series of norm numbers

Series         R5               R10                R20                R40                R80
Step q    10^(1/5) ≈ 1.6   10^(1/10) ≈ 1.25   10^(1/20) ≈ 1.12   10^(1/40) ≈ 1.06   10^(1/80) ≈ 1.03


In normalizing one tries to find gradations of magnitudes that satisfy practical requirements with 
a minimum number of steps. One uses so-called decimal geometric sequences. These are geometric
sequences with the step or common ratio q = 10^(1/n), the nth root of 10, which are called in
technology the basic series of norm numbers.
Accordingly every decimal region is subdivided into n steps. From the basic series one can select
further series by using only every second, every third, or every mth step of the series.


[Table: series R 10 and selected series R 10/2; the entries are not recoverable except for the value 12.5.]

Technological products, machines and machine parts, tools etc., are manufactured according to
these basic series. Pressures in presses, lifting forces and heights of cranes and winches, numbers
of revolutions, cutting speeds and power of turbines are likewise graded. Internationally agreed
paper formats are also geometrically graded, and coins and banknotes are often based on geometric
sequences with q = 10^(1/3) ≈ 2.2, giving the very approximate sequence 1, 2, 5, 10, 20, 50, ... The
decimal coinage in Great Britain and the United States coinage conform to this scheme.


Convergence and divergence of sequences. The terms of the sequence 1, 3/4, 4/6, 5/8, ... with the
defining law a_n = (n + 1)/(2n) differ from 1/2 by less and less as the term number n becomes greater.
The difference |a_n - 1/2| between the terms of the sequence and 1/2 can be made arbitrarily small;
that is, a suitable value of the index n can be chosen, so that from this value of n onwards all
the differences |a_n - 1/2| are smaller than an arbitrarily small given positive number ε. If it is
required, for example, that the deviation from 1/2 shall be less than ε = 0.001, then from |a_n - 1/2|
= |(n + 1)/(2n) - 1/2| = |1/2 + 1/(2n) - 1/2| = 1/(2n) < 0.001, it follows that all terms a_n with
n > 500 have the required property. At most 500 terms have a greater deviation from 1/2. If the
required precision is increased to ε = 0.000001, then at most 500000 terms have a greater deviation
from 1/2, and |a_n - 1/2| < 0.000001 for all terms a_n with n > 500000. In general, |a_n - 1/2| < ε
for all n > 1/(2ε). Thus, no matter how small ε is chosen, it is always possible to choose an
index such that from this term onwards all the terms of the sequence differ from 1/2 by less
than ε. The sequence {a_n} is then said to converge to the limit 1/2.
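The bound n > 1/(2ε) for this sequence can be checked with exact arithmetic. In the sketch below the names `a` and `index_bound` are illustrative; `index_bound` simply evaluates 1/(2ε).

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def a(n):
    """Term a_n = (n + 1)/(2n) of the sequence 1, 3/4, 4/6, 5/8, ..."""
    return Fraction(n + 1, 2 * n)


def index_bound(epsilon):
    """Return 1/(2*epsilon); every term with a larger index deviates from 1/2 by less than epsilon."""
    return 1 / (2 * Fraction(epsilon))


epsilon = Fraction(1, 1000)
bound = index_bound(epsilon)
print(bound)                          # the index 500 of the text
print(abs(a(500) - HALF) < epsilon)   # deviation is exactly 1/1000, so not yet smaller
print(abs(a(501) - HALF) < epsilon)   # from n = 501 onwards the deviation is smaller
```

The deviation |a_n - 1/2| = 1/(2n) decreases with n, so it suffices to test the first index beyond the bound.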


A sequence {a_n} is said to converge to the limit a if for every ε > 0 there is a number N such
that |a_n - a| < ε for all n > N.

The number N beyond which |a_n - a| < ε depends, in general, on ε; the smaller ε is chosen,
the greater is N. For this reason it is denoted more precisely by N(ε). For a sequence {a_n} converging
to the limit a, to every ε > 0 there corresponds, of course, a number N_2(ε) beyond which |a_n - a| < ε/2,
a number N_k(ε) beyond which |a_n - a| < ε/k, a number N'(ε) beyond which |a_n - a| < ε^2, and
so on. Abundant use will be made of this. If the sequence {a_n} converges to the limit a, one writes
{a_n} → a as n → ∞, or lim_{n→∞} a_n = a (read: a_n converges to a as n tends to infinity, or the limit of a_n,
as n tends to infinity, is a). Pictured geometrically, this means that only finitely many terms of the
sequence lie outside the ε-neighbourhood (a - ε, a + ε) of the limit a, whilst all other terms lie
within this ε-neighbourhood. Thus, one says that almost all the terms of the sequence lie in the
ε-neighbourhood of a, no matter how small ε may be.


Example 1: The sequence 0.3, 0.33, 0.333, ... with the defining law a_n = 3/10 + 3/10^2 + ... + 3/10^n
converges to 1/3. For the magnitudes of the deviations from the limit one finds
|a_n - 1/3| = |3(10^{n-1} + ... + 10 + 1)/10^n - 1/3|
= |[9(10^n - 1)/(10 - 1) - 10^n]/(3 · 10^n)| = 1/(3 · 10^n) < ε.
For an arbitrary given positive ε, this inequality is satisfied for all n > N(ε) = lg [1/(3ε)]. For
ε = 10^{-12}, for example, N(ε) = 12 - lg 3; thus, in this case at most 12 terms differ from 1/3 by more
than ε = 10^{-12}. In general, every infinite decimal fraction 0.z_1 z_2 z_3 ... with digits z_i can be regarded
as a convergent sequence {a_n} with a_n = z_1/10 + z_2/10^2 + ... + z_n/10^n. The limit of the sequence
is the real number represented by the decimal fraction.

Example 2: The sequence 1, 1/4, 1/9, 1/16, ... of reciprocals of squares has the limit zero, because
for arbitrary ε > 0, |a_n - 0| = |1/n^2 - 0| = 1/n^2 < ε for all n > N(ε) = 1/√ε.


Sequences with the limit zero are called null sequences. From every null sequence {b_n} a sequence
{b_n + b} with the limit b can be constructed. Conversely, if the sequence {a_n} converges to the limit a,
then {a_n - a} is a null sequence.


Convergence behaviour of arithmetic and geometric sequences. Sequences that do not converge
are called divergent. For example, every arithmetic sequence is divergent. Since the difference
between two consecutive terms is always d, it is never possible for almost all its terms to lie in a
neighbourhood of a fixed value. For positive values of d the terms a_n of the sequence are all eventually
greater than every arbitrarily large number. One therefore writes symbolically lim_{n→∞} a_n = ∞ and
calls such a sequence definitely divergent. For negative values of d the terms are eventually less than
every negative number of arbitrarily large absolute value. This sequence is also definitely divergent;
one writes lim_{n→∞} a_n = -∞.


The infinite geometric sequence with the defining law a_n = a_1 q^{n-1} converges to zero if the absolute
value |q| of the common ratio is less than 1. If |q| is greater than 1, then the sequence {a_n} is divergent,
in particular, definitely divergent for q > 1.


Subsequences. If p_1, p_2, p_3, ..., p_n, ... is any strictly monotonic increasing infinite sequence of
natural numbers, then {p_n} is called a subsequence of the sequence of natural numbers; for example,
the sequence 1, 3, 7, 9, 13, 14, 27, ... If such a sequence {p_n} of indices is chosen, this determines
from any sequence {a_n} one of its subsequences {a_{p_n}}. For example, 1, 1/8, 1/64, ... is a subsequence
of the sequence 1, 1/2, 1/4, 1/8, 1/16, ... If the terms a_n for all n > N(ε) lie in the ε-neighbourhood
of the limit a, so that |a_n - a| < ε, then the terms a_{p_n} of the subsequence with p_n > N(ε) also lie
in this neighbourhood. Hence the following theorem holds.


Every subsequence {a_{p_n}} of a convergent sequence {a_n} → a converges to the same limit a.


Theorems about convergent sequences. The convergence of the sequence {a_n} is decided by the
existence of a subscript N(ε) beyond which |a_n - a| < ε; the size of N(ε) is completely immaterial.
For this reason finitely many terms can be removed or added without altering the convergence
or the limit of the sequence, since this affects at most the size of N(ε). Such properties, which
depend only on the behaviour of all terms 'beyond the place N(ε)', are called infinitary
properties of a sequence. The convergence of a sequence is an infinitary property.


Convergent sequences are bounded. 


If a sequence {a_n} has the limit a, then almost all its terms lie in the interval from a - ε to a + ε;
but the set of terms that lie outside this interval is finite and therefore is also bounded.

If the convergent sequence {a_n} has the upper bound K, then its limit a is also not greater than K.
Otherwise infinitely many terms of the sequence would have to fall in a neighbourhood of a lying
entirely to the right of K, in contradiction to the bounding property of K. Similarly, the limit
a of the sequence cannot be less than any of its lower bounds.


A convergent sequence has exactly one limit. 


If {a_n} has two different limits a and a', then ε can be chosen so small that the ε-neighbourhoods
of a and a' have no point in common. From some place N(ε) onwards all terms lie in the
ε-neighbourhood of a, so that infinitely many terms of the sequence lie outside the ε-neighbourhood
of a'; in the same way infinitely many lie outside the ε-neighbourhood of a. This contradicts the
limit property of a and of a'.


If the sequences {a_n} and {b_n} have the limits a and b, then the sequences {a_n + b_n}, {a_n - b_n},
{a_n b_n} converge to the limits a + b, a - b, ab, respectively, and if b_n and b are different from zero,
then {a_n/b_n} converges to the limit a/b.

Suppose, for example, that it is required to show for an arbitrarily prescribed ε that
|(a_n + b_n) - (a + b)| < ε. From the convergence of the sequence {a_n}, an index N_1 can be
determined so that |a_n - a| < ε/2 if n > N_1, and similarly, for the sequence {b_n}, an index N_2 so that
|b_n - b| < ε/2 if n > N_2. For all n > max (N_1, N_2), it follows from the triangle inequality that
|(a_n + b_n) - (a + b)| = |(a_n - a) + (b_n - b)| ≤ |a_n - a| + |b_n - b| < ε, as required.

To show that |a_n/b_n - a/b| can be made smaller than any given positive number ε, one first notes
that |a_n/b_n - a/b| = |[b(a_n - a) - a(b_n - b)]/(b · b_n)| ≤ [|b| · |a_n - a| + |a| · |b_n - b|]/(|b| · |b_n|).
N_3 can be determined so that |b_n| > g > 0 for all n > N_3; this is always possible since b ≠ 0.
Finally N_1 can be determined so that |a_n - a| < gε/2 for all n > N_1, and N_2 so that |b_n - b|
< g|b| ε/(2|a|) for all n > N_2 (if a = 0, the second term vanishes and N_2 is not needed). It then
follows that |a_n/b_n - a/b| < ε when n > max (N_1, N_2, N_3).
The following statements are important special cases of the last theorem.


1. If c, c_1, and c_2 are constants, and {a_n} → a, {b_n} → b, then {c a_n} → ca and {c_1 a_n + c_2 b_n}
→ c_1 a + c_2 b.

2. Since the sequence of the products of the terms of two convergent sequences converges to the
product of their limits, it follows that {a_n^k} → a^k for every positive integer k whenever {a_n} → a. If
a_n ≠ 0 and a ≠ 0, this also holds for every negative integer k. One can even deduce that {a_n^α} → a^α
for every real number α if a_n > 0, a > 0.

3. If {a_n} and {b_n} are null sequences, so are the sequences {a_n + b_n}, {a_n - b_n} and {a_n b_n}.


The sequence {a_n/b_n} formed from the null sequences {a_n} and {b_n} is not, in general, a null sequence.
For example, {a_n} = {1/2^n} and {b_n} = {1/4^n} are null sequences, but {a_n/b_n} = {2^n} is definitely
divergent.


If the sequences {a_n'} and {a_n''} converge to the same limit a, and the relation a_n' ≤ a_n ≤ a_n'' holds
for almost all terms of the sequence {a_n}, then {a_n} also converges to the limit a.


Corresponding to an arbitrary ε > 0 there exists an N(ε) beyond which all terms of the sequence
{a_n'} and all terms of the sequence {a_n''} lie in the ε-neighbourhood of a. Since a_n' ≤ a_n ≤ a_n'', almost
all terms of the sequence {a_n} also lie in this neighbourhood, so that lim_{n→∞} a_n = a.

Limits of some important convergent sequences 


1. For arbitrary positive values of q, {x_n} = {q^(1/n) - 1} is a null sequence. For q = 1 every term
has the value zero. For q > 1, q^(1/n) > 1, so that the numbers x_n are positive. Hence q = (1 + x_n)^n
≥ 1 + n x_n > n x_n > 0, or 0 < x_n < q/n. But {q/n} is a null sequence, so that {x_n} is a null sequence.
For q < 1, 1/q > 1 and hence {(1/q)^(1/n) - 1} has the limit zero. If this null sequence is multiplied
term-by-term by the sequence {q^(1/n)}, which is bounded since q^(1/n) < 1, then the product sequence
{1 - q^(1/n)}, and hence the sequence {q^(1/n) - 1}, is a null sequence.

2. As was shown above, the terms of the sequence {q^(1/n)} have the limit 1 as n tends to infinity,
where q is an arbitrary positive number. Hence a number N can be found, such that for all m > N
both values q^(+1/m) and q^(-1/m) lie between 1 - ε and 1 + ε. For a null sequence {a_n}, an index N_1 can
always be found so that a_n lies between -1/m and +1/m for all n > N_1, and thus for all n > N_1 the powers
q^(a_n) lie between 1 - ε and 1 + ε. Hence q^(a_n) - 1 lies between -ε and +ε, that is, {q^(a_n) - 1} is a
null sequence if {a_n} is a null sequence. From this it follows that {q^(a_n)} converges to the limit q^a if
{a_n} → a. For q^(a_n) - q^a = q^a (q^(a_n - a) - 1), where {a_n - a} is a null sequence and hence {q^(a_n - a) - 1}
is also one.

It will be shown below under 4. that for an arbitrary base b > 1 of a system of logarithms,
{log_b a_n} → log_b a if {a_n} → a. If α is an arbitrary real constant, then {α log_b a_n} → α log_b a and
also, as has just been shown, {b^(α log_b a_n)} → b^(α log_b a), hence {a_n^α} → a^α.

3. The sequence {n^(1/n)} converges to 1, that is, {x_n} = {n^(1/n) - 1} is a null sequence. For n ≥ 2 its
terms x_n are positive. From (1 + x_n)^n = n one obtains from the binomial theorem n(n - 1) x_n^2/2 < n,
or |x_n| < √[2/(n - 1)]. If corresponding to the prescribed number ε > 0, the number N(ε) = 2/ε^2 + 1
is chosen, then |x_n| < ε for all n > N(ε).

4. For an arbitrary base b > 1 of logarithms, {(log_b n)/n} is a null sequence; in other words,
for an arbitrary ε there must exist a number N(ε) such that (log_b n)/n < ε for all n > N(ε). But

(log_b n)/n < ε  ⟺  log_b n < nε  ⟺  n < b^(nε)  ⟺  n^(1/n) < b^ε.

Since b^ε is greater than 1, n^(1/n) converges to 1, and the above argument is reversible, the result
follows. Moreover, since log_{1/b} n = -log_b n, the sequence {(log_b n)/n} also converges to zero for
0 < b < 1.


Convergence criteria for sequences. From the definition of convergence one can test whether a
number a is, in fact, the limit of the sequence {a_n}. On the other hand, if no such number a is known,
one uses convergence criteria (or tests for convergence) that allow one to determine the convergence
or divergence of a sequence from properties that are, in general, easily verified. However, there is
no general method for the determination of the limit; this can be found only by a variety of methods,
specially constructed for particular sequences.

The first test for convergence. The terms of a monotonic increasing, unbounded sequence assume 
arbitrarily large values; the sequence is definitely divergent. But if a monotonic sequence is bounded, 
it can be shown to have a limit. 


The first test for convergence: A monotonic and bounded sequence is always convergent. 


The number e as a limit. The sequence {a_n} with a_n = (1 + 1/n)^n increases monotonically, since
for n ≥ 2
a_{n-1} = [1 + 1/(n - 1)]^{n-1} = [n/(n - 1)]^{n-1} = [n/(n - 1)]^n (1 - 1/n) < [n/(n - 1)]^n (1 - 1/n^2)^n
= [(n + 1)/n]^n = (1 + 1/n)^n = a_n.
The inequality results for a = -1/n^2 from the use of Bernoulli's inequality 1 + na < (1 + a)^n,
which holds for a > -1, a ≠ 0, n ≥ 2. The sequence {(1 + 1/n)^n} is bounded. Since all the terms
are positive, zero is a lower bound. The binomial theorem gives

a_n = (1 + 1/n)^n = 1 + C(n, 1)/n + C(n, 2)/n^2 + ... + C(n, n)/n^n,

where C(n, k) denotes the binomial coefficient. One can obtain an estimate for each term in this sum by

C(n, k)/n^k = (1/k!) (1 - 1/n) (1 - 2/n) ... [1 - (k - 1)/n] ≤ 1/(2 · 3 ... k) ≤ 1/2^{k-1},

so that
a_n = (1 + 1/n)^n ≤ 1 + 1 + 1/2 + 1/2^2 + ... + 1/2^{n-1} < 1 + 1/(1 - 1/2) = 3.
Thus, the sequence is also bounded above. It is therefore convergent, and following Leonhard EULER
its limit is denoted by e. The number e is sandwiched between the terms of the sequence considered
and those of the monotonic decreasing sequence {[1 + 1/(n - 1)]^n}, which likewise converges
to e. However, e is usually calculated by means of the series

e = 1 + 1/1! + 1/2! + 1/3! + ... + 1/n! + ...
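The sandwiching of e between the two sequences can be observed numerically. The sketch below compares both with `math.e` and, as an assumption of this example, evaluates e by the standard series of reciprocal factorials; the function names are illustrative.

```python
import math


def lower(n):
    """Term (1 + 1/n)**n of the increasing sequence."""
    return (1 + 1 / n) ** n


def upper(n):
    """Term (1 + 1/(n - 1))**n of the decreasing sequence, defined for n >= 2."""
    return (1 + 1 / (n - 1)) ** n


def series_e(last_index):
    """Partial sum 1/0! + 1/1! + ... + 1/last_index! (series assumed for comparison)."""
    return sum(1 / math.factorial(k) for k in range(last_index + 1))


for n in (2, 10, 100, 1000):
    print(n, lower(n), upper(n))
print(series_e(10), math.e)
```

The two bounding sequences approach e only slowly (the gap at n = 1000 is still about 0.003), whereas eleven terms of the series already agree with e to seven decimal places, which is why the series is preferred for computation.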


The second (or Cauchy) test for convergence. While the first test for convergence applies only to
monotonic sequences, the Cauchy test holds for arbitrary sequences. If the differences of all
possible pairs of terms from some place N(ε) onwards are less than a given positive number ε, then
almost all terms of the sequence lie in an ε-neighbourhood. At most the finitely many terms a_i with
i ≤ N(ε) can lie outside it. The expression 'if and only if' in the following indicates that the test is
both necessary and sufficient.


The second test for convergence: A sequence {a_n} is convergent if and only if corresponding
to every arbitrary positive number ε, a number N(ε) can always be chosen so that |a_n - a_m| < ε
for all indices n and m greater than N(ε).


Example 1: The sequence 1/2, 5/4, 5/6, 9/8, 9/10, 13/12, 13/14, ... with the general term
a_n = 1 + (-1)^n/(2n) is bounded, but not monotonic. The first convergence test is not applicable.
The Cauchy test establishes the convergence of the sequence, for

|a_{n+1} - a_n| = |1 + (-1)^{n+1}/(2n + 2) - 1 - (-1)^n/(2n)|
= |[(-1)^{n+1} 2n - (-1)^n (2n + 2)]/[2n(2n + 2)]| ≤ [2n + 2n + 2]/[2n(2n + 2)]
= [4n + 2]/[4n^2 + 4n] < [4n + 4]/[4n^2 + 4n] = 1/n < ε for all n > 1/ε.

As is obvious from the defining law, all the elements following a_{n+1} lie between a_n and a_{n+1};
thus, for arbitrary m > n > 1/ε, |a_m - a_n| ≤ |a_{n+1} - a_n| < ε.

Example 2: The sequence 1, 1 + 1/2, 1 + 1/2 + 1/3, 1 + 1/2 + 1/3 + 1/4, ... with the general
term a_n = 1 + 1/2 + 1/3 + ... + 1/n does not satisfy the Cauchy convergence test, since if one
chooses an ε < 1/2, there always exist two numbers n, m > N(ε), for which |a_n - a_m| > ε, no
matter how large N(ε) is. Suppose that m > N and n = 2m > N. Then one obtains

|a_n - a_m| = 1/(m + 1) + 1/(m + 2) + 1/(m + 3) + ... + 1/(2m)
≥ 1/(2m) + 1/(2m) + ... + 1/(2m) = m · 1/(2m) = 1/2 > ε,

where each of the fractions 1/(m + i) (i = 1, 2, ..., m) has been replaced by the smaller, or at
most equally large, fraction 1/(2m).

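The failure of the Cauchy test for the sums 1 + 1/2 + ... + 1/n can be seen with exact fractions: the block of terms from index m + 1 to 2m never sums to less than 1/2. The helper names `harmonic` and `block` are assumptions of this sketch.

```python
from fractions import Fraction


def harmonic(n):
    """Exact value of a_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, i) for i in range(1, n + 1))


def block(m):
    """Difference a_{2m} - a_m = 1/(m + 1) + ... + 1/(2m)."""
    return harmonic(2 * m) - harmonic(m)


for m in (1, 2, 10, 100):
    print(m, float(block(m)))
```

However far out m is chosen, the difference a_{2m} - a_m stays at or above 1/2, so no N(ε) exists for ε < 1/2.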
Accumulation point of a sequence. The sequence 1 + 1/2, 2 + 1/2, 3 + 1/2, 1 + 1/3, 2 + 1/3, 3 + 1/3,
..., 1 + 1/n, 2 + 1/n, 3 + 1/n, ... has the property that infinitely many terms of the sequence lie in
every neighbourhood of each of the numbers 1, 2, and 3. The terms of the sequence accumulate
in the neighbourhood of the points 1, 2, and 3, which are therefore called accumulation points of
the sequence.


A number A is called an accumulation point of the sequence {a_n} if for every arbitrary positive
number ε the inequality |a_n - A| < ε is satisfied for infinitely many distinct terms a_n.


From this it follows that the limit of a sequence is always one of its accumulation points. On
the other hand, an accumulation point is not necessarily a limit, because for an accumulation point
A the inequality |a_n - A| < ε has only to be satisfied for infinitely many n, but for a limit A it must
be satisfied for all n from a particular place N(ε) onwards. Consequently, a convergent sequence
can have only one accumulation point, since only finitely many terms of the sequence lie outside
each ε-neighbourhood of the limit L, and in particular, it is impossible for infinitely many terms
to lie in every ε-neighbourhood of L' ≠ L. The following theorem shows that the converse of this
statement is also true.


A bounded sequence with exactly one finite accumulation point is convergent. But if a sequence 
has no finite accumulation point, or more than one, then it is divergent. 

The Bolzano-Weierstrass theorem: Every bounded infinite sequence has at least one accumulation 
point. 

If k is a lower bound and K an upper bound of the sequence, then all its terms lie in the interval
J_0 = [k, K]. This interval is bisected and the half in which infinitely many terms of the sequence
lie (or the left-hand half if both contain infinitely many) is denoted by J_1. The same procedure of
bisecting and choosing one half interval applied to the interval J_1 gives J_2, and so on. The nest
of intervals so constructed contains exactly one real number A, which is an accumulation point of
the sequence. For since the lengths of the intervals of the nested set converge to zero, corresponding
to every ε-neighbourhood of A there is an interval of the set that lies entirely in this neighbourhood
and, by construction, contains in addition infinitely many terms of the sequence.

The concept of the accumulation point can be extended to arbitrary sets of numbers; the Bolzano- 
Weierstrass theorem ensures the existence of at least one accumulation point for bounded infinite 
sets of numbers. This need not itself be an element of the set; for example, the accumulation points 1, 
2, 3 in the example considered above do not belong to the set. 


18.2. Series 


Series are of special significance, as much for the inner structure of mathematics as for practical 
applications. Many numerical methods are based on the theory of series; for example, the
construction of tables of logarithms and of trigonometric functions and the calculation of important
constants such as e and π are best accomplished with the help of series (see Chapter 21.).


The concept of a series. The Greek sophist ZENON (5th cent. B.C.) posed the question whether
Achilles, who runs twelve times as fast as a tortoise, can overtake it if he gives it a start of 1 stadion
(an ancient measure of length, approximately 200 yards). While the tortoise crawls a distance of
1/11 stadion, Achilles with twelve times the speed covers 12 · 1/11 = 1 + 1/11 stadion, that is,
the head start and the path of the tortoise; thus, he has overtaken it. On the other hand ZENON
argued: when Achilles has covered 1 stadion, the tortoise has crawled 1/12 stadion; when he has
run this twelfth, the tortoise still has a lead of 1/12^2 stadion; when he has covered this, the tortoise
is still 1/12^3 stadion ahead, and so on. The distance covered by Achilles to the point of overtaking
can therefore be expressed in the form 1 + 1/12 + 1/12^2 + 1/12^3 + ..., where the dots denote
that every term a_k = 1/12^{k-1} is followed by another one a_{k+1} = 1/12^k, so that the expression does
not terminate. Such an expression is called an infinite series. ZENON believed that in this example
he had found a contradiction in the formal thinking, since it seemed certain to him that the value 
of the infinite series is greater than every quantity so that Achilles could never catch up with the 
tortoise. However, the series correctly set up by him is geometric and has the sum 12/11, as follows 
from the rules derived for these series. 


By an infinite series (or just a series) one understands an expression of the form a_1 + a_2 + a_3 + ...,
abbreviated to Σ_{i=1}^{∞} a_i, where the a_i are terms of an infinite number sequence {a_n}.


Use of the summation sign. To write a sum in an abbreviated form one uses the Greek letter Σ,
and writes, for example, b_1 + b_2 + b_3 + ... + b_n = Σ_{i=1}^{n} b_i (read: sum of b_i for i equals 1 to n).
The addition to the sign Σ of the condition 'i equals 1 to n' implies that the terms of the sum are
given by letting the summation index i assume in succession the values of all the natural numbers
from 1 to n. For example, Σ_{i=1}^{5} 1/i^2 = 1/1^2 + 1/2^2 + 1/3^2 + 1/4^2 + 1/5^2. The sign is also used to
write infinite series in abbreviated form. For example, the series obtained by ZENON is written
1 + 1/12 + 1/12^2 + ... = Σ_{i=0}^{∞} 1/12^i. The symbol ∞ means that the series does not terminate.


The summation index no longer occurs when the series is written as the sum of its terms, and it is
therefore immaterial whether it is denoted by i, k, or any other letter. It is often convenient to give
the first term of a series the subscript 0, so that the series has the form Σ_{n=0}^{∞} a_n.

Convergence and divergence. Sum of a series. As has already been mentioned, one can associate
the value 12/11 with the series Σ_{i=0}^{∞} 1/12^i. In order to be able to decide, in general, whether a value
can be associated with a series Σ_{i=1}^{∞} a_i, one forms from the terms a_i of the series the sequence {s_n}
of its partial sums. If and only if this sequence of partial sums converges, to the limit S say, will a
value be ascribed to the series, namely the value S. One says: the series converges and has the sum S.


An infinite series Σ_{i=1}^{∞} a_i is said to be convergent if and only if the sequence of its partial
sums converges. The limit S of the sequence of partial sums is called the sum of the series:

S = a_1 + a_2 + a_3 + ... or S = Σ_{i=1}^{∞} a_i.

On the other hand, if the sequence of partial sums of the given series diverges, the series is
said to be divergent; it has no sum.


The term 'sum of a series' is chosen only on the grounds of the formal analogy with sums of finitely many terms and is simply a synonym for the concept of limit of the sequence of partial sums. In ZENON's series the general term $s_n$ of its sequence of partial sums has the value $s_n = (12/11)(1 - 1/12^n)$ (see Sum of a finite geometric series), and this converges to the limit

$$S = \lim_{n\to\infty} s_n = \lim_{n\to\infty} (12/11)(1 - 1/12^n) = 12/11.$$
Example: If a square of unit area is repeatedly halved, as indicated in the accompanying figure, then the areas of the resulting rectangles can be considered as terms of the infinite series $1/2 + 1/4 + 1/8 + \cdots + 1/2^n + \cdots$. The geometrical aspect leads one to suppose that the series has the sum 1. Since the sequence of partial sums of the given series is $1/2, 3/4, 7/8, \ldots, (2^n - 1)/2^n, \ldots$, it does indeed converge to the limit

$$s = \lim_{n\to\infty} s_n = \lim_{n\to\infty} (2^n - 1)/2^n = \lim_{n\to\infty} (1 - 1/2^n) = 1.$$


18.2-1 The convergence of the series $1/2 + 1/4 + 1/8 + \cdots$

Even in the 18th century these concepts had not been clarified. For example, the infinite series $1 - 1 + 1 - 1 + 1 - \cdots$ was written either as $(1 - 1) + (1 - 1) + (1 - 1) + \cdots$ or as $1 - (1 - 1) - (1 - 1) - \cdots$, with the corresponding sums 0 or 1. But the sequence of its partial sums $s_1 = 1$, $s_2 = 0$, $s_3 = 1$, $s_4 = 0, \ldots$ diverges, and the series has no sum.
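The definition of convergence through partial sums can be tried out numerically. The following is a minimal sketch, not part of the original text; the helper name `partial_sums` is an assumption chosen for illustration, and exact fractions are used so that no rounding obscures the pattern.

```python
from fractions import Fraction
from itertools import accumulate, islice

def partial_sums(terms, n):
    """Return the first n partial sums s_1, ..., s_n of an iterable of terms."""
    return list(accumulate(islice(terms, n)))

zenon = (Fraction(1, 12**i) for i in range(50))        # 1 + 1/12 + 1/12^2 + ...
halving = (Fraction(1, 2**i) for i in range(1, 50))    # 1/2 + 1/4 + 1/8 + ...
grandi = ((-1)**i for i in range(50))                  # 1 - 1 + 1 - 1 + ...

s_zenon = partial_sums(zenon, 6)
s_halving = partial_sums(halving, 4)
s_grandi = partial_sums(grandi, 6)

print(s_halving)                                # partial sums (2^n - 1)/2^n
print(s_grandi)                                 # oscillates, so no limit exists
print(float(Fraction(12, 11) - s_zenon[-1]))    # remaining gap to 12/11
```

The first two sequences of partial sums settle towards 1 and 12/11 respectively, while the third keeps alternating between 1 and 0.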


Arithmetic series. In an arithmetic series $\sum_{i=1}^{\infty} a_i$ the $a_i$ are the terms of an arithmetic sequence $\{a_i\}$. Clearly every infinite arithmetic series is divergent (apart from the trivial series whose terms are all zero). Only its nth partial sum $s_n = \sum_{i=1}^{n} a_i$, often called a finite arithmetic series, is of any interest.

When Carl Friedrich Gauss was nine years old, his school teacher gave the class the task of adding together all the whole numbers from 1 to 100. The teacher had hardly sat down, however, when little Gauss put his slate down on the desk with the words: 'There it is'. The teacher was all the more astonished, when all the slates were finally handed in, to find on the first one only one number, just the right answer 5050, which Gauss had calculated in his head by means of the scheme indicated below.

The teacher recognized that little Carl Friedrich could not learn much in his arithmetic class, and 
procured a special arithmetic book for him from Hamburg: Remer’s Arithmetica. The idea of the 
nine-year-old Gauss can be applied to every finite arithmetic series. 




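General case:

$$\begin{aligned}
s_n &= a_1 + (a_1 + d) + (a_1 + 2d) + \cdots + (a_1 + (n-1)d) \\
s_n &= a_n + (a_n - d) + (a_n - 2d) + \cdots + (a_n - (n-1)d) \\
2s_n &= n(a_1 + a_n) \quad\text{or}\quad s_n = n(a_1 + a_n)/2.
\end{aligned}$$

Gauss's sum:

$$\begin{aligned}
&\;\;\;1 + \;\,2 + \;\,3 + \cdots + 50 \\
&100 + 99 + 98 + \cdots + 51 \\
&50 \times 101 = 5050
\end{aligned}$$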


From the formulae $a_n = a_1 + (n-1)d$ and $s_n = n(a_1 + a_n)/2$ it can be seen that three of the numbers $a_1$, $a_n$, $d$, $n$ and $s_n$ must be given; the remaining ones can then be calculated from linear or quadratic equations.


Example 1: If for a finite arithmetic series $a_1 = 3$, $a_n = 43$ and $d = 5$ are known, then the number of terms $n$ and the sum $s_n$ can be calculated:

$$a_n = a_1 + (n-1)d \;\to\; 43 = 3 + (n-1)\cdot 5 \;\to\; n = 9;$$
$$s_n = n(a_1 + a_n)/2 \;\to\; s_9 = 9(3 + 43)/2 = 207.$$

Example 2: From $d = 12$, $s_n = 180$, and $a_n = 60$ the first term $a_1$ and the number of terms $n$ can be found:

$$a_1 = a_n - (n-1)d; \quad s_n = (n/2)(a_1 + a_n) \;\to\; s_n = (n/2)[2a_n - (n-1)d] \;\to\;$$
$$180 = (n/2)[120 - (n-1)\cdot 12] \;\to\; n^2 - 11n + 30 = 0.$$

This quadratic equation has the solutions $n_1 = 6$, $n_2 = 5$. From $n_1 = 6$ it follows that $a_1 = 0$, and from $n_2 = 5$ that $a_1 = 12$. The finite series $12 + 24 + 36 + 48 + 60$ and $0 + 12 + 24 + 36 + 48 + 60$ both correspond to the given values.
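The sum formula and both examples can be checked with a few lines of code. This is an illustrative sketch only; the function name `arithmetic_sum` is an assumption and integer inputs are assumed throughout.

```python
import math

def arithmetic_sum(a1, d, n):
    """Sum of the n-term arithmetic series via s_n = n (a_1 + a_n) / 2 (integers assumed)."""
    an = a1 + (n - 1) * d
    return n * (a1 + an) // 2        # n (a_1 + a_n) is always even for integer a_1, d

gauss = arithmetic_sum(1, 1, 100)    # 1 + 2 + ... + 100

# Example 1: a_1 = 3, a_n = 43, d = 5 gives n and s_n
n1 = (43 - 3) // 5 + 1
s1 = arithmetic_sum(3, 5, n1)

# Example 2: d = 12, s_n = 180, a_n = 60 leads to n^2 - 11 n + 30 = 0
disc = math.isqrt(11 * 11 - 4 * 30)
roots = sorted({(11 - disc) // 2, (11 + disc) // 2})
solutions = [(n, 60 - (n - 1) * 12) for n in roots]   # pairs (n, a_1)

print(gauss, (n1, s1), solutions)
```

Both pairs in `solutions` reproduce the prescribed sum 180, which confirms that the data of Example 2 do not determine the series uniquely.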


Geometric series. In a geometric series $\sum_{i=1}^{\infty} a_i$, the $a_i$ are terms of a geometric sequence $\{a_i\}$. The nth term $s_n$ of the sequence of partial sums, often denoted also as the sum of the finite geometric series $\sum_{i=1}^{n} a_i$, is obtained by means of the following scheme:

$$\begin{aligned}
s_n &= a_1 + a_1 q + a_1 q^2 + \cdots + a_1 q^{n-1} \\
q s_n &= \phantom{a_1 + {}} a_1 q + a_1 q^2 + \cdots + a_1 q^{n-1} + a_1 q^n \\
s_n - q s_n &= a_1 - a_1 q^n, \quad\text{or}\quad s_n = a_1 (1 - q^n)/(1 - q).
\end{aligned}$$


From the formula for the sum it can be seen that three of the numbers $a_1$, $a_n$, $q$, $n$ and $s_n$ must be given and the remaining ones can then be calculated. However, in the process exponential equations or equations of the nth degree can occur.


Example 1: If in a geometric series the first term $a_1 = 2$, the common ratio $q = 5$, and the sum $s_n = 976\,562$ are given, then the number of terms $n$ can be calculated:

$$s_n = a_1 (q^n - 1)/(q - 1) \;\to\; 976\,562 = 2(5^n - 1)/(5 - 1) \;\to\; 5^n = 1\,953\,125.$$

This exponential equation has the solution $n = 9$. The series has nine terms.

Example 2: According to the Arabian historian Ja'qubi, the inventor of the game of chess asked the Shah of Persia as a reward for the number of grains of wheat that would result from placing 1 grain on the first of the 64 squares of the chess board, 2 on the second, 4 on the third, and so on, placing on each square twice as many as on the previous one. The total number of grains is given by the formula

$$s_n = a_1 (q^n - 1)/(q - 1), \quad\text{so that}\quad s_{64} = 1 \cdot (2^{64} - 1)/(2 - 1) = 2^{64} - 1 \approx 1.84 \times 10^{19}.$$

Assuming that the surface of the earth (roughly $13 \times 10^{10}$ acres) forms a single wheatfield with a yield of 1.6 tons per acre, and assuming 20 million grains to the ton, then four harvests would still not be enough to yield the required quantity.
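Both examples are easy to recompute exactly with integer arithmetic. The sketch below is an illustration under the figures quoted in the text (acreage, yield, grains per ton); the name `geometric_sum` is an assumption.

```python
def geometric_sum(a1, q, n):
    """s_n = a_1 (q^n - 1) / (q - 1) for integers a_1 and q != 1 (the division is exact)."""
    return a1 * (q**n - 1) // (q - 1)

# Example 1: smallest n with s_n = 976562 for a_1 = 2, q = 5
n = 1
while geometric_sum(2, 5, n) < 976562:
    n += 1

# Example 2: grains on the chess board compared with whole-earth harvests
grains = geometric_sum(1, 2, 64)
grains_per_harvest = 13 * 10**10 * 1.6 * 20 * 10**6   # acres * tons/acre * grains/ton
harvests_needed = grains / grains_per_harvest

print(n, grains, harvests_needed)
```

With these figures a little more than four harvests are required, in agreement with the statement that four harvests do not suffice.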


Infinite geometric series. The sum $s_n = a_1 (1 - q^n)/(1 - q)$ for $q \ne 1$ is the nth term of the sequence of partial sums. The numbers $a_1$ and $q$ are constants and the convergence of the sequence depends only on the behaviour of $(1 - q^n)$. For $q > 1$ and for $q \le -1$ the sequence $\{q^n\}$ is divergent, so that the geometric series has no sum; for $q = 1$ the partial sums $s_n = n a_1$ also diverge (if $a_1 \ne 0$). For $|q| < 1$, $\{q^n\}$ is a null sequence, so that $\{1 - q^n\}$ has the limit 1. In this case the geometric series converges and has the sum $s = a_1/(1 - q)$.


Example 1: Every periodic decimal fraction can clearly be represented by a convergent geometric series. The formula for the sum makes it possible to transform the decimal fraction into a vulgar fraction. For example, the decimal fraction $0.2525\ldots$ corresponds to the series $25/100 + 25/10000 + \cdots$ with the first term $a_1 = 25/100$ and the common ratio $q = 1/100$; it has the sum $s = 25/99$.
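The conversion of a purely periodic decimal into a vulgar fraction follows directly from $s = a_1/(1 - q)$. The following sketch assumes a decimal of the form 0.blockblock... with no non-periodic part; the function name `repeating_decimal` is hypothetical.

```python
from fractions import Fraction

def repeating_decimal(block: str) -> Fraction:
    """Value of 0.blockblockblock... as the geometric sum a_1 / (1 - q)."""
    k = len(block)
    a1 = Fraction(int(block), 10**k)   # first term, e.g. 25/100
    q = Fraction(1, 10**k)             # common ratio, e.g. 1/100
    return a1 / (1 - q)

value = repeating_decimal("25")
print(value, float(value))
```

The same function gives 1/3 for the block "3", which is the familiar expansion 0.333...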




Example 2: Six lines can be drawn through a point $O$ in such a way that each pair of neighbouring lines includes an angle $\alpha = 30° = \pi/6$. From the point $P_1$ on one of the lines at a distance $a$ from the point $O$, the perpendicular to a neighbouring line is drawn. From its foot $P_2$ the perpendicular to the following line is drawn, and so on (Fig.). The consecutive perpendiculars form a polygonal arc $|P_1P_2| + |P_2P_3| + \cdots$ that spirals around and shrinks towards the point $O$. The perpendicular $P_iP_{i+1}$ has the length $l_i$, where

$$l_1 = a \sin(\pi/6), \quad l_2 = a \sin(\pi/6) \cos(\pi/6), \quad l_3 = a \sin(\pi/6) [\cos(\pi/6)]^2, \ldots$$

The series $l_1 + l_2 + l_3 + \cdots$ is geometric with $a_1 = a \sin(\pi/6)$ and $q = \cos(\pi/6)$; it converges since $\cos(\pi/6) < 1$ and has the sum

$$s = a \sin(\pi/6)/[1 - \cos(\pi/6)] = (a/2)/(1 - \tfrac{1}{2}\sqrt{3}) = a(2 + \sqrt{3}).$$

18.2-2 The sum of a geometric series
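The construction can also be carried out with coordinates, which gives an independent check of the lengths $l_i$ and of the sum. The sketch below places $O$ at the origin and the first line along the x-axis; these coordinates and the name `spiral_length` are assumptions made for illustration.

```python
import math

def spiral_length(a, alpha, steps):
    """Length of the polygonal arc P1 P2 P3 ... obtained by dropping perpendiculars."""
    px, py = a, 0.0                       # P1 on the line with direction angle 0
    total = 0.0
    for i in range(1, steps + 1):
        ux, uy = math.cos(i * alpha), math.sin(i * alpha)   # direction of the next line
        t = px * ux + py * uy             # projection of the current point onto that line
        qx, qy = t * ux, t * uy           # foot of the perpendicular, the next point
        total += math.hypot(qx - px, qy - py)
        px, py = qx, qy
    return total

length = spiral_length(1.0, math.pi / 6, 200)
print(length, 2 + math.sqrt(3))
```

After 200 steps the accumulated length agrees with $a(2 + \sqrt{3})$ for $a = 1$ to within rounding error.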


Convergence tests for series with positive terms. Here again, as for sequences, the questions whether 
a given series converges, and if so what is its sum, play a particularly important role. Theorems by 
means of which the convergence behaviour of a series can be decided are called convergence criteria, 
or convergence tests. One distinguishes between necessary conditions, sufficient conditions, and 
those that are both necessary and sufficient. A necessary condition leads to several possibilities. 
Series that do not satisfy it are certainly divergent. On the other hand, if the condition is satisfied, 
then the series can, but need not converge. A series certainly converges if it satisfies a sufficient 
condition; but if it fails to satisfy this, it may nevertheless converge. Definite conclusions can be drawn 
only if a sufficient condition is satisfied, or a necessary condition is not satisfied. Consequently 
the most useful criteria are those that are necessary and sufficient, because they allow one to distinguish
at once between convergence and divergence. 

A series converges if the sequence of its partial sums converges, in other words, if all its partial sums from some index $n_0$ onwards lie in an $\varepsilon$-neighbourhood of the limit $s$. But the partial sum $s_{n+1}$ arises from $s_n$ by the addition of $a_{n+1}$. A necessary condition for the convergence of the sequence $\{s_n\}$ of partial sums is therefore that the sequence $\{a_n\}$ of the terms of the series is a null sequence.

For the series $\sum a_n$ to converge it is necessary, but not in general sufficient, that its terms form a null sequence, or that $\lim_{n\to\infty} a_n = 0$.

The first main test for convergence: For the series $\sum a_n$ of positive terms to converge it is necessary and sufficient that the sequence of its partial sums is bounded.


Since the series has only positive terms, the sequence of partial sums increases monotonically. 
If it is also bounded, then it must be convergent, by the first convergence test for sequences. Since 
this criterion does not require strict monotonic behaviour, the statement also holds for series with 
non-negative terms. 


Example 1: The terms of the harmonic series $\sum_{n=1}^{\infty} 1/n = 1 + 1/2 + 1/3 + 1/4 + \cdots$ form a null sequence; but the series itself diverges, since the sequence $\{s_n\}$ of its partial sums is not bounded. For $s_n$ exceeds every number $C$ when $n > 2^m$ and $m > 2C$:

$$\begin{aligned}
s_n &> (1 + 1/2) + (1/3 + 1/4) + (1/5 + \cdots + 1/8) + \cdots + (1/(2^{m-1} + 1) + \cdots + 1/2^m) \\
&> 1/2 + 2 \cdot (1/4) + 4 \cdot (1/8) + \cdots + 2^{m-1} \cdot (1/2^m) = m/2; \qquad s_n > C.
\end{aligned}$$


Example 2: The series $\sum_{n=2}^{\infty} 1/[(n-1)n] = 1/(1 \cdot 2) + 1/(2 \cdot 3) + 1/(3 \cdot 4) + \cdots$ has the nth partial sum

$$\begin{aligned}
s_n &= 1/(1 \cdot 2) + 1/(2 \cdot 3) + \cdots + 1/[n(n+1)] \\
&= (1 - 1/2) + (1/2 - 1/3) + (1/3 - 1/4) + \cdots + [1/n - 1/(n+1)] = 1 - 1/(n+1).
\end{aligned}$$

Since $\{1/(n+1)\}$ is a null sequence, the sequence $\{s_n\} = \{1 - 1/(n+1)\}$ is bounded. The given series converges and has the sum 1.
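The contrast between the two examples shows up clearly in exact arithmetic. The following sketch is only an illustration; the function names are assumptions. It evaluates the harmonic partial sums at $n = 2^m$, where the bound $m/2$ applies, and the telescoping partial sums in closed form.

```python
from fractions import Fraction

def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

def telescoping(n):
    """Partial sum 1/(1*2) + 1/(2*3) + ... + 1/(n(n+1))."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

for m in (2, 4, 6, 8, 10):
    # the harmonic partial sum at 2^m always exceeds m/2
    print(m, float(harmonic(2**m)), m / 2)

print(telescoping(99))    # equals 1 - 1/100
```

The harmonic partial sums grow without bound, although slowly, whereas the telescoping sums stay below 1.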




Comparison test. A series whose terms are not smaller than those of a given series with positive 
terms is said to dominate or majorize the given series. If it converges, then by the first test for con- 
vergence its sequence of partial sums is bounded. The fact that it dominates the given series implies 
that the sequence of partial sums of that series is also bounded, and thus also converges. In exactly 
the same way one can conclude that a given series with positive terms diverges if there is a correspond- 
ing divergent comparison series whose terms are not greater than those of the given series. Such a 
comparison series is said to be subordinate to the given series. 


It is sufficient for the convergence of a series that it is dominated by a convergent series; it is 
sufficient for the divergence of a series that it dominates a divergent series. 


In order to be able to use comparison tests, one must have available a sufficiently large supply of known convergent or divergent series. Series of the form $\sum 1/n^{\alpha}$ can often be used as comparison series. The series $\sum 1/n^2$ converges, since $1/n^2 = 1/(n \cdot n) < 1/[(n-1)n]$ for $n \ge 2$, and it is therefore dominated by the convergent series $\sum 1/[(n-1)n]$ (see Example 2). For $\alpha > 2$, $\sum 1/n^{\alpha}$ converges, since $1/n^{\alpha} < 1/n^2$ for $n > 1$. Because $1/n \le 1/\sqrt{n}$, the series $\sum 1/\sqrt{n} = 1 + 1/\sqrt{2} + 1/\sqrt{3} + \cdots$ dominates the divergent harmonic series $\sum 1/n$, and therefore diverges. Since $1/n < 1/n^{\alpha}$ for all $\alpha < 1$ and $n > 1$, the series $\sum 1/n^{\alpha}$ diverges for every $\alpha < 1$. For $\alpha > 2$ the convergence of the series has already been established. For $1 < \alpha < 2$ a dominating geometric series can be found. If $k$ is an integer such that $2^k > n$, then

$$\begin{aligned}
s_n < s_{2^k - 1} &= 1 + (1/2^{\alpha} + 1/3^{\alpha}) + (1/4^{\alpha} + 1/5^{\alpha} + 1/6^{\alpha} + 1/7^{\alpha}) + \cdots \\
&\qquad + (1/(2^{k-1})^{\alpha} + \cdots + 1/(2^k - 1)^{\alpha}) \\
&< 1 + 2/2^{\alpha} + 4/4^{\alpha} + \cdots + 2^{k-1}/(2^{k-1})^{\alpha} \\
&= 1 + 1/2^{\alpha - 1} + (1/2^{\alpha - 1})^2 + \cdots + (1/2^{\alpha - 1})^{k-1}.
\end{aligned}$$

This is a geometric series with common ratio $q = 1/2^{\alpha - 1}$, which is less than 1 for $1 < \alpha < 2$. Thus, the geometric series converges, and therefore the given series also converges.
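The geometric bound obtained above, $1/(1 - 2^{1-\alpha})$, can be compared with computed partial sums. The sketch below uses floating-point arithmetic and a few sample exponents chosen only for illustration; the function names are assumptions.

```python
def p_series_partial(alpha, n):
    """Partial sum 1 + 1/2^alpha + ... + 1/n^alpha."""
    return sum(1.0 / k**alpha for k in range(1, n + 1))

def geometric_bound(alpha):
    """Sum of the dominating geometric series with ratio q = 1/2^(alpha - 1)."""
    q = 1.0 / 2**(alpha - 1)
    return 1.0 / (1.0 - q)

for alpha in (1.2, 1.5, 1.9):
    print(alpha, p_series_partial(alpha, 10000), geometric_bound(alpha))

print(p_series_partial(1.0, 10000))   # harmonic case: no such bound exists
```

For each exponent between 1 and 2 the partial sums remain below the bound, while for $\alpha = 1$ they continue to grow.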


The ratio test: If $\sum a_n$ is a series of positive terms and if there exists a positive number $q$ less than 1 such that $a_{n+1}/a_n \le q$ from an index $n_0$ onwards, then the series converges. On the other hand, if from an $n_0$ onwards $a_{n+1}/a_n \ge 1$, then the series diverges.


By hypothesis:

$$\begin{aligned}
a_{n_0+1}/a_{n_0} \le q &\;\to\; a_{n_0+1} \le q\,a_{n_0}, \\
a_{n_0+2}/a_{n_0+1} \le q &\;\to\; a_{n_0+2} \le q\,a_{n_0+1} \le q^2 a_{n_0}, \text{ etc.}
\end{aligned}$$

It follows that the series formed by the terms of the remainder of the given series from the $n_0$th term onwards is dominated by the geometric series $\sum a_{n_0} q^n$. But the convergence of this remainder series decides the convergence of the given series, which therefore converges for $q < 1$. On the other hand, if $a_{n+1}/a_n \ge 1$ it follows that $a_{n+1} \ge a_n > 0$; the terms of the series do not form a null sequence, and the series must diverge. The condition is sufficient. If it is not satisfied, nothing can be concluded about the convergence or divergence of the series. This is the case, for example, if the quotient is less than 1, but not less than a fixed number $q$ less than 1. The ratio of two successive terms of the harmonic series, known to be divergent, is $a_{n+1}/a_n = n/(n+1) < 1$, but there is no fixed $q$ for which $n/(n+1) \le q < 1$ holds for all sufficiently large $n$. For the convergent series $\sum_{n=1}^{\infty} 1/n^2$, however, $a_{n+1}/a_n = (n/(n+1))^2 < 1$, but again there is no fixed $q$ with $(n/(n+1))^2 \le q < 1$. In both cases $\lim_{n\to\infty} a_{n+1}/a_n = 1$.


The ratio test fails if the sequence $\{a_{n+1}/a_n\}$ tends to 1 'from the left'.




The root test: If $\sum a_n$ is a series of positive terms, and if there exists a positive number $q < 1$ such that $\sqrt[n]{a_n} \le q$ from some index $n_0$ onwards, then the series converges; on the other hand, if $\sqrt[n]{a_n} \ge 1$ from some index $n_0$ onwards, then the series diverges.


Example: The series $\sum n!/n^n = 1!/1 + 2!/2^2 + 3!/3^3 + 4!/4^4 + \cdots$ converges by the ratio test, since $a_n = n!/n^n$, $a_{n+1} = (n+1)!/(n+1)^{n+1}$ gives

$$a_{n+1}/a_n = [(n+1)!\,n^n]/[(n+1)^{n+1}\,n!] = [(n+1)\,n^n]/(n+1)^{n+1} = [n/(n+1)]^n = 1/(1 + 1/n)^n \le 1/2 < 1 \text{ for all } n.$$
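The behaviour of the quotient $a_{n+1}/a_n$ in the three cases discussed (the series $\sum n!/n^n$, the harmonic series, and $\sum 1/n^2$) can be tabulated exactly. This is a sketch for illustration; the helper `ratio` and the lambda names are assumptions.

```python
from fractions import Fraction
from math import factorial

def ratio(term, n):
    """Quotient a_{n+1} / a_n of two successive terms."""
    return term(n + 1) / term(n)

a = lambda n: Fraction(factorial(n), n**n)   # n!/n^n: ratio stays at or below 1/2
h = lambda n: Fraction(1, n)                 # harmonic series: ratio n/(n+1) -> 1
p2 = lambda n: Fraction(1, n * n)            # 1/n^2: ratio (n/(n+1))^2 -> 1

ratios_a = [ratio(a, n) for n in range(1, 30)]
print(float(ratios_a[0]), float(ratios_a[-1]))
print(float(ratio(h, 1000)), float(ratio(p2, 1000)))
```

For $n!/n^n$ a fixed bound $q = 1/2$ exists, so the test decides; for the other two series the quotients creep up to 1 and the test gives no information, although one series diverges and the other converges.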


If $\sqrt[n]{a_n} \le q < 1$ for all $n \ge n_0$, then $a_n \le q^n$, so the convergent geometric series $\sum q^n$ dominates the remainder $\sum_{n=n_0}^{\infty} a_n$ of the given series. If however $\sqrt[n]{a_n} \ge 1$, then $a_n \ge 1$ and the terms of the series do not form a null sequence.

The condition is sufficient. It fails, for example, if $\sqrt[n]{a_n}$ tends to 1 'from the left'.


Example 1: The series $\sum \alpha^n/n^n$ converges for every fixed $\alpha > 0$. Here $\sqrt[n]{\alpha^n/n^n} = \alpha/n$, which is less than 1/2 for all $n > 2\alpha$. By the root test $\sum (\alpha^n/n^n)$ is convergent.

Example 2: For the series $\sum (1 - 1/n)^n$ the root test fails, since $\lim_{n\to\infty} \sqrt[n]{a_n} = \lim_{n\to\infty} (1 - 1/n) = 1$. However, one can show that $\lim_{n\to\infty} a_n = \lim_{n\to\infty} (1 - 1/n)^n = 1/e$. Thus the terms $a_n$ of the series do not form a null sequence and the series diverges.
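Both examples can be illustrated numerically. The sketch below is an assumption-laden illustration: $\alpha = 3$ is an arbitrary sample value, floating-point arithmetic is used, and $n$ is kept moderate so that $\alpha^n/n^n$ does not underflow.

```python
import math

def nth_root_of_term(term, n):
    """n-th root of the n-th term, the quantity examined by the root test."""
    return term(n) ** (1.0 / n)

alpha = 3.0                                  # arbitrary sample value
a = lambda n: (alpha / n) ** n               # alpha^n / n^n
b = lambda n: (1.0 - 1.0 / n) ** n           # (1 - 1/n)^n

print([round(nth_root_of_term(a, n), 4) for n in (7, 10, 20, 40)])   # equals alpha/n
print(nth_root_of_term(b, 1000))             # close to 1: the test fails
print(b(10**6), 1 / math.e)                  # the terms themselves tend to 1/e
```

In the first case the roots fall below 1/2 as soon as $n > 2\alpha$; in the second the roots approach 1 while the terms approach $1/e$, so the necessary condition for convergence is violated.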


Convergence tests for series with arbitrary terms. Applying the second, or Cauchy, convergence test for sequences to the sequence $\{s_n\}$ of partial sums of the series $\sum a_n$, one obtains the second test for convergence of series.

The second main test for convergence: The series $\sum a_n$ converges if and only if to every arbitrary positive number $\varepsilon$ there corresponds an integer $N(\varepsilon)$ such that

$$|s_{n+p} - s_n| = |a_{n+1} + a_{n+2} + \cdots + a_{n+p}| < \varepsilon \text{ for all } n > N(\varepsilon) \text{ and all } p \ge 1.$$

Of course, this test is not easy to handle. It is simpler to prove the convergence of $\sum a_n$ by establishing the convergence of $\sum |a_n|$, which dominates it. On the other hand, if the dominating series diverges, one can draw no conclusion about the given series. In spite of this the following statements about divergence are correct, because the terms of series that satisfy one of these conditions do not form a null sequence, and therefore violate a necessary condition for convergence.


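A series $\sum a_n$ with arbitrary terms converges if it satisfies one of the following conditions:

$$|a_{n+1}/a_n| \le q < 1 \text{ for all } n \ge n_0; \quad \lim_{n\to\infty} |a_{n+1}/a_n| < 1; \quad \sqrt[n]{|a_n|} \le q < 1 \text{ for all } n \ge n_0; \quad \lim_{n\to\infty} \sqrt[n]{|a_n|} < 1.$$

On the other hand, the series diverges if it satisfies one of the following conditions:

$$|a_{n+1}/a_n| \ge 1 \text{ for all } n \ge n_0; \quad \sqrt[n]{|a_n|} \ge 1 \text{ for all } n \ge n_0; \quad \lim_{n\to\infty} |a_{n+1}/a_n| > 1; \quad \lim_{n\to\infty} \sqrt[n]{|a_n|} > 1.$$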


Leibniz’ test for convergence: An alternating series converges if the absolute values of its terms 
form a monotonic null sequence. 


If the alternating series is written in the form $a_1 - a_2 + a_3 - a_4 + \cdots$, the $a_i$ denote the absolute values of the terms. The subsequence $s_2, s_4, s_6, \ldots, s_{2n}, \ldots$ of its partial sums is monotonic increasing, since $s_{2k+2} = s_{2k} + (a_{2k+1} - a_{2k+2}) \ge s_{2k}$.

The expression in the bracket is non-negative, since the sequence $\{a_n\}$ of absolute values of the terms decreases monotonically. For the same reason it follows from

$$s_{2n} = a_1 - (a_2 - a_3) - (a_4 - a_5) - \cdots - (a_{2n-2} - a_{2n-1}) - a_{2n}$$

that $s_{2n} \le a_1$. As a monotonic increasing and bounded sequence, $\{s_{2n}\}$ has a limit $s$. It then follows that the subsequence $s_1, s_3, s_5, \ldots, s_{2n+1}, \ldots$ has the same limit. For $\lim_{n\to\infty} s_{2n+1} = \lim_{n\to\infty} (s_{2n} + a_{2n+1}) = s$, because $\lim_{n\to\infty} a_{2n+1} = 0$. Thus, the sequence $\{s_n\}$ of partial sums converges. The arguments also show that the sequence $\{|a_n|\}$ need not be strictly decreasing.


Example 1: The series $\sum_{n=1}^{\infty} (-1)^{n-1}/n = 1 - 1/2 + 1/3 - 1/4 + \cdots$ is convergent (its sum is $\ln 2$).

Example 2: The alternating series $1 - 1/5 + 1/2 - 1/5^2 + 1/3 - 1/5^3 + 1/4 - \cdots$ diverges. The absolute values of its terms form a null sequence, but this sequence is not monotonic. The $(2n)$th partial sum can be arranged as follows:

$$s_{2n} = (1 + 1/2 + 1/3 + \cdots + 1/n) - (1/5 + 1/5^2 + \cdots + 1/5^n).$$

As $n$ increases, the first part exceeds every finite number, while the second part, as a geometric series, tends to a finite limit.
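The two examples can be compared numerically. The sketch below uses floating-point partial sums and hypothetical function names; the number of terms is chosen only for illustration.

```python
import math

def alt_harmonic(n):
    """Partial sum 1 - 1/2 + 1/3 - ... with n terms."""
    return sum((-1) ** (k - 1) / k for k in range(1, n + 1))

def example2_even_partial(n):
    """s_{2n} of 1 - 1/5 + 1/2 - 1/5^2 + 1/3 - 1/5^3 + ..."""
    harmonic_part = sum(1.0 / k for k in range(1, n + 1))
    geometric_part = sum(0.2 ** k for k in range(1, n + 1))
    return harmonic_part - geometric_part

print(alt_harmonic(100000), math.log(2))
print([round(example2_even_partial(n), 3) for n in (10, 50, 200)])
```

The first series settles near $\ln 2$, while the even partial sums of the second keep increasing, because the geometric part never exceeds 1/4 and cannot compensate the harmonic part.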

Calculations with convergent series. The rules that are valid for calculations with finite sums can 
be applied only partially to convergent infinite series (see the following Theorems 1, 2, 3); some 
can be applied only to series that satisfy stronger convergence conditions. 


1. The convergence and the limit of a convergent series are not altered if the terms are bracketed together, but without altering their order.


If $S = \sum_{i=1}^{\infty} a_i = a_1 + a_2 + a_3 + \cdots$, then it is also true that $S = \sum_{i=1}^{\infty} A_i$, where $A_1 = (a_1 + a_2 + \cdots + a_{n_1})$, $A_2 = (a_{n_1+1} + a_{n_1+2} + \cdots + a_{n_2})$, $A_3 = (a_{n_2+1} + \cdots + a_{n_3}), \ldots$ This follows because the sequence $\{S_n\}$ of partial sums of the series $\sum A_i$ is a subsequence of the sequence $\{s_n\}$ of partial sums of the series $\sum a_i$, and this is known to have the same limit as $\{s_n\}$.


2. If every term $a_n$ of a series converging to the sum $s$ is multiplied by a constant $c$, the resulting series converges to the sum $cs$.

By a theorem on sequences, $\{s_n\} \to s$ implies that $\{c\,s_n\} \to cs$.

3. Term-by-term addition of the convergent series $\sum a_n = A$ and $\sum b_n = B$ gives the series $\sum (a_n + b_n)$, which also converges and has the sum $A + B$.

If $\{A_n\}$ and $\{B_n\}$ are the sequences of partial sums of the series $\sum a_i$ and $\sum b_i$, then by a theorem on sequences, $\{A_n\} \to A$ and $\{B_n\} \to B$ implies that $\{A_n + B_n\} \to A + B$.


Absolute convergence. A series $\sum a_n$ with arbitrary terms is said to be absolutely convergent if the series $\sum |a_n|$ of their absolute values converges.

The series $1 - 1/2 + 1/3 - 1/4 + \cdots$ is not absolutely convergent, since the series of absolute values is the divergent harmonic series. If the series $\sum a_i$ and $\sum |a_i|$ have the sums $s$ and $S$, then since $|s_n| = |a_1 + a_2 + \cdots + a_n| \le |a_1| + |a_2| + \cdots + |a_n| \le S$, it clearly follows that $|s| \le S$.

Rearrangement of series. The question now arises to what extent the commutative law for finite sums is also valid for infinite series. Let $\sum_{n=1}^{\infty} a_n$ be a series and $(k_1, k_2, \ldots, k_n, \ldots)$ a sequence of natural numbers with the property that it contains every natural number exactly once. The series $\sum_{n=1}^{\infty} a_{k_n}$ is then called a rearrangement of the series $\sum a_n$. For example, the series

$$1 - 1/2 + 1/3 - 1/4 + 1/5 - \cdots \quad\text{and}\quad 1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + \cdots$$

are obtained from one another by rearrangement. Both converge, but to different sums $s_1$ and $s_2$. This shows that one has to use care in rearranging series.

$$\begin{aligned}
s_1 &= 1 - 1/2 + 1/3 - 1/4 + 1/5 - 1/6 + 1/7 - \cdots \\
&= 1 - 1/2 + 1/3 - (1/4 - 1/5) - (1/6 - 1/7) - \cdots \\
&= 10/12 - (1/4 - 1/5) - (1/6 - 1/7) - \cdots < 10/12
\end{aligned}$$

and

$$\begin{aligned}
s_2 &= 1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + \cdots \\
&= (1 + 1/3 - 1/2) + (1/5 + 1/7 - 1/4) + (1/9 + 1/11 - 1/6) + \cdots \\
&= 5/6 + 13/140 + (1/9 + 1/11 - 1/6) + \cdots > 11/12,
\end{aligned}$$

since the expression $[1/(2n-3) + 1/(2n-1) - 1/n] = (4n-3)/[(2n-3)(2n-1)n]$, taken for $n = 2, 4, 6, \ldots$, is always positive.
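The two different sums can be observed numerically by adding the terms in the two orders. The sketch below groups the terms as in the estimates above; the function names are assumptions, and the limiting values quoted in the comments ($\ln 2$ and $\tfrac{3}{2}\ln 2$) are well-known results rather than statements taken from the text.

```python
import math

def original_sum(groups):
    """1 - 1/2 + 1/3 - 1/4 + ..., summed in pairs (one positive, one negative term)."""
    return sum(1.0 / (2 * m - 1) - 1.0 / (2 * m) for m in range(1, groups + 1))

def rearranged_sum(groups):
    """(1 + 1/3 - 1/2) + (1/5 + 1/7 - 1/4) + ...: two positive terms, then one negative."""
    return sum(1.0 / (4 * m - 3) + 1.0 / (4 * m - 1) - 1.0 / (2 * m)
               for m in range(1, groups + 1))

s1 = original_sum(100000)      # approaches ln 2
s2 = rearranged_sum(100000)    # approaches (3/2) ln 2
print(s1, math.log(2))
print(s2, 1.5 * math.log(2))
```

The computed values respect the bounds $s_1 < 10/12$ and $s_2 > 11/12$, so the same terms in a different order give a different sum.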




Such convergent series whose sum depends on the ordering of the terms are called conditionally 
convergent. Series that remain convergent and have the same sum, no matter how they are rearranged, 
are called unconditionally convergent. The important question for calculation with series, how one 
can distinguish unconditionally convergent series from conditionally convergent ones, or in other 
words, how one recognizes whether one may disregard the ordering of the terms of a series or not, 
is answered in a surprisingly simple way by the following theorem: 


Every absolutely convergent series is also unconditionally convergent; every series that is convergent 
but not absolutely convergent is only conditionally convergent. 


The first part of the theorem is easily established. Firstly, if $\sum a_n$ is an absolutely convergent series with non-negative terms and $\sum a_{k_n}$ is a series obtained from it by rearrangement, then the partial sums $s_n$ of the first series and $s'_n$ of the rearranged series satisfy the inequality

$$s'_n = a_{k_1} + a_{k_2} + \cdots + a_{k_n} \le a_1 + a_2 + \cdots + a_N = s_N \le s$$

if $N$ is chosen so large that all the $k_i$ $(i = 1, 2, \ldots, n)$ occur among the numbers $1, 2, \ldots, N$. Thus, the sequence $\{s'_n\}$ of the partial sums of the rearranged series $\sum a_{k_n}$ is bounded, and its convergence follows from the first main test for convergence. If now $\sum a_n$ is an absolutely convergent series with arbitrary terms and $\sum a_{k_n}$ is a rearrangement of it, then $\sum |a_n|$ is an absolutely convergent series with non-negative terms; the convergence of $\sum |a_{k_n}|$ follows, as has already been shown, and this implies the convergence of $\sum a_{k_n}$.


The rearrangement also does not influence the sum. Because $\sum a_n$ is assumed to be absolutely convergent, corresponding to an arbitrary $\varepsilon > 0$ a number $m$ can be chosen so that for all $k \ge 1$

$$|a_{m+1}| + |a_{m+2}| + \cdots + |a_{m+k}| < \varepsilon.$$

If $N$ is now chosen so large that all the indices $1, 2, \ldots, m$ occur among the numbers $k_1, k_2, \ldots, k_N$, then the difference $|s'_n - s_n|$ for $n > N$ contains only terms $a_i$ with $i > m$, and from this it follows that $|s'_n - s_n| < \varepsilon$ for all $n > N$. Consequently

$$s' = \lim_{n\to\infty} s'_n = \lim_{n\to\infty} [s_n + (s'_n - s_n)] = \lim_{n\to\infty} s_n + \lim_{n\to\infty} (s'_n - s_n) = s + 0 = s.$$

To prove the second part of the theorem one can show that from the convergent, but not absolutely convergent series $\sum a_n$, divergent series or convergent series with arbitrarily prescribed sums can be formed by suitable rearrangement.
On the other hand, absolutely convergent series can be rearranged in an essentially more general sense, without affecting their convergence or their sums. Let $\sum a_n$ be an absolutely convergent series and the following scheme denote an infinite sequence of partial series of the given series $\sum a_n$, with the property that every term of the series $\sum a_n$ occurs in exactly one of the partial series ($a_{ki}$, for example, is the $i$th term in the $k$th partial series):

$$\begin{aligned}
a_{11} + a_{12} + \cdots + a_{1i} + \cdots &= z_1 \\
a_{21} + a_{22} + \cdots + a_{2i} + \cdots &= z_2 \\
\cdots\cdots\cdots & \\
a_{k1} + a_{k2} + \cdots + a_{ki} + \cdots &= z_k \\
\cdots\cdots\cdots &
\end{aligned}$$

Then the series $\sum_{k=1}^{\infty} z_k$ is obtained from the given absolutely convergent series $\sum_{n=1}^{\infty} a_n$ by a 'rearrangement in an extended sense'. $\sum_{k=1}^{\infty} z_k$ is likewise absolutely convergent and has the same sum $s$ as the series $\sum_{n=1}^{\infty} a_n$.

Very remarkable is the fact that, under certain conditions, the converse of this theorem holds. 
That is, the terms of the absolutely convergent partial series can be put together in an arbitrary way 
to form a series, and all possible series resulting in this way are convergent and have the same sum. 
The information about this is given by the major rearrangement theorem going back to CAUCHY. 


Major rearrangement theorem. Suppose that the following array is a sequence of absolutely convergent series,

$$\begin{aligned}
a_{11} + a_{12} + \cdots + a_{1i} + \cdots &= z_1 \\
a_{21} + a_{22} + \cdots + a_{2i} + \cdots &= z_2 \\
\cdots\cdots\cdots & \\
a_{k1} + a_{k2} + \cdots + a_{ki} + \cdots &= z_k \\
\cdots\cdots\cdots &
\end{aligned}$$

that is, for $k = 1, 2, \ldots$, each of the series $\sum_{i=1}^{\infty} |a_{ki}|$ converges and has a sum denoted by $\sigma_k$. If, in addition, $\sum_{k=1}^{\infty} \sigma_k$ converges, then the terms occurring one below another in the same column of the given array likewise form absolutely convergent series. If one writes $\sum_{k=1}^{\infty} a_{ki} = s_i$, then the series $\sum_{i=1}^{\infty} s_i$ also converges absolutely, and $\sum_{i=1}^{\infty} s_i = \sum_{k=1}^{\infty} z_k$. Thus, the series of the row sums and the series of the column sums are both absolutely convergent and have the same sum.




To prove this one puts together all the terms $a_{ki}$ occurring in the given array in any manner to form a sequence which one denotes by $a_1, a_2, \ldots, a_n, \ldots$ Then the series $\sum a_n$ converges absolutely, because if $N$ is chosen so large that the terms $a_1, a_2, \ldots, a_n$ all occur in the first $N$ rows of the array,

$$|a_1| + |a_2| + \cdots + |a_n| \le \sigma_1 + \sigma_2 + \cdots + \sigma_N.$$

Because $\sum \sigma_k$ is assumed to be convergent, the right-hand side is bounded, and it follows that the nth partial sum of the series $\sum |a_n|$ on the left-hand side is also bounded, and consequently $\sum a_n$ is absolutely convergent. The column series $\sum_k a_{ki} = s_i$, as partial series of $\sum a_n$, are also absolutely convergent, and $|s_i| = |\sum_k a_{ki}| \le \sum_k |a_{ki}|$. From this it follows that the nth partial sum of $\sum |s_i|$ certainly does not exceed the sum of the series $\sum |a_n|$; this means, however, that $\sum s_i$ is absolutely convergent. Finally, the series $\sum s_i$ and $\sum z_k$ have the same sum, since each sum is equal to the sum $s$ of $\sum a_n$. Because of the absolute convergence of $\sum a_n$, according to the second main test one can choose the index $m$ so that for all $k \ge 1$, $|a_{m+1}| + |a_{m+2}| + \cdots + |a_{m+k}| < \varepsilon$. One now determines $N$ so that the terms $a_1, a_2, \ldots, a_m$ all occur in the first $N$ rows of the above array. Denoting the nth partial sum of the series $\sum a_n$ by $\sigma_n$, the difference $|\sum_{k=1}^{n} z_k - \sigma_n|$ is less than $\varepsilon$ for all $n > N$, since only terms $\pm a_r$ with $r > m$ occur in this expression. Thus, $\lim_{n\to\infty} \sum_{k=1}^{n} z_k = \lim_{n\to\infty} \sigma_n = s$. An analogous calculation for the column series yields $\lim_{n\to\infty} \sum_{i=1}^{n} s_i = \lim_{n\to\infty} \sigma_n = s$.


Multiplication of series. If one multiplies every term of the series $\sum a_i$ by every term of the series $\sum b_j$, one obtains the partial products indicated in the array below. Each row of the array contains infinitely many terms all having the same $a_i$ as a factor, and each column contains infinitely many terms all having the same $b_j$ as a factor. The product of the two series is now defined to be the series $\sum_{k=1}^{\infty} c_k$, where $c_k$ is the sum of the partial products in the $k$th diagonal of the array as indicated. For example, $c_1 = a_1 b_1$, $c_2 = a_1 b_2 + a_2 b_1$, $c_3 = a_1 b_3 + a_2 b_2 + a_3 b_1, \ldots$, $c_k = \sum_{i+j=k+1} a_i b_j, \ldots$ These partial products can be found by a translation method, in which one series is written in reverse order and the other is written on a strip of paper which is moved along above the first one. The diagram shows the position for the third term of the product series, $c_3 = a_1 b_3 + a_2 b_2 + a_3 b_1$.

$$\begin{array}{ccccc}
a_1 b_1 & a_1 b_2 & a_1 b_3 & a_1 b_4 & \cdots \\
a_2 b_1 & a_2 b_2 & a_2 b_3 & a_2 b_4 & \cdots \\
a_3 b_1 & a_3 b_2 & a_3 b_3 & a_3 b_4 & \cdots \\
a_4 b_1 & a_4 b_2 & a_4 b_3 & a_4 b_4 & \cdots
\end{array}$$


If the series $\sum a_i = A$ and $\sum b_j = B$ are both absolutely convergent, then the product series $\sum_{k=1}^{\infty} c_k = C$, with $c_k = \sum_{i+j=k+1} a_i b_j$, is also absolutely convergent and has the sum $C = AB$.
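The diagonal sums $c_k$ are straightforward to compute. The sketch below, in which the function `cauchy_product_term` and the two sample series are assumptions chosen for illustration, forms the product of the absolutely convergent series $1/2 + 1/4 + 1/8 + \cdots$ with itself and, for contrast, the diagonal sums for the conditionally convergent series $1 - 1/\sqrt{2} + 1/\sqrt{3} - \cdots$.

```python
import math

def cauchy_product_term(a, b, k):
    """c_k = sum of a_i * b_j over all i + j = k + 1 (indices start at 1)."""
    return sum(a(i) * b(k + 1 - i) for i in range(1, k + 1))

geo = lambda n: 0.5 ** n                             # 1/2 + 1/4 + ... has the sum A = 1
alt = lambda n: (-1) ** (n - 1) / math.sqrt(n)       # converges, but not absolutely

product_sum = sum(cauchy_product_term(geo, geo, k) for k in range(1, 60))
print(product_sum)                                   # close to A * A = 1

print([round(cauchy_product_term(alt, alt, k), 3) for k in range(1, 8)])
```

For the geometric series the product series reproduces $AB$, whereas for the second series the diagonal sums stay at absolute value at least 1 and therefore cannot form a null sequence.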


The following example shows that the convergence of the two ‘factor series’ is not enough to 
ensure the convergence of the product series. 


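Example: The square of the convergent, but not absolutely convergent, series $1 - 1/\sqrt{2} + 1/\sqrt{3} - 1/\sqrt{4} + \cdots$ is divergent, because its terms do not form a null sequence. The general term $c_n$ of the product of the series with itself satisfies

$$\begin{aligned}
|c_n| &= [1/\sqrt{1}] \cdot [1/\sqrt{n}] + [1/\sqrt{2}] \cdot [1/\sqrt{n-1}] + [1/\sqrt{3}] \cdot [1/\sqrt{n-2}] + \cdots + [1/\sqrt{n}] \cdot [1/\sqrt{1}] \\
&\ge [1/\sqrt{n}] \cdot [1/\sqrt{n}] + [1/\sqrt{n}] \cdot [1/\sqrt{n}] + \cdots + [1/\sqrt{n}] \cdot [1/\sqrt{n}] = n \cdot (1/n) = 1.
\end{aligned}$$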


18.3. Limit of a function — Continuity 


Limit and continuity of a function are concepts without which a rigorous construction of higher 
analysis is impossible. If a function describes a physical situation, then the concepts of limit and 
continuity often have a physical meaning also. 




Limit of a function 


Limit at a point. The concept of the limit of a function $y = f(x)$ can be related to the concept of the limit of a sequence. This is done by allowing the independent variable $x$ to run through a convergent sequence of numbers $\{x_n\}$ tending to the limit $a$ (the abscissa sequence), and considering the ordinate sequence $\{f(x_n)\}$ of the values $f(x_n)$ of the function corresponding to the $x_n$. If the convergence behaviour of the ordinate sequence $\{f(x_n)\}$ depends upon the choice of abscissa sequence, that is, if two different abscissa sequences both converging to $a$ have corresponding ordinate sequences converging to different limits, or if an ordinate sequence diverges, then the function $f(x)$ does not tend to a limit as $x$ tends to $a$. On the other hand, if the ordinate sequence $\{f(x_n)\}$ tends to $L$ for every abscissa sequence $\{x_n\}$ tending to $a$, one says that the function $f(x)$ has the limit $L$ as $x \to a$. This means that the values $f(x)$ of the function come the nearer to the number $L$ the nearer the argument $x$ comes to the value $a$. The difference $|f(x) - L|$ between the value of the function and the limit is less than every arbitrarily chosen positive number $\varepsilon$, provided that the value of $x$ differs from $a$ by less than a suitably chosen number $\delta = \delta(\varepsilon)$ depending on $\varepsilon$, that is, provided that $0 < |x - a| < \delta(\varepsilon)$ (Fig.). The number $\delta(\varepsilon)$ is by no means uniquely determined, for if one $\delta(\varepsilon)$ with the required properties has been found, then clearly every smaller number $\delta' < \delta(\varepsilon)$ will also serve.


The function $f(x)$ has the limit $L$ as $x \to a$, $\lim_{x\to a} f(x) = L$, if to every $\varepsilon > 0$, however small, there corresponds a number $\delta(\varepsilon) > 0$, such that the inequality $|f(x) - L| < \varepsilon$ holds for every $x$ satisfying the condition $0 < |x - a| < \delta(\varepsilon)$.

18.3-1 Geometrical illustration of the limit concept


Example 1: The function $x^2$ has the limit zero as the argument $x$ tends to zero, $\lim_{x\to 0} x^2 = 0$, since $|x^2 - 0| < \varepsilon$ for all $x$ such that $|x - 0| < \delta(\varepsilon) \le \sqrt{\varepsilon}$.

Example 2: The function $1/x$ tends to the limit $1/a$ as $x$ tends to $a \ne 0$, $\lim_{x\to a} 1/x = 1/a$. This follows from a theorem on number sequences, since for every abscissa sequence $\{x_n\} \to a \ne 0$, the corresponding ordinate sequence $\{1/x_n\} \to 1/a$.

Example 3: From the existence of the value $f(a)$ of the function one can certainly not conclude that the limit $\lim_{x\to a} f(x)$ must also exist and be equal to $f(a)$, though this is very often the case. The function $f(x) = \{+1 \text{ for } x \ne 0,\ 0 \text{ for } x = 0\}$, for example, has the limit 1 as $x \to 0$, but the value of the function is $f(0) = 0$ (Fig.).

Example 4: The function $f(x) = (x^2 - 4)/(x - 2)$ is not defined for $x = 2$, since the numerator and denominator vanish simultaneously. But for $x \ne 2$, $f(x) = x + 2$, and so, as $x \to 2$, the function tends to the limit 4, $\lim_{x\to 2} [(x^2 - 4)/(x - 2)] = 4$; for $|(x^2 - 4)/(x - 2) - 4| = |x + 2 - 4| = |x - 2| < \varepsilon$ for all $x$ for which $0 < |x - 2| < \delta(\varepsilon) = \varepsilon$.
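The $\varepsilon$-$\delta$ condition can be probed by sampling points near $a$. The following is only a numerical sketch: sampling cannot prove a limit, and the helper `check_limit` with its sampling scheme is an assumption made for illustration. It tests the choices $\delta(\varepsilon) = \varepsilon$ from Example 4 and $\delta(\varepsilon) = \sqrt{\varepsilon}$ from Example 1.

```python
import math

def check_limit(f, a, L, eps, delta, samples=1000):
    """Test |f(x) - L| < eps at sample points with 0 < |x - a| < delta."""
    for i in range(1, samples + 1):
        h = delta * i / (samples + 1)        # 0 < h < delta
        for x in (a - h, a + h):
            if not abs(f(x) - L) < eps:
                return False
    return True

f4 = lambda x: (x * x - 4.0) / (x - 2.0)     # Example 4, undefined at x = 2
f1 = lambda x: x * x                         # Example 1

print(all(check_limit(f4, 2.0, 4.0, eps, eps) for eps in (0.5, 0.1, 0.01, 0.001)))
print(all(check_limit(f1, 0.0, 0.0, eps, math.sqrt(eps)) for eps in (0.5, 0.1, 0.01)))
print(check_limit(f4, 2.0, 4.0, 0.01, 0.1))  # delta chosen too large for this eps
```

The last call shows that a $\delta$ that is too large for the given $\varepsilon$ is detected, whereas any smaller $\delta$ would serve equally well.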


18.3-2 Graph of the function $f(x) = \{+1 \text{ for } x \ne 0,\ 0 \text{ for } x = 0\}$

18.3-3 Left-hand limit $l^-$ and right-hand limit $l^+$ as $x \to a$ are different


One-sided limits. It may be important in the passage to the limit whether the independent variable approaches the value $a$ in the sense of increasing values of $x$, that is, from the left, or in the sense of decreasing values of $x$, that is, from the right. One speaks of a left-hand limit $l^-$ if $|f(x) - l^-| < \varepsilon$ for all $x$ with $a - \delta(\varepsilon) < x < a$, and of a right-hand limit $l^+$ if $|f(x) - l^+| < \varepsilon$ for all $x$ with $a < x < a + \delta(\varepsilon)$. One writes $\lim_{x\to a-0} f(x) = l^-$, $\lim_{x\to a+0} f(x) = l^+$, respectively, indicating symbolically by $a - 0$, $a + 0$ from which side $x$ converges to $a$. These two limits $l^-$ and $l^+$ may differ from one another, for example, at a jump discontinuity of a function $f(x)$ (Fig.). The function has no two-sided limit there. On the other hand, the function has a limit as $x \to a$ if and only if the left-hand and right-hand limits as $x \to a$ are equal.


Infinite limits. As x tends to zero in any manner whatsoever, the values of the function f(x) = 1/x²
ultimately exceed every number, however large. One writes lim_{x→0} 1/x² = +∞ and says that the function
tends to the limit plus infinity as x → 0. Similarly, lim_{x→0} (−1/x²) = −∞; the function f(x) = −1/x²
tends to the limit minus infinity as x → 0, since it is ultimately less than any number −N (N > 0),
no matter how large N is.


lim_{x→a} f(x) = +∞ (or lim_{x→a} f(x) = −∞) if to every positive number N, however large, there corresponds
a number δ(N), such that f(x) > N (or f(x) < −N) for all x with 0 < |x − a| < δ(N).


Example: The tangent function y = tan x is not defined for x = π/2, but has both a right-hand
and a left-hand infinite limit for this value: lim_{x→π/2−0} tan x = +∞, lim_{x→π/2+0} tan x = −∞.

Limit of a function at infinity. The values of the function f(x) = 1/x + 5 clearly come arbitrarily
close to the number b = 5 if the value of x is chosen sufficiently large. For example, the difference
between b and the values of the function is less than 0.000 001 for all x larger than 10⁶. In general,
|f(x) − b| < ε for all x > 1/ε. This example shows that the concept of the limit L of a function
f(x) can be extended to the case of unbounded increasing (or decreasing) abscissae.
lim_{x→+∞} f(x) = L if to every arbitrary ε > 0 there corresponds a sufficiently large ω(ε) > 0 such
that |f(x) − L| < ε for all x > ω(ε). Similarly, lim_{x→−∞} f(x) = L if to every arbitrary ε > 0 there
corresponds a sufficiently large ω(ε) > 0 such that |f(x) − L| < ε for all x < −ω(ε).
The limits lim_{x→+∞} f(x) and lim_{x→−∞} f(x) of the function f(x), if they exist, describe the behaviour of the
function at infinity, that is, for very large positive and very large negative values of x.
Example 1: lim_{x→+∞} 1/x = 0, since |1/x − 0| = |1/x| < ε for all x satisfying the condition
x > ω(ε) = 1/ε.
Example 2: The limit lim_{x→∞} sin x does not exist. No matter how large a value of x, say x₀, is
chosen, because of the periodicity of the sine function there are always infinitely many abscissae
greater than x₀ for which the function takes any prescribed value between −1 and +1.


The behaviour of rational functions at infinity is dealt with in Chapter 5. 


Calculations with limits. The rules drawn up for calculation with limits of sequences can be carried 
over word-for-word to calculations with limits of functions. These rules, which have already been 
applied in the examples of the previous section, state that the operation of forming the limit can be 
interchanged with addition, subtraction, multiplication and division (if the limit L of the divisor
satisfies L ≠ 0), provided that all the limits occurring exist and are finite. The first two rules hold
also for sums and products of several functions, but not necessarily for infinite sums. A function h(x)
whose values in a neighbourhood of the point a lie between those of two functions f(x) and g(x) that
both have the limit L as x → a also has the limit L.


If lim_{x→a} f(x) = L and lim_{x→a} g(x) = L, and if the inequality f(x) ≤ h(x) ≤ g(x) holds in a neighbourhood
of a, then lim_{x→a} h(x) = L also.


Example 1: lim_{x→∞} (sin x)/x = 0. Because sin x lies between −1 and +1, for x > 0 the inequality
−1/x ≤ (sin x)/x ≤ 1/x holds. The result follows, since lim_{x→∞} 1/x = lim_{x→∞} (−1/x) = 0 (Fig.).

Example 2: lim_{x→0} x sin (1/x) = 0, since −|x| ≤ x sin (1/x) ≤ |x| and lim_{x→0} |x| = lim_{x→0} (−|x|) = 0.
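
The enclosure of x sin (1/x) between −|x| and +|x| can be observed numerically. The short Python sketch below is an illustration under stated assumptions, not part of the original text; the names h, is_squeezed and points are hypothetical, and ordinary floating-point arithmetic is used.

```python
import math


def h(x: float) -> float:
    """h(x) = x * sin(1/x), defined for x != 0."""
    return x * math.sin(1.0 / x)


def is_squeezed(x: float) -> bool:
    """True if -|x| <= h(x) <= |x| holds at the point x."""
    return -abs(x) <= h(x) <= abs(x)


# Sample points approaching 0 from both sides.
points = [sign * 10.0 ** (-k) for k in range(1, 13) for sign in (1.0, -1.0)]

for x in points:
    print(f"x = {x:+.0e}   h(x) = {h(x):+.3e}   squeezed: {is_squeezed(x)}")
```

Since both bounds tend to zero, the printed values of h(x) are forced towards zero as well.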




Some important limits 


For the determination of the limit of a function there are hardly any generally applicable methods. 
Some frequently used limits will be derived here, with the help of knowledge of convergent number 
sequences. 


1. a^x → 1 as x → 0 if a > 0.

It has already been shown that the sequence {ⁿ√a} for a > 0 converges to 1. The sequence {1/ⁿ√a}
then has the reciprocal limit, likewise equal to 1. It follows that to every arbitrary ε > 0 a positive
integer N can be determined, so that for all n ≥ N, the numbers a^{1/n} and a^{−1/n} lie in the interval
from 1 − ε to 1 + ε. Since the exponential function is monotonic, all a^x with −1/N < x < 1/N
also lie in this interval. Thus, 1 − ε < a^x < 1 + ε, or |a^x − 1| < ε if |x| < δ(ε) = 1/N.


2. (1 + 1/x)^x → e as x → ∞.

It is sufficient to show that for an arbitrary abscissa sequence {x_n} → ∞, the corresponding
ordinate sequence {(1 + 1/x_n)^{x_n}} has the limit e. To this end one chooses natural numbers p_n such
that p_n ≤ x_n < p_n + 1 for every n. It follows that

[1 + 1/(p_n + 1)]^{p_n} < [1 + 1/x_n]^{x_n} < [1 + 1/p_n]^{p_n + 1}.

Now lim_{p_n→∞} [1 + 1/(p_n + 1)]^{p_n} = e and also lim_{p_n→∞} [1 + 1/p_n]^{p_n + 1} = e. The ordinate sequence under
investigation is enclosed between two sequences, both converging to e, and hence has the same
limit.
In the above relation 1/x may be replaced by y. 
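
The enclosure used in this proof can be made visible with a few numbers. The following Python sketch is an illustration only; the function name bounds and the sample values of x are assumptions, and floating-point arithmetic is used. For each x it takes p as the integer part of x and evaluates the two enclosing expressions together with (1 + 1/x)^x.

```python
import math


def bounds(x: float):
    """Return (lower, middle, upper) for x >= 1, where p = floor(x):
    lower  = (1 + 1/(p + 1))**p
    middle = (1 + 1/x)**x
    upper  = (1 + 1/p)**(p + 1)
    """
    p = math.floor(x)
    lower = (1.0 + 1.0 / (p + 1)) ** p
    middle = (1.0 + 1.0 / x) ** x
    upper = (1.0 + 1.0 / p) ** (p + 1)
    return lower, middle, upper


for x in (10.5, 1000.5, 1e6 + 0.5):
    lower, middle, upper = bounds(x)
    print(f"x = {x:>11.1f}: {lower:.8f} < {middle:.8f} < {upper:.8f}")

print(f"e = {math.e:.8f}")
```

Both enclosing values close in on e as x grows, and the middle value is carried along with them.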


3. (1 + a/x)^x → e^a as x → ∞.

For a = 0 the statement is trivial. For a ≠ 0,
lim_{x→∞} (1 + a/x)^x = lim_{x→∞} [(1 + 1/(x/a))^{x/a}]^a = lim_{z→∞} [(1 + 1/z)^z]^a = [lim_{z→∞} (1 + 1/z)^z]^a = e^a.

In the last step the continuity of the function x^a (x > 0) is used.


4. log_b x → log_b a as x → a, for a > 0 and b > 1.

It must be shown that to every ε > 0, there corresponds a δ(ε) > 0, such that the inequality
|log_b x − log_b a| < ε is satisfied for all x with |x − a| < δ(ε). Firstly, since b > 1, for positive ε
the numbers ε₁ = b^ε − 1 and ε₂ = 1 − b^{−ε} are also positive. With b^ε > 1 it follows further that
ε₂ = 1 − b^{−ε} < b^ε (1 − b^{−ε}) = ε₁. Now let ε > 0 be arbitrary. Then one can choose δ(ε) = aε₂
and obtains:

|x − a| < aε₂ ⟹ −ε₂ < (x − a)/a < ε₂ < ε₁ ⟹ −1 + b^{−ε} < (x − a)/a < b^ε − 1 ⟹ b^{−ε} < x/a < b^ε.

From the definition of the logarithm, and because of its monotonicity, this gives

−ε < log_b (x/a) < ε ⟹ |log_b x − log_b a| < ε.


Thus, the limit of a logarithm can be determined as the logarithm of the limit. Accordingly, as
x → 0, [log_b (1 + x)]/x = log_b [(1 + x)^{1/x}] → log_b e, and this limit takes the value ln e = 1 for
b = e.

5. (a^x − 1)/x → ln a as x → 0 if a > 0.

If a = 1, then the numerator is zero, and the statement is true since ln 1 = 0.
If a ≠ 1, put a^x = 1 + y. Since a^x → 1 as x → 0, one has y → 0. But x ln a = ln (1 + y), and so
(a^x − 1)/x = y ln a/ln (1 + y) = ln a/ln [(1 + y)^{1/y}], whose denominator tends to 1. The most important
special case is a = e with ln e = 1.




6. cos x → 1 as x → 0.

Since |cos x − 1| = 2 |sin² (x/2)| = 2 |sin (x/2)| · |sin (x/2)| ≤ 2 |x/2| · |x/2| = x²/2 < ε
for |x| < √(2ε), |cos x − 1| converges to zero as x → 0.

7. (sin x)/x → 1 as x → 0.

The function h(x) = (sin x)/x can be included between the two functions f(x) = 1 and g(x) = cos x,
which both tend to the limit 1 as x → 0. From the figure one can see that the area of the sector OEB
of the circle of radius 1 lies between the areas of the triangles OEB and OED:

A(triangle OEB) < A(sector OEB) < A(triangle OED) ⟹ ½ · 1 · sin x < ½ · 1 · x < ½ · 1 · tan x
⟹ 1 < x/sin x < 1/cos x ⟹ 1 > (sin x)/x > cos x.

The inequalities hold only for positive x; but as x → 0, the right-hand and left-hand limits exist,
both with the value 1, since [sin (−x)]/(−x) = (sin x)/x.

18.3-5 Derivation of (sin x)/x → 1 as x → 0
8. (tan x)/x → 1 as x → 0.

For (tan x)/x = [(sin x)/x] · [1/cos x], and each factor tends to the limit 1 as x → 0.
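
The inequality cos x < (sin x)/x < 1 and the two limits just derived can be illustrated numerically. The Python sketch below is not part of the original text; the names ratio_sin, ratio_tan and points are hypothetical, and the sample points are restricted to 0 < x ≤ 1, well inside the range where the geometric argument applies.

```python
import math


def ratio_sin(x: float) -> float:
    """(sin x)/x for x != 0."""
    return math.sin(x) / x


def ratio_tan(x: float) -> float:
    """(tan x)/x for x != 0."""
    return math.tan(x) / x


points = [10.0 ** (-k) for k in range(0, 7)]  # 1, 0.1, ..., 1e-6

for x in points:
    enclosed = math.cos(x) < ratio_sin(x) < 1.0
    print(f"x = {x:.0e}  cos x = {math.cos(x):.12f}  "
          f"(sin x)/x = {ratio_sin(x):.12f}  (tan x)/x = {ratio_tan(x):.12f}  "
          f"enclosed: {enclosed}")
```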


The rule of Bernoulli and L’Hospital 


It is known that one can interchange the operations of addition, subtraction, multiplication and 
division with the passage to the limit only if all the limits occurring exist, are finite, and in the case 
of division are different from zero. On the other hand, if by an uncritical acceptance of the
interchangeability there arise meaningless expressions of the form 0/0, ∞/∞, 0 · ∞, 0⁰, ∞⁰ or 1^∞, then
it is necessary to determine the given limit directly. One speaks of an indeterminate form if one of
these expressions arises formally for x → a. For the limit lim_{x→0} (sin x)/x the indeterminate form 0/0
results if one replaces the limit of the quotient by the quotient of the limits. However, since the limit
of the denominator is zero, this procedure is not permissible. By other means it has already been
shown that lim_{x→0} (sin x)/x = 1.


The indeterminate form 0/0. For the determination of the limit lim_{x→a} f(x)/g(x) in the case lim_{x→a} f(x)
= lim_{x→a} g(x) = 0, Johann BERNOULLI (1667-1748) developed a rule which the Marquis de L’HOSPITAL
(1661-1704) published.


The rule of Bernoulli and L’Hospital: If both the numerator f(x) and the denominator g(x) of a
quotient tend to the limit zero as x → a, and if the derivatives f′(x) and g′(x) ≠ 0 of the functions
f(x) and g(x) exist in a neighbourhood of x = a and the limit lim_{x→a} f′(x)/g′(x) of the quotient of the
derivatives also exists, then this is equal to the limit lim_{x→a} f(x)/g(x) of the quotient of the functions.


The rule uses the concept of the derivative, which is explained in Chapter 19. It can be deduced
from the extended mean-value theorem mentioned there. In the expression

f(x)/g(x) = [f(x) − f(a)]/[g(x) − g(a)] = f′(ξ)/g′(ξ),

which presupposes f(a) = g(a) = 0, ξ lies between a and x, and thus also converges to a as x → a.
If lim_{ξ→a} f′(ξ)/g′(ξ) = L, the theorem holds. It can also be used in the case x → ∞.


If an indeterminate expression of the form 0/0 again arises, the rule can be applied to the quotient
f′(x)/g′(x) and the limit lim_{x→a} f″(x)/g″(x) investigated. However, it can happen that one always
obtains in this way an indeterminate form, or that the limit of the quotient of the derivatives does
not exist, although the given quotient does have a limit. The rule is then not applicable to the given
function, and the limit must be found by other methods.
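
The content of the rule can be illustrated by comparing the quotient of two functions with the quotient of their derivatives near the critical point. In the following Python sketch the pair f(x) = e^x − 1, g(x) = sin x at a = 0 is chosen purely for illustration and is not taken from the text; all function names are hypothetical, and the derivatives are entered by hand rather than computed.

```python
import math


def f(x: float) -> float:
    return math.exp(x) - 1.0      # tends to 0 as x -> 0


def g(x: float) -> float:
    return math.sin(x)            # tends to 0 as x -> 0


def f_prime(x: float) -> float:
    return math.exp(x)            # derivative of f, entered by hand


def g_prime(x: float) -> float:
    return math.cos(x)            # derivative of g, non-zero near 0


def quotient(x: float) -> float:
    """f(x)/g(x); formally 0/0 at x = 0."""
    return f(x) / g(x)


def derivative_quotient(x: float) -> float:
    """f'(x)/g'(x); its limit as x -> 0 is 1."""
    return f_prime(x) / g_prime(x)


for k in range(1, 6):
    x = 10.0 ** (-k)
    print(f"x = {x:.0e}   f/g = {quotient(x):.8f}   f'/g' = {derivative_quotient(x):.8f}")
```

Both columns approach the same value 1, as the rule asserts for this pair of functions.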




Rened* Ws a ee 
x<1 +— 1 xl 1 
| : 2 tke a oe is. 5 eee 

ON Fe aoe? be eae 

Example 3: lim_{x→0} (cos x − 1)/x² = lim_{x→0} (−sin x)/(2x) = lim_{x→0} (−cos x)/2 = −1/2.
Example 4: lim — iy ee +oo, according as x tends to the value 2/2 from 

x—n/2 COS x+n/2 —sin2x 
the left or from the right. 

In [x/(x — 1)} _ iGe— Dix} 1-1 — 177] _ x 

Example $: lim EIT — OF — tim Six? "8D 
| 1 
= lim 


watts 3. 
Example 6: For the determination of the limit lim_{x→+0} [√(x² + sin² x)]/x L’HOSPITAL’s rule fails.
By other methods it is easy to show that the limit is √2.


The indeterminate form ∞/∞. If the numerator f(x) and the denominator g(x) of a quotient both
tend to infinity as x → a, then the functions 1/f(x) and 1/g(x) both tend to zero. If f(x) and g(x)
are differentiable in a neighbourhood of x = a, and if g′(x) is different from zero and the quotient
f′(x)/g′(x) tends to a limit, then L’HOSPITAL’s rule can be applied: lim_{x→a} f(x)/g(x) = lim_{x→a} f′(x)/g′(x).
This also holds for x → ∞.


—In (x — 1) : —1/(x — 1) : 
Example 1: lim = ———— = lim ———; = lim —1)=0. 
ee ee Ikea) sane ei see 
Example 2: lim_{x→π/2} (tan 3x)/(tan x) = lim_{x→π/2} (3/cos² 3x)/(1/cos² x) = lim_{x→π/2} (3 cos² x)/(cos² 3x)
= lim_{x→π/2} (−6 cos x sin x)/(−6 cos 3x sin 3x) = lim_{x→π/2} (sin 2x)/(sin 6x) = lim_{x→π/2} (2 cos 2x)/(6 cos 6x) = 1/3.
Example 3: lim_{x→∞} (x + sin x)/x cannot be treated by L’HOSPITAL’s rule, since lim_{x→∞} cos x does
not exist. However,
lim_{x→∞} (x + sin x)/x = lim_{x→∞} [1 + (sin x)/x] = 1.
e* e e e te 
eS Ae ane ae a Tat = im gag = im Fy = ©. 
Example 5: lim_{x→∞} x^n/a^x = lim_{x→∞} n x^{n−1}/(a^x ln a) = ⋯ = lim_{x→∞} n!/[a^x (ln a)^n] = 0 for n a positive integer and
a > 1.
Example 6: lim_{x→∞} (ln x)/x^n = lim_{x→∞} (1/x)/(n x^{n−1}) = lim_{x→∞} 1/(n x^n) = 0 for n a positive integer.


The last two examples show that the exponential function a^x (a > 1) tends to infinity faster than every
power x^n, but every power tends to infinity faster than the logarithm.


The remaining indeterminate forms. With the help of L’HOSPITAL’s rule the remaining indeterminate
forms can be treated, by expressing the functions in a form that leads to one of the indeterminate
forms 0/0 or ∞/∞ for the critical point. To calculate lim_{x→a} [f(x) · g(x)] for the case lim_{x→a} f(x) = 0,
lim_{x→a} g(x) = ∞, one writes f(x) · g(x) = f(x)/[1/g(x)] or f(x) · g(x) = g(x)/[1/f(x)], and one then has the
case 0/0 or ∞/∞, respectively.
Example: lim_{x→∞} x arccot x = lim_{x→∞} (arccot x)/(1/x) = lim_{x→∞} [−1/(1 + x²)]/(−1/x²) = lim_{x→∞} x²/(1 + x²) = 1.
To calculate lim_{x→a} [f(x) − g(x)] when lim_{x→a} f(x) = lim_{x→a} g(x) = ∞, one writes

f(x) − g(x) = 1/[1/f(x)] − 1/[1/g(x)] = [1/g(x) − 1/f(x)]/[1/(f(x) · g(x))] = φ(x)/ψ(x),

where lim_{x→a} φ(x) = lim_{x→a} ψ(x) = 0.

Example: lim_{x→0} [1/sin x − 1/(x + x²)] = lim_{x→0} (x + x² − sin x)/[(x + x²) sin x]
= lim_{x→0} (1 + 2x − cos x)/[(1 + 2x) sin x + (x + x²) cos x]
= lim_{x→0} (2 + sin x)/[2 sin x − (x + x²) sin x + (2 + 4x) cos x] = 1.
To calculate lim_{x→a} f(x)^{g(x)} in one of the cases lim f(x) = lim g(x) = 0; lim f(x) = ∞, lim g(x) = 0;
or lim f(x) = 1, lim g(x) = ∞, one notes that in each of these cases ln [f(x)^{g(x)}] = g(x) ln f(x) is a
product of which one factor tends to zero and the other to infinity. Hence lim_{x→a} [g(x) ln f(x)] can be
determined by the method already known.


Example 1: To calculate lim_{x→+0} x^x one notes that ln x^x = x ln x and lim_{x→+0} (ln x)/(1/x) = lim_{x→+0} (1/x)/(−1/x²)
= lim_{x→+0} (−x) = 0. It follows that lim_{x→+0} x^x = lim_{x→+0} e^{x ln x} = 1, since lim_{x→0} a^x = 1.

Example 2: To calculate lim_{x→∞} ˣ√x, note that ln ˣ√x = (1/x) ln x and lim_{x→∞} (ln x)/x = lim_{x→∞} (1/x)/1 = 0.
It follows that lim_{x→∞} ˣ√x = 1.

Continuity of a function

Intuitively one regards the picture of a function that is continuous on an interval I
as a smooth curve that is nowhere broken, that one can draw ‘without taking the
pencil off the paper’. This means that the function is defined at every point x = ξ of
the interval and that it changes very little for small changes of the argument (Fig.).
This idea can be made precise: a function f(x) is continuous at the point x = ξ if to
every ε > 0 there corresponds a δ(ε) > 0 such that |f(x) − f(ξ)| < ε for all x with
|x − ξ| < δ(ε), that is, if lim_{x→ξ} f(x) = f(ξ).


18.3-6 Geometrical illustration of continuity


Example: The function f(x) = 3x² − 1 is continuous at every point x = ξ. First suppose that |x − ξ| < 1;
then |x + ξ| = |2ξ + (x − ξ)| < 2|ξ| + 1. From this one obtains

|f(x) − f(ξ)| = |3x² − 1 − (3ξ² − 1)| = 3 |x² − ξ²| = 3 |x + ξ| |x − ξ| ≤ 3(2|ξ| + 1) |x − ξ| < ε

for all x with |x − ξ| < ε/[3(2|ξ| + 1)]. If for a given ε > 0 one chooses δ(ε) = Min [1, ε/{3(2|ξ| + 1)}],
then one has established the continuity of the given function.
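
The choice of δ(ε) made in this example can be tested on sample points. The Python sketch below is an illustration under stated assumptions and not part of the original text: the names f, delta and check are hypothetical, and exact rational arithmetic (fractions.Fraction) is assumed in order to exclude rounding effects.

```python
from fractions import Fraction


def f(x):
    """f(x) = 3x^2 - 1."""
    return 3 * x * x - 1


def delta(eps, xi):
    """delta(eps) = Min[1, eps / (3(2|xi| + 1))], as chosen in the example."""
    return min(Fraction(1), eps / (3 * (2 * abs(xi) + 1)))


def check(eps, xi, samples=200):
    """Test |f(x) - f(xi)| < eps at sample points with |x - xi| < delta."""
    d = delta(eps, xi)
    for k in range(-samples + 1, samples):   # offsets strictly inside (-d, d)
        x = xi + d * Fraction(k, samples)
        if not abs(f(x) - f(xi)) < eps:
            return False
    return True


for xi in (Fraction(0), Fraction(-3), Fraction(100)):
    eps = Fraction(1, 1000)
    print(f"xi = {xi}: delta = {delta(eps, xi)}, inequality holds: {check(eps, xi)}")
```

The printed values of δ become smaller as |ξ| grows, which shows that in this example δ depends on ξ as well as on ε.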


One-sided continuity. If as x → ξ only the right-hand (or only the left-hand) limit exists and is
equal to the value f(ξ) of the function, one speaks of right-hand (or left-hand) continuity. For example,
f(x) = √x is continuous from the right at x = 0, since lim_{x→+0} √x = 0 = f(0). If a function is continuous
at x = ξ, then it is continuous both from the right and from the left. The converse is also true.


Continuity in an interval. A function f(x) is continuous in an interval if it is continuous at every 
interior point of the interval and, if the interval is closed on the left (right), the function is continuous 
from the right (left) at the end-point. 

The function f(x) = 3x² − 1, for example, is continuous in every interval. On the other hand,
the function f(x) = 1/(2 − x) is not continuous in the whole interval 1 < x < 5. For x = 2 the
function is discontinuous; its value does not exist there.




Uniform continuity. The example of the function f(x) = 3x² − 1 shows that the δ(ε) corresponding
to a given ε > 0 depends in general on the value of ξ. If a function f(x) is such that for a given
ε > 0 a single value δ(ε) can be chosen for all ξ in an interval in order to guarantee that
|f(x) − f(ξ)| < ε, then the function is said to be uniformly continuous in this interval. Such a value
of δ(ε), valid for a whole interval, exists precisely when the set of all the values of δ, corresponding
to all the ξ in the interval, has a positive lower bound. For example, the function f(x) = 3x² − 1
is uniformly continuous in the interval 2 ≤ x ≤ 5, since δ(ε, ξ) = Min [1, ε/{3(2|ξ| + 1)}] ≥ ε/33 for ε ≤ 33, so
that δ(ε) = ε/33 serves for every ξ in this interval. On the other hand, the function f(x) = tan x
is continuous, but not uniformly continuous, in the interval 0 ≤ x < π/2. For a given ε > 0, the
nearer ξ approaches to the value π/2, the smaller must the corresponding value δ(ε) be. As ξ → π/2,
the values of δ tend to zero. The following theorem holds quite generally.


A function f(x) that is continuous in a closed interval [a, b] is also uniformly continuous in the
interval.
The example f(x) = tan x shows that the condition that the interval be closed is essential.
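
The contrast between the two examples can be made concrete. The following Python sketch is an illustration only; the names uniform_delta, holds_on_grid, largest_delta_tan and tan_deltas are hypothetical. It assumes the value δ(ε) = ε/33 for f(x) = 3x² − 1 on 2 ≤ x ≤ 5, and for tan x it uses the observation (an assumption of this sketch, based on the monotonicity and convexity of tan x on 0 ≤ x < π/2) that the largest admissible δ at the point ξ is arctan (tan ξ + ε) − ξ.

```python
import math


def f(x: float) -> float:
    return 3.0 * x * x - 1.0


def uniform_delta(eps: float) -> float:
    """A single delta that serves for every xi in [2, 5]."""
    return eps / 33.0


def holds_on_grid(eps: float, a: float = 2.0, b: float = 5.0, n: int = 300) -> bool:
    """Check |f(x) - f(xi)| < eps for all grid pairs with |x - xi| < delta."""
    d = uniform_delta(eps)
    grid = [a + (b - a) * i / n for i in range(n + 1)]
    return all(abs(f(x) - f(xi)) < eps
               for xi in grid for x in grid if abs(x - xi) < d)


def largest_delta_tan(eps: float, xi: float) -> float:
    """Largest delta with |tan x - tan xi| < eps for all x in [0, pi/2)
    satisfying |x - xi| < delta (sketch assumption: the right-hand side
    of xi is the critical one because tan is convex there)."""
    return math.atan(math.tan(xi) + eps) - xi


# delta for tan x at points xi approaching pi/2, with eps = 0.1
tan_deltas = [largest_delta_tan(0.1, math.pi / 2 - 10.0 ** (-k)) for k in range(1, 5)]

print("3x^2 - 1 on [2, 5], eps = 1:", holds_on_grid(1.0))
for k, d in enumerate(tan_deltas, start=1):
    print(f"tan x, xi = pi/2 - 1e-{k}: largest delta = {d:.3e}")
```

For the polynomial one δ suffices on the whole closed interval, whereas for the tangent the admissible δ shrinks towards zero as ξ approaches π/2.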


Points of discontinuity. A point x = & at which the function f(x) is not continuous is called a 
point of discontinuity. At such a point, either the function value or the limit fails to exist, or both 
exist but are not equal to one another. Poles of rational functions are examples of points of dis- 
continuity; they are investigated in Chapter 5. 

At an indeterminate point a function assumes formally an indeterminate form. For example, the
function f(x) = (sin x)/x has an indeterminate point at x = 0. However, since as x → 0 the left-hand
and right-hand limits both exist and are equal to 1, one can consider a replacement function
f*(x) that takes the value (sin x)/x for x ≠ 0, and the value of the limit 1 for x = 0; f*(x) is then
continuous for x = 0, and the discontinuity has been ‘removed’. One therefore speaks of a removable
discontinuity. The discontinuity of the function f(x) at an indeterminate point x = ξ is removable
if the one-sided limits lim_{x→ξ+0} f(x) = lim_{x→ξ−0} f(x) = L exist, and are finite and equal. One can then replace
f(x) by the function f*(x) = f(x) for x ≠ ξ; f*(x) = L for x = ξ, which is continuous at x = ξ.

If the numerator p(x) and the denominator q(x) of a rational function p(x)/q(x) have the common
linear factor x − x₀, then x = x₀ is an indeterminate point of the function. If p(x) = (x − x₀)^i p₁(x),
q(x) = (x − x₀)^k q₁(x), where p₁(x₀) ≠ 0, q₁(x₀) ≠ 0 and i > k, then x₀ is a removable discontinuity.
The replacement function f*(x), continuous for x = x₀, is equal to (x − x₀)^{i−k} · p₁(x)/q₁(x) for
x ≠ x₀, and is zero for x = x₀. For i = k the discontinuity is also removable, and the replacement
function is p₁(x)/q₁(x). For i < k, however, the replacement function (x − x₀)^{i−k} · p₁(x)/q₁(x) has a pole
of order (k − i) at the point x = x₀ and is therefore not continuous there.
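
The distinction between the cases i > k, i = k and i < k can be sketched in a few lines of Python. The code below is an illustration only and not part of the original text; all names are hypothetical. Polynomials are assumed to be given as lists of coefficients, highest power first, and exact rational arithmetic is used. The factor x − x₀ is removed from numerator and denominator by repeated synthetic division, which yields the exponents i and k.

```python
from fractions import Fraction


def divide_out(coeffs, x0):
    """Synthetic division by (x - x0); returns (quotient, remainder)."""
    acc = []
    carry = Fraction(0)
    for c in coeffs:
        carry = carry * x0 + c
        acc.append(carry)
    return acc[:-1], acc[-1]


def multiplicity(coeffs, x0):
    """Strip every factor (x - x0); returns (exponent, reduced polynomial)."""
    m = 0
    while len(coeffs) > 1:
        quotient, remainder = divide_out(coeffs, x0)
        if remainder != 0:
            break
        coeffs, m = quotient, m + 1
    return m, coeffs


def evaluate(coeffs, x):
    """Horner evaluation of the polynomial at x."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value


def classify(p, q, x0):
    """Return ('removable', value of f* at x0) or ('pole', order k - i)."""
    i, p1 = multiplicity(p, x0)
    k, q1 = multiplicity(q, x0)
    if i > k:
        return "removable", Fraction(0)
    if i == k:
        return "removable", evaluate(p1, x0) / evaluate(q1, x0)
    return "pole", k - i


print(classify([1, 0, -4], [1, -2], 2))        # (x^2 - 4)/(x - 2) at x0 = 2
print(classify([1, -2, 1], [1, -1], 1))        # (x - 1)^2/(x - 1) at x0 = 1
print(classify([1, -1], [1, -3, 3, -1], 1))    # (x - 1)/(x - 1)^3 at x0 = 1
```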

Jump discontinuity. At a jump discontinuity the left- 
hand and right-hand limits are different from one 
another and the function cannot be continuous there. 

Heat supplied to a solid body raises its temperature t.
Its heat content H at the melting point t = t_m is not a
continuous function of the temperature, since at this
temperature the heat content of the molten substance is
greater than that of the solid (Fig.).


18.3-7 Heat content H as a function of the temperature t; t_m is the melting point


18.3-8 Graph of a function a) with finite, b) with infinite jump discontinuity



Example 1: The function f(x) = Arctan [1/(x − c)] has a jump discontinuity of magnitude π
at the point x = c (Fig. 18.3-8 a), since
lim_{x→c−0} Arctan [1/(x − c)] = lim_{z→−∞} Arctan z = −π/2 and lim_{x→c+0} Arctan [1/(x − c)] = lim_{z→+∞} Arctan z = +π/2.

Example 2: The function f(x) = e^{1/(x−c)} has an infinite discontinuity at the point x = c (see
Fig. 18.3-8 b), since lim_{x→c−0} e^{1/(x−c)} = lim_{z→−∞} e^z = 0 and lim_{x→c+0} e^{1/(x−c)} = lim_{z→+∞} e^z = +∞.

Example 3: The function f(x) = 1/cos x has infinite discontinuities at the points π/2 + kπ
(k = 0, ±1, ±2, ...), from +∞ to −∞ for even k and from −∞ to +∞ for odd k (Fig.).


Oscillatory functions with discontinuities. The function f(x) = sin (1/x) is not defined at the point
x = 0 and is therefore not continuous there. If a positive number δ, however small, is chosen, there
always exist in the interval −δ < x < +δ infinitely many points x = 2/(πn), that is 1/x = πn/2,
[n > 2/(πδ)] with the following property: for n₁ = 4k, n₂ = 4k + 1, n₃ = 4k + 3 (k an integer)
the function sin (1/x) = sin (nπ/2) takes the values 0, +1, −1 (Fig.). The function oscillates between
+1 and −1 more and more rapidly, the larger n is or the nearer x approaches to zero. The function
therefore has no limit as x → 0; the discontinuity at that point is not removable. On the other hand, the
discontinuity of the function x sin (1/x) at the point x = 0 is removable, since lim_{x→0} x sin (1/x) = 0.
Consequently f*(x) = x sin (1/x) for x ≠ 0; f*(0) = 0 is a continuous replacement function for
x sin (1/x) (Fig.).


18.3-9 Jump from −∞ to +∞


18.3-10 A discontinuous oscillatory function


18.3-11 An oscillatory function with removable discontinuity

Theorems about continuous functions. From the rules for calculating with limits the following 
theorem can be deduced immediately. 


The sum, difference, and product of two functions continuous at x = ξ are likewise continuous
at this point. Their quotient is continuous provided that the denominator is not zero for x = ξ.


Since it is recognized without difficulty that the functions g(x) = c = constant and h(x) = x are
continuous everywhere, it follows at once from this theorem that all functions obtained from them 
by means of the four basic operations are continuous. The first two of the following statements about 
the continuity of the elementary functions are proved in this way. 


1. Every polynomial function f(x) = a_n x^n + a_{n−1} x^{n−1} + ⋯ + a_1 x + a_0 is continuous everywhere.

2. A rational function p(x)/q(x) is continuous at all points ξ for which q(ξ) ≠ 0.

3. The exponential functions f(x) = a^x (a > 0) are continuous everywhere.

4. The logarithmic functions f(x) = log_b x (b > 0; b ≠ 1) are continuous for all positive values
of x.

5. The trigonometric functions sin x and cos x are continuous everywhere; the function
tan x = sin x/cos x is continuous for all x ≠ (2k + 1)π/2 (k an integer) and the function
cot x = cos x/sin x for all x ≠ kπ (k an integer).


With the help of the limit lim_{x→0} a^x = 1 already obtained, one deduces that lim_{x→ξ} a^x = lim_{x→ξ} (a^ξ · a^{x−ξ})
= a^ξ · lim_{x→ξ} a^{x−ξ} = a^ξ · lim_{h→0} a^h = a^ξ · 1 = a^ξ. Thus, the exponential function is continuous. The
continuity of the logarithmic function follows similarly from lim_{x→ξ} log_b x = log_b ξ (ξ > 0, b > 1). Since
log_b x = −log_{1/b} x, the result holds also for 0 < b < 1, and thus for all admissible bases.
Finally lim_{x→0} sin x = 0 since −|x| ≤ sin x ≤ |x|, and lim_{x→0} cos x = 1 as has already been shown.
From this one obtains

sin x = sin (ξ + x − ξ) = sin ξ cos (x − ξ) + cos ξ sin (x − ξ) → sin ξ as x → ξ,
cos x = cos (ξ + x − ξ) = cos ξ cos (x − ξ) − sin ξ sin (x − ξ) → cos ξ as x → ξ.


For the continuity of the functions tan x = sin x/cos x and cot x = cos x/sin x, only the zeros of 
the denominators must be excluded. 

Continuity of the inverse functions. The circular functions Arcsin x, Arccos x, Arctan x, Arccot x, 
as inverse functions of the continuous trigonometric functions, are likewise continuous, since the 
following theorem is true: 


If a function f(x) has an inverse function g(x) in an interval I, then the continuity of f(x) at the
point x = ξ implies the continuity of g(x) at the point x = f(ξ).


Accordingly the root functions ⁿ√x are continuous for all positive x, since they are the inverse
functions of the functions x^n for x > 0.

Continuity of composite functions. Let y = f[φ(x)] be a composite function, whose inner function
t = φ(x) is continuous at the point x = ξ, and whose outer function y = f(t) is continuous at the
point t = τ = φ(ξ). Then the composite function y = f[φ(x)] is continuous at x = ξ.


Every continuous function of a continuous function is again continuous. 
Since lim_{t→τ} f(t) = f(τ), to an arbitrary ε > 0 there always corresponds a suitable number δ₁(ε) > 0,
such that |f(t) − f(τ)| < ε for all |t − τ| < δ₁(ε). Further, since lim_{x→ξ} φ(x) = φ(ξ) = τ, to every
arbitrary positive number, say to δ₁(ε), there corresponds a suitable number δ(δ₁(ε)) = δ₂(ε) such
that |φ(x) − φ(ξ)| < δ₁(ε) for all x with |x − ξ| < δ₂(ε). Consequently, to every arbitrary ε > 0
one can choose a number δ₂(ε), such that |f[φ(x)] − f[φ(ξ)]| < ε for all x satisfying |x − ξ| < δ₂(ε).
With the help of this theorem one can establish the continuity of many functions. For example,
all functions ⁿ√p(x), in which p(x) denotes a polynomial, are continuous for all values of x for which
p(x) > 0, since the polynomial p(x) is continuous for all x, and the function ⁿ√t is continuous for
all t > 0. The function f(x) = e^{sin x} is continuous everywhere, since t = sin x and y = e^t are everywhere
continuous functions. Similarly the functions Arctan (x²), cos (5x² − e^{sin x}), sin (1/x) (x ≠ 0)
are continuous wherever they are defined.


Properties of continuous functions. Functions that are continuous in an interval form a class of 
functions with noteworthy properties, such as the following, which Gauss and other leading
mathematicians of his time regarded as obvious, and for which Bernard BOLZANO (1781-1848) published
the first proof.


Bolzano’s theorem: If a function f(x), continuous in a closed interval, assumes values with opposite
signs at two points a and b in this interval, then there exists at least one point ξ between a and b
at which the function vanishes.


The proof proceeds by enclosing such a point ξ, for which f(ξ) = 0 under the given assumptions,
in a nest of intervals.

Bolzano’s theorem is the basis of many approximation methods for the solution of equations.
For example, it follows from this theorem that a polynomial p(x) of odd degree has at least one real
zero, since p(ω) and p(−ω) certainly have opposite signs for sufficiently large values of ω.
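
The nest of intervals used in the proof leads directly to the bisection method, one of the approximation methods referred to. The Python sketch below is an illustration under stated assumptions and not part of the original text; the function name bisect_zero, the tolerance and the sample polynomial p(x) = x³ − 2x − 5 are hypothetical choices.

```python
def bisect_zero(f, a: float, b: float, tol: float = 1e-12) -> float:
    """Enclose a zero of the continuous function f in nested intervals.

    Assumes f(a) and f(b) have opposite signs (or one of them is zero).
    """
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if fa * fb > 0.0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    while b - a > tol:
        m = (a + b) / 2.0
        fm = f(m)
        if fm == 0.0:
            return m
        if fa * fm < 0.0:      # sign change in the left half
            b, fb = m, fm
        else:                  # sign change in the right half
            a, fa = m, fm
    return (a + b) / 2.0


def p(x: float) -> float:
    """A polynomial of odd degree; p(2) < 0 < p(3)."""
    return x ** 3 - 2.0 * x - 5.0


root = bisect_zero(p, 2.0, 3.0)
print(f"zero of p between 2 and 3: {root:.10f}, p(root) = {p(root):.2e}")
```

Each step halves the interval while preserving the sign change, so the intervals contract to a point at which the function vanishes.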

The following are consequences of Bolzano’s theorem: 


1. A continuous function that does not vanish in an interval I must have the same sign everywhere
in that interval.

2. If a function f(x), continuous in a closed interval, takes the values f(a) = A and f(b) = B
(A ≠ B) at two points in the interval, then it takes every value between A and B at least once.




Further fundamental properties of continuous functions are: 


If a function f(x) is continuous at x₀, where f(x₀) ≠ 0, then f(x) has the same sign in a certain
neighbourhood of x₀ as it has at x₀ itself.

Theorem of Weierstrass. A function that is continuous in a closed interval is bounded there. A 
function that is continuous in a closed interval takes both a greatest and a least value in the interval. 


In these theorems the assumption that the interval is closed cannot be omitted. For example,
the function 1/x is continuous in the interval 0 < x ≤ 1, which is open on the left, but it takes larger
and larger values as x approaches 0; it is unbounded. However, it is bounded in every closed interval
a ≤ x ≤ 1 (a > 0); 1 ≤ f(x) ≤ 1/a. The theorem of Weierstrass also need not hold in an interval
that is not closed. For example, the function f(x) = x is continuous in the interval 0 ≤ x < 1
which is open on the right, but since lim_{x→1} f(x) = 1, there is no point of the interval at which the value
of the function is greater than at all other points.


19. Differential calculus 


19.1. The derivative of a function
      Definition of the derivative
      The derivative as a function
      The mean value theorem of the differential calculus
      Differential
19.2. The technique of differentiation
      Derivatives of typical composite functions
      Derivatives of special functions
19.3. Derivatives of functions of several variables
      Partial derivatives of a function
      Total differential
19.4. Extreme values of functions
      Extreme values of functions of one variable
      Extreme values of functions of several variables
19.5. Applications to plane curves
      Discussion of the curve defined by an explicit function
      Singular points
      Curvature, evolute and involute
      Special curves


The differential and integral calculus, jointly known as the infinitesimal calculus, are basic dis- 
ciplines of higher analysis. The objects of the differential calculus are functions, and its methods 
are the investigation and calculation of limiting values. Its central concept, the derivative of a func- 
tion f(x), is a measure of the sensitivity with which f(x) reacts to a change in its argument. Many 
geometrical problems also, such as the calculation of the gradient of the tangent to a curve or the 
determination of the curvature of a curve, can be solved with the help of the differential calculus. 

Because the relationships between quantities in the physical world can frequently be expressed 
by continuous and differentiable functions, only the differential calculus makes it possible in the 
natural sciences and technological disciplines to express mathematically not only states but also 
processes. For example, if s = f(t) describes the dependence of the distance s covered by a moving
point mass on the time t, then the derivative of this function represents the instantaneous speed
of this point mass. As an extension of this idea the concept of speed can be carried over to other 
circumstances in which time plays the part of the independent variable. The concepts of heating 
or cooling of a body, of reaction speed of a chemical process, rate of decay of a radioactive process 
and rate of growth of a biological organism can be defined and calculated. For mathematics itself 
the methods and results of the differential calculus have become the basis of higher analysis. The 
development of many disciplines is unthinkable without it, for example, the investigation of the 
relationships between functions and their derivatives, the expansion of functions in infinite series, 
the treatment of differential equations or differential geometry. 

The beginnings of the infinitesimal calculus go back to the end of the 16th century; the theory 
was developed in the second half of the 17th century simultaneously, but independently, by Gott- 
fried Wilhelm LEIBNIZ (1646-1716) and Isaac NEWTON (1643-1727) as a calculus, that is, an easily 
manageable method. Whereas LEIBNIZ started with the tangent problem, NEWTON arrived at the 
differential calculus by investigating physical problems. NEWTON also recognized as early as 1665 
that differentiation and integration (see Chapter 20.) are inverse problems to one another. 




19.1. The derivative of a function 


To analyse the journey of a train from a town A to a town B, at a distance s_B − s_A apart, a
distance-time diagram (Fig.) can serve as a graph of the function s = f(t), in which each point of
time t_i corresponds to the distance s_i travelled by the train. The train departs from A at time t₀,
brakes at ¢,, because the signal S is at red, but does not come to rest because the signal just changes 
to green. By increasing its speed, the train reaches B on time. 

The ratio of the distance travelled (s_n − s_m) to the time (t_n − t_m) taken to travel this distance,
where n > m, is a measure of the speed. In a graphical railway timetable the points A and B are
joined by a straight line. Consequently a uniform average speed i = (sg — S,4)/(tg — to) is assumed. 
If tangents to the curve parallel to the line AB are drawn, then by inspection the speed of the train 
is v = 0 at times ¢,, ¢3, ts and ¢,. In the time intervals (fo, ¢,) and (tg, tg) the speed v increases, 
and in (f3, t4) and (t7, tg) it decreases. The larger v is, the steeper the curve. At the points (t,, 5;) 
and (ts, Ss) it is curved to the left, and at (t3, 53) and (¢7, 57) to the right. 

Of course, each of these statements must be made more precise geometrically, but above all this 
process of making precise leads to purely analytical statements about a function s = f(t), which 
are valid without any reference to a geometrical meaning. 


19.1-1 Schematic distance-time diagram for the journey of a train from town A to town B; s distance, t time


19.1-2 Slope of a curve at the point $P_0$


Definition of the derivative 


Difference quotient of a function. If a curve in a Cartesian coordinate system is the graph of a function $y = f(x)$, then each of its points $P_n$ $(n = 0, 1, 2, \ldots)$ has coordinates $x_n$ and $y_n = f(x_n)$, where the $x_n$ belong to the domain of definition of the function (Fig.). One can then form differences such as $\Delta x = x_1 - x_0 = h$ and $\Delta y = y_1 - y_0 = f(x_1) - f(x_0) = f(x_0 + \Delta x) - f(x_0) = f(x_0 + h) - f(x_0)$, and their quotient $\Delta y/\Delta x$ has a finite value for $x_1 \neq x_0$. It is called a difference quotient, and geometrically it represents the slope $\tan\alpha$ of the straight line through the points $P_0(x_0, y_0)$ and $P_1(x_1, y_1)$, which is a secant of the curve. Here $\alpha$ is the angle between the positive x-axis and the secant, measured in the positive sense:

\[
\frac{\Delta y}{\Delta x} = \frac{y_1 - y_0}{x_1 - x_0} = \frac{f(x_1) - f(x_0)}{x_1 - x_0} = \frac{f(x_0 + \Delta x) - f(x_0)}{\Delta x} = \frac{f(x_0 + h) - f(x_0)}{h} = \tan\alpha .
\]

Difference quotients are frequently used, for example, in numerical mathematics as divided differences, and in physics for average speeds or average temperature gradients.


Derivative. If the point $P_0(x_0, y_0)$ is kept fixed, but the point $P_1(x_1, y_1)$ moves along the curve towards the point $P_0$, then the secants change their position, and the difference quotient, and hence the angle $\alpha$, change their values. If the difference quotient $\Delta y/\Delta x$ tends to a limit as $x_1 \to x_0$, this limit is called the derivative $\left(\frac{dy}{dx}\right)_{x = x_0}$ of the function $y = f(x)$ at the point $x_0$. The curve of this function then has a tangent at the point $P_0(x_0, y_0)$, whose position is determined by the limit $\varphi$ of the angle $\alpha$, given by

\[
\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \left(\frac{dy}{dx}\right)_{x = x_0} = \tan\varphi .
\]

The derivative can also be denoted by $f'(x_0)$ or $y'_{x = x_0}$ (read f dashed of $x_0$, y dashed at the point $x_0$, or dy by dx for $x = x_0$).




From the consideration of limits (see Chapter 18.) it follows that the left- and right-hand limits 
should be equal to one another. 


In connection with the analysis of the journey of a train from a town A to a town B it follows that the curve of the function $s = s(t)$ has a tangent for precisely those points of time $t$ for which the limit $\lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = \frac{ds}{dt} = \tan\varphi$ exists. The derivative $\frac{ds}{dt}$ gives the instantaneous speed, and the curve increases monotonically because $\varphi$ is always greater than 0. In an interval in which $\varphi$ were always less than 0, the curve would decrease monotonically (Fig.).

A function is said to be differentiable at the point x = xo if and only if the left-hand and right- 
hand limits of the difference quotient exist and are equal to one another. 

If a function $y = f(x)$ is differentiable at the point $x = x_0$, then it is also continuous there.


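19.1-3 Increasing and decreasing functions: $0^\circ < \varphi < 90^\circ$, $\tan\varphi > 0$ (increasing); $-90^\circ < \varphi < 0^\circ$, $\tan\varphi < 0$ (decreasing)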


Hence continuity is a necessary condition for differentiability, but not a sufficient condition. 
There exist functions (see Examples 3, 4, 5) that are continuous at a point but not differentiable 
there. Bernard BOLZANO (1781-1848) was the first to give an example of a function that is continuous 
everywhere, but differentiable nowhere, in an interval. 


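Example 1: The function $y = x^2$ has the derivative $2x_0$ at the point $x = x_0$. The difference quotient can be rearranged and simplified for values of $\Delta x$ different from zero:

\[
\frac{\Delta y}{\Delta x} = \frac{(x_0 + \Delta x)^2 - x_0^2}{\Delta x} = \frac{2x_0\,\Delta x + (\Delta x)^2}{\Delta x} = \frac{\Delta x\,(2x_0 + \Delta x)}{\Delta x} = 2x_0 + \Delta x .
\]

This expression converges as $\Delta x \to 0$ to the limit $y'_{x = x_0} = 2x_0$.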
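The limiting process of Example 1 can also be observed numerically. The following is a minimal sketch, not part of the original text: the function names `difference_quotient` and `square` and the sample point 3.0 are illustrative assumptions. It evaluates the difference quotient of $y = x^2$ for shrinking increments, which by the rearrangement above should equal $2x_0 + \Delta x$ up to rounding.

```python
def difference_quotient(f, x0, dx):
    """Slope of the secant through (x0, f(x0)) and (x0 + dx, f(x0 + dx))."""
    if dx == 0:
        raise ValueError("dx must be non-zero")
    return (f(x0 + dx) - f(x0)) / dx


def square(x):
    return x * x


if __name__ == "__main__":
    x0 = 3.0  # illustrative point (an assumption); the limit should be 2 * x0
    for k in range(1, 7):
        dx = 10.0 ** (-k)
        # Each value should be close to 2 * x0 + dx.
        print(dx, difference_quotient(square, x0, dx))
```

The secant slopes approach the tangent slope only in the limit; for very small increments floating-point rounding eventually dominates, which is why the sketch stops at $10^{-6}$.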
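Example 2: Differentiation with respect to the time $t$ of the distance-time function $s = f(t) = (g/2)\,t^2$ for the free fall gives the speed $v_{t = t_0} = g t_0$, because

\[
\frac{\Delta s}{\Delta t} = \frac{(g/2)(t_0 + \Delta t)^2 - (g/2)\,t_0^2}{\Delta t} = \frac{g}{2}\cdot\frac{\Delta t\,(2t_0 + \Delta t)}{\Delta t} = g t_0 + (g/2)\,\Delta t,
\]
\[
\lim_{\Delta t \to 0} \frac{\Delta s}{\Delta t} = \lim_{\Delta t \to 0} \left(g t_0 + (g/2)\,\Delta t\right) = g t_0 .
\]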


Example 3: The function $y = x^{1/3}$ is not differentiable at the point $x = 0$ (Fig.). The difference quotient

\[
\frac{\Delta y}{\Delta x} = \frac{(0 + \Delta x)^{1/3} - 0^{1/3}}{\Delta x} = \frac{(\Delta x)^{1/3}}{\Delta x} = \frac{1}{(\Delta x)^{2/3}}
\]

does not tend to a finite limit as $\Delta x \to 0$, but increases beyond every bound, that is, it tends to infinity. The tangent to the curve of the function $y = x^{1/3}$ is perpendicular to the x-axis at the point $x = 0$.

19.1-4 Graph of the function $y = \sqrt[3]{x}$

Example 4: The function $y = e^{|x-2|}$ is continuous at the point $x = 2$, but not differentiable there (Fig.). Its difference quotient is

\[
\frac{\Delta y}{\Delta x} = \frac{e^{|2 + \Delta x - 2|} - e^{|2 - 2|}}{\Delta x} = \frac{e^{|\Delta x|} - 1}{\Delta x},
\]

and this tends to the value $+1$ or $-1$ according to the sign of $\Delta x$ (see Chapter 18.). At this point $(2, 1)$ the curve has two tangents.

Example 5: At the point $x = 0$ the function $y = x^{3/2}/2$ is only right-hand differentiable, because negative abscissae do not belong to its domain of definition. The value of its right-hand derivative at $x = 0$ is zero, because

\[
\frac{\Delta y}{\Delta x} = \frac{(1/2)(0 + \Delta x)^{3/2} - (1/2)\cdot 0^{3/2}}{\Delta x} = \frac{1}{2}\cdot\frac{(\Delta x)^{3/2}}{\Delta x} = \frac{1}{2}(\Delta x)^{1/2}
\quad\text{and}\quad \lim_{\Delta x \to +0} (1/2)(\Delta x)^{1/2} = 0 .
\]

19.1-5 Graph of the function $y = e^{|x-2|}$
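The behaviour described in Examples 3 to 5 can be explored with one-sided difference quotients. The sketch below is an illustration under stated assumptions: the helper names `right_quotient`, `left_quotient`, `cube_root`, `exp_abs` and `half_power` are hypothetical and do not come from the text, and the increments used are arbitrary small numbers.

```python
import math


def right_quotient(f, x0, dx):
    """Difference quotient with a positive increment dx > 0."""
    return (f(x0 + dx) - f(x0)) / dx


def left_quotient(f, x0, dx):
    """Difference quotient with the negative increment -dx (dx > 0)."""
    return (f(x0 - dx) - f(x0)) / (-dx)


def cube_root(x):
    # Real cube root, also valid for negative x (Example 3).
    return math.copysign(abs(x) ** (1.0 / 3.0), x)


def exp_abs(x):
    # y = e^{|x - 2|} (Example 4).
    return math.exp(abs(x - 2.0))


def half_power(x):
    # y = x^{3/2} / 2, defined only for x >= 0 (Example 5).
    return 0.5 * x ** 1.5


if __name__ == "__main__":
    dx = 1e-6
    print("Example 3:", left_quotient(cube_root, 0.0, dx), right_quotient(cube_root, 0.0, dx))
    print("Example 4:", left_quotient(exp_abs, 2.0, dx), right_quotient(exp_abs, 2.0, dx))
    print("Example 5:", right_quotient(half_power, 0.0, dx))
```

For Example 5 only the right-hand quotient is evaluated, since the function is not defined for negative arguments.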


The derivative as a function 


If the derivative of a function $y = f(x)$ exists for all points in an interval $x_0 < x < x_1$, then
the function is differentiable in the whole interval. To each value x in the interval there corresponds 
the derivative f’(x) of the function at the point x; thus, f’(x) is a function of x, the derived function 
or the derivative. 


Example: For all values of $x$ the function $y = x^2$ has the derivative $y' = 2x$. At the points $x_1 = 3$ and $x_2 = -2$ it has the values $y'_1 = 6$ and $y'_2 = -4$, respectively.


Higher derivatives. The derivative $y' = f'(x)$ of a function $y = f(x)$ is a function of $x$. Assuming that this is again differentiable, as is almost always the case for elementary functions, then the derivative of the first derivative is called the second derivative or the derivative of the second order, and is denoted by $y'' = f''(x) = \frac{d^2y}{dx^2}$ (read y double dashed, f double dashed of x, or d two y by dx squared). Similarly there can be a third, fourth, ..., nth derivative, or a derivative of the nth order. Expressions such as 'existence of the derivative of the nth order' or 'differentiable arbitrarily often' are to be understood in this sense.

The following examples can be calculated by the rules derived in 19.2. 


Example 1: $y = f(x) = x^5 + x^4/2 - 5x^3/6 + x^2 + 5x + 2$;
$y' = f'(x) = 5x^4 + 2x^3 - 5x^2/2 + 2x + 5$;
$y'' = f''(x) = 20x^3 + 6x^2 - 5x + 2$; $y''' = f'''(x) = 60x^2 + 12x - 5$;
$y^{(4)} = f^{(4)}(x) = 120x + 12$;
$y^{(5)} = f^{(5)}(x) = 120$; $y^{(6)} = f^{(6)}(x) = 0$.
2 


x Yo ap, pes. 2x : 
u ” 2x + 1 eee eee 12(x + 1 
y apg = HELD, y" =f") = - TD, 
| ype ah eg d “yp ADRES | aS tlle ee Ca 
Example 3: $y = f(x) = \sin x$; $\frac{dy}{dx} = \cos x$; $\frac{d^2y}{dx^2} = -\sin x$; $\frac{d^3y}{dx^3} = -\cos x$; $\frac{d^4y}{dx^4} = \sin x$.


All these are examples of functions that are differentiable arbitrarily often. 


Example 4: From the distance-time law of the free fall $s = (g/2)\,t^2$ one obtains by differentiation $s' = gt$ and $s'' = g$, where $g$ is a constant, the acceleration due to gravity. Consequently the free fall is a motion with constant acceleration.

Physically $s'' = \frac{d^2s}{dt^2}$ denotes the derivative of the speed $s' = \frac{ds}{dt}$, that is, the acceleration.
In the example of the journey of a train from a town A to a town B described in the introduction, the 
time intervals during which the train is accelerating can be deduced. In these the angle g between 
the x-axis and the tangent is increasing (s’’ > 0); they are the intervals fo to t, and t, to ts. In the 
intervals t, to t4 and fg to tg, however, this angle is getting smaller and the speed of the train is 
decreasing (s’’ < 0). 






The curves of a function and its derivatives in Fig. 19.1-6 are represented by the following table of values:

$y = f(x) = 0.1x^3 - 0.6x^2 - 1.5x + 5.6$;
$y' = f'(x) = 0.3x^2 - 1.2x - 1.5$;
$y'' = f''(x) = 0.6x - 1.2$;
$y''' = f'''(x) = 0.6$.

   x  |   y   |   y'
  ----+-------+-------
  -4  | -4.4  |
  -3  |  2    |  4.8
  -2  |  5.4  |  2.1
  -1  |  6.4  |  0
   0  |  5.6  | -1.5
   1  |  3.6  | -2.4
   2  |  1    | -2.7
   3  | -1.6  | -2.4
   4  | -3.6  | -1.5
   5  | -4.4  |  0
   6  | -3.4  |  2.1
   7  |  0    |  4.8
   8  |  6.4  |
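The tabulated values can be recomputed from the formulas above. The following is a small sketch for checking them; the function names `f`, `f_prime`, `f_second` and `value_table` are chosen here for illustration only, and rounding to one decimal place is an assumption matching the precision of the table.

```python
def f(x):
    return 0.1 * x**3 - 0.6 * x**2 - 1.5 * x + 5.6


def f_prime(x):
    return 0.3 * x**2 - 1.2 * x - 1.5


def f_second(x):
    return 0.6 * x - 1.2


def value_table(xs):
    """Rows (x, y, y', y'') rounded to one decimal place."""
    return [(x, round(f(x), 1), round(f_prime(x), 1), round(f_second(x), 1)) for x in xs]


if __name__ == "__main__":
    for row in value_table(range(-4, 9)):
        print(row)
```

The zeros of $y'$ at $x = -1$ and $x = 5$ mark the points where the tangent to the curve of $y$ is parallel to the x-axis.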


Graphical differentiation. The derivative at a point $P$ of the curve given by the function $y = f(x)$ is the value of the tangent function of the angle $\varphi$ that the tangent $t$ at the point $P$ makes with the positive x-axis. If a parallel $t'$ to this tangent through the point $A(-1, 0)$ cuts the y-axis at the point $B$, then $\tan\varphi = |OB|/|AO| = y'$ (Fig.). The direction of the tangent at the point $P$ can be determined with a mirror ruler (Fig.). The plane mirror of the ruler stands at right angles to the plane of the graph. The visible part of the curve goes over into its mirror image without a kink only if the ruler cuts the curve at right angles at the point $P$. The line through $P$ perpendicular to this normal to the curve is the tangent $t$.


19.1-7 Graphical differentiation 


19.1-8 Mirror ruler 


In a more complicated tool an angle scale is attached to the mirror ruler, and the direction angle 
of the tangent can be read off on it directly. In another version, the differentiograph, a pen recorder 
is attached that draws the derived curve of the given curve. 


The mean value theorem of the differential calculus 


The mean value theorem. The difference quotient $[f(b) - f(a)]/(b - a)$ gives the slope of the secant to the curve through the points with the abscissae $x = a$ and $x = b$. If the function $y = f(x)$ represented by the curve is differentiable in the interval from $a$ to $b$, there must exist at least one point $\xi$ in the interval for which the tangent to the curve is parallel to the secant. Both then have the same slope, that is, $f'(\xi) = [f(b) - f(a)]/(b - a)$. If one denotes the values $a$ and $b$ by $a = x$ and $b = x + h$, then $\xi$ can be expressed in the form $\xi = x + \vartheta h$, where $\vartheta$ is a positive number less than 1, $0 < \vartheta < 1$ (Fig.). The mean value theorem then has the form

\[
\frac{f(x + h) - f(x)}{h} = f'(x + \vartheta h), \quad\text{where } 0 < \vartheta < 1 .
\]


Mean value theorem: If a function $y = f(x)$ is continuous in the closed interval $a \le x \le b$ and differentiable in the open interval $a < x < b$, then there exists in the interior of the interval at least one intermediate value $\xi$ (mean value) for which $[f(b) - f(a)]/(b - a) = f'(\xi)$, where $a < \xi < b$.

With the help of the mean value theorem numerical calculations can be performed, for example, 
the estimation of a value of a function from a known neighbouring value. 


Example: From $f(x) = \ln 690 = 6.53669$, $f(x + h) = \ln 691$ can be determined to five places of decimals. From $f(x + h) = f(x) + h f'(x + \vartheta h)$, $h f'(x + \vartheta h)$ is the increment to be added to $f(x) = \ln 690$. Because $x = 690$ and $x + h = 691$, it follows that $h = 1$, and because $f'(x) = \frac{d}{dx}\ln x = 1/x$ the increment is $1 \cdot f'(x + \vartheta) = 1/(690 + \vartheta)$, which lies between $1/690 = 0.0014492\ldots$ and $1/691 = 0.0014471\ldots$ To five decimal places both these numbers have the value 0.00145. One therefore obtains $\ln 691 = 6.53669 + 0.00145 = 6.53814$.
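The estimate in this example can be checked with a few lines of code. The sketch below is only an illustration: `mean_value_bounds` is a hypothetical helper name, and the comparison with `math.log` serves as an independent check that is not part of the original argument.

```python
import math


def mean_value_bounds(x, h):
    """Bounds for ln(x + h) - ln(x) with x > 0, h > 0.

    By the mean value theorem the increment equals h / (x + theta * h)
    for some 0 < theta < 1, so it lies between h / (x + h) and h / x.
    """
    return h / (x + h), h / x


ln_690 = 6.53669  # value quoted in the text
lower, upper = mean_value_bounds(690.0, 1.0)
increment = round(lower, 5)  # both bounds agree to five decimal places
estimate = ln_690 + increment

if __name__ == "__main__":
    print(lower, upper)
    print(estimate, math.log(691.0))
```

Because both bounds round to the same five-place value, the unknown $\vartheta$ does not need to be determined.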


19.1-9 The mean value theorem of the differential calculus

19.1-10 Geometrical illustration of Rolle's theorem


Rolle's theorem. If in the mean value theorem the values $f(a)$ and $f(b)$ of the function are equal, then there exists a value $\xi$ with $a < \xi < b$ for which $f'(\xi) = 0$, that is, there exists in this interval a tangent parallel to the x-axis. In the theorem named after Michel ROLLE (1652-1719) (Fig.) the additional condition $f(a) = f(b)$ is imposed.

Rolle's theorem: If a function $y = f(x)$ is continuous in the closed interval $a \le x \le b$ and differentiable in the open interval $a < x < b$ and if $f(a) = f(b)$, then there exists in the interior of the interval at least one intermediate value $\xi$ such that $f'(\xi) = 0$ with $a < \xi < b$.


The extended mean value theorem. For the sake of completeness an extension of the mean value 
theorem is stated here, which is useful for many purposes: 


If two functions $f(x)$ and $g(x)$ are continuous in the closed interval $a \le x \le b$, differentiable in the open interval $a < x < b$, and if $g'(x) \neq 0$ in the interval, then there exists in the interior of the interval at least one intermediate value $\xi$ such that $[f(b) - f(a)]/[g(b) - g(a)] = f'(\xi)/g'(\xi)$, where $a < \xi < b$.

Consequences of the mean value theorem. If the derivative of a function is zero for all points of an interval and if $x_1 < x_2$ are two of these points, then $f'(\xi) = [f(x_2) - f(x_1)]/(x_2 - x_1) = 0$, or $f(x_2) = f(x_1)$. The function is a constant.

A function that is differentiable in an interval and whose derivative $f'(x)$ vanishes everywhere in the interval is constant in that interval.

If the derivatives of the functions $\varphi(x)$ and $\psi(x)$ have the same values in an interval, then the derivative of $f(x) = \varphi(x) - \psi(x)$ has the constant value zero, so that $f(x)$ is a constant.

Two functions that are differentiable and whose derivatives are equal in an interval differ in that interval only by an additive constant.


The theorem already obtained intuitively, that a function f(x) increases in the interval a< x < b 
if its derivative is positive there, and decreases if its derivative is negative, can be proved rigorously 
with the help of the mean value theorem. 


Differential 


Differential of a function. For a function $f(x)$, differentiable in an interval, the difference between the difference quotient and the derivative at the point $x_0$ is a function $\varphi(\Delta x)$ of $\Delta x$. From $[f(x_0 + \Delta x) - f(x_0)]/\Delta x - f'(x_0) = \varphi(\Delta x)$ the increment in the function is given by $\Delta y = f(x_0 + \Delta x) - f(x_0) = f'(x_0)\cdot\Delta x + \varphi(\Delta x)\cdot\Delta x$. It consists of the part $f'(x_0)\cdot\Delta x$, which is linear in $\Delta x$ and proportional to it and tends to zero as $\Delta x \to 0$, and of the part $\varphi(\Delta x)\cdot\Delta x$, which tends to zero 'of a higher order' than $\Delta x$ as $\Delta x \to 0$. The linear part of the increment $\Delta y$ is called the differential of the function at the point $x_0$ and is denoted by $dy = df(x_0) = f'(x_0)\cdot dx$. The quantity $dx = \Delta x$ is called the differential of the independent variable.


Example: The differential of the function $y = f(x) = x^2$ at the point $x_0$ is $dy = 2x_0 \cdot dx$.

After the introduction of the concept of the differential, the derivative 'dy by dx' can be represented as the quotient 'dy over dx' of two finite quantities.

In a geometrical illustration (Fig.) the points $P_0[x_0, f(x_0)]$ and $P_1[x_0 + \Delta x, f(x_0 + \Delta x)]$ lie on the curve of the function $y = f(x)$. If the increment $\Delta y = |RP_1|$ corresponding to the increment $\Delta x$ is cut by the tangent to the curve at the point $P_0$ in the point $T$, then $|RT| = dy$ is the differential. Clearly the smaller the abscissa increment $\Delta x = dx$, the better the approximation $dy$ for $\Delta y$. Hence the tangent can be regarded as characteristic for the local course of the curve.

19.1-11 The differential of a function


19.1-12 Tangents to the curve with the equation $y = x^2/10$

The graph of the function $y = x^2/10$ is given as the envelope of the tangents drawn at individual points (Fig.).


The approximation $\Delta y \approx dy$. In the calculation of approximations one makes use of the fact that for small $|\Delta x| = |dx|$ the increment $\Delta y$ of the function in the neighbourhood of the point $x_0$ can be replaced by the differential $dy$ of the function at this point with good precision: the approximation $\Delta y \approx dy$ is valid. For the function $y = \sin x$ in the neighbourhood of $x_0 = 0$, $\Delta y = \sin\Delta x - \sin 0 = \sin\Delta x = \sin dx$, $dy = \cos 0 \cdot dx = dx$, and consequently the approximation formula $\sin dx \approx dx$ holds, or $\sin h \approx h$ for small $h$ if one writes $h$ for $dx$.

Occasionally differentials are said to be infinitely small. This is an imprecise and misleading statement, because one considers throughout finite non-zero quantities that are chosen only sufficiently small for the problem under consideration, that is, small enough to correspond to the required degree of accuracy.
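How small 'sufficiently small' has to be can be judged by comparing the increment with the differential directly. The following sketch does this for $y = \sin x$ at $x_0 = 0$; the helper name `increment_and_differential` and the sample increments are illustrative assumptions.

```python
import math


def increment_and_differential(f, f_prime, x0, dx):
    """Return (delta_y, dy): the exact increment and its linear part."""
    delta_y = f(x0 + dx) - f(x0)
    dy = f_prime(x0) * dx
    return delta_y, dy


if __name__ == "__main__":
    for dx in (0.5, 0.1, 0.01, 0.001):
        delta_y, dy = increment_and_differential(math.sin, math.cos, 0.0, dx)
        # The error shrinks roughly like dx**3 / 6 for the sine function.
        print(dx, delta_y, dy, dy - delta_y)
```

For the sine function the error of the approximation $\sin h \approx h$ decreases much faster than $h$ itself, in line with the statement that the remainder tends to zero 'of a higher order'.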


Differentials of higher order. Let the function $y = f(x)$ be $n$ times differentiable $(n > 1)$. Then its differential of the first order $dy = f'(x)\,dx$ is a differentiable function of $x$ with the differential $d(dy) = f''(x)\,(dx)^2$, because $dx$ is independent of $x$ and in the differentiation must be treated as a constant factor. The differential of $dy$ is called the differential of the second order and is written $d^2y$ (read d two y). Similarly one can form a differential of the third order, $d^3y = d(d^2y) = f'''(x)\,(dx)^3$, and so on.

From this definition the nth derivative $f^{(n)}(x)$ of the function $y = f(x)$ can also be written as the quotient of two differentials: $f^{(n)}(x) = d^ny/dx^n$ (read d n y by dx to the nth).


19.2. The technique of differentiation 


On the basis of the definition of the derivative, the following steps must be taken in differentiating 
a function: form the difference quotient, rearrange it suitably, and then find its limit. But one often 
obtains the result more quickly by using general formulae for the derivatives of typical composite 
functions and special functions. To prove these, however, the steps described above must, in general, 
be followed. 


Derivatives of typical composite functions 


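Derivative of a product with a constant factor. If $y = c f(x)$, where the factor $c$ is a constant, then the factor $c$ can be taken out of the difference quotient: $\Delta y/\Delta x = c\,(\Delta f(x)/\Delta x)$. But if a quantity $a$ tends to the limit $a_0$, then $c a_0$ is the limit of $c a$. Hence

\[
[c f(x)]' = c f'(x) .
\]

Examples: $y = 6x^2$, $y' = 6 \cdot 2x = 12x$; $y = \pi \sin x$, $y' = \pi \cos x$;
$y = (2/3)\,x^3$, $y' = (2/3) \cdot 3x^2 = 2x^2$; $y = 2\sqrt{x}$, $y' = 2 \cdot \frac{1}{2\sqrt{x}} = \frac{1}{\sqrt{x}}$.

Derivative of a sum. The difference quotient of a sum $f(x) = u(x) + v(x)$ can be rearranged as follows:

\[
\frac{\Delta y}{\Delta x} = \frac{[u(x + \Delta x) + v(x + \Delta x)] - [u(x) + v(x)]}{\Delta x} = \frac{u(x + \Delta x) - u(x)}{\Delta x} + \frac{v(x + \Delta x) - v(x)}{\Delta x} .
\]

Because the limit of a sum is equal to the sum of the limits, it follows that

\[
\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = [u(x) + v(x)]' = u'(x) + v'(x) .
\]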

In this derivation each of the elements in the sum can again be a sum; thus, the theorem holds also 
for the sum of finitely many terms. 


The derivative of a sum of finitely many functions is equal to the sum of the derivatives of the 
individual functions. 


The rule holds also for differences, since subtraction of a function can be regarded as addition 
of the same function multiplied by the constant factor (—1). 


Derivative of a product. The difference quotient of a product $y = f(x) = u(x) \cdot v(x)$ can be rearranged as follows:

\[
\frac{\Delta y}{\Delta x} = \frac{u(x + \Delta x)\,v(x + \Delta x) - u(x)\,v(x + \Delta x) + u(x)\,v(x + \Delta x) - u(x)\,v(x)}{\Delta x}
= \frac{u(x + \Delta x) - u(x)}{\Delta x}\,v(x + \Delta x) + u(x)\,\frac{v(x + \Delta x) - v(x)}{\Delta x} .
\]

The operation of forming a limit can be interchanged with the operations of addition and multiplication; thus, the derivative is given by $f'(x) = u'(x) \cdot v(x) + u(x) \cdot v'(x)$.
For three factors $(v_1 v_2 v_3)' = (v_1 v_2)'\,v_3 + v_1 v_2 v_3' = v_1' v_2 v_3 + v_1 v_2' v_3 + v_1 v_2 v_3'$.

This rule can be generalized to $n$ factors by induction. In the special case when each factor is equal to $x$, it follows that $(x^n)' = 1 \cdot x^{n-1} + 1 \cdot x^{n-1} + \ldots = n x^{n-1}$. With the help of the sum, the product and the power rules, every polynomial function can be differentiated.


The derivative of a polynomial function of degree $n$ is a polynomial function of degree $n - 1$.

Example 1: $y = (3x^2 - 5x + 6)(4x^2 + 3x - 7) = u \cdot v$,
$y' = (6x - 5)(4x^2 + 3x - 7) + (8x + 3)(3x^2 - 5x + 6) = u'v + v'u$,
$y' = 48x^3 - 33x^2 - 24x + 53$.

One arrives at the same result if one performs the multiplication first, giving $y = 12x^4 - 11x^3 - 12x^2 + 53x - 42$, and then differentiates using the sum rule.
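Both routes of Example 1 can be reproduced mechanically on coefficient lists. The sketch below assumes a representation that is not in the text: a polynomial is stored as a list of coefficients in ascending powers of $x$, and the helper names `poly_mul`, `poly_add` and `poly_diff` are hypothetical.

```python
def poly_mul(p, q):
    """Product of two polynomials given as ascending coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def poly_add(p, q):
    """Sum of two polynomials given as ascending coefficient lists."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def poly_diff(p):
    """Derivative by the power rule: c * x**k -> k * c * x**(k - 1)."""
    return [k * c for k, c in enumerate(p)][1:] or [0]


u = [6, -5, 3]   # 3x^2 - 5x + 6
v = [-7, 3, 4]   # 4x^2 + 3x - 7

# Route 1: multiply first, then differentiate term by term.
product = poly_mul(u, v)
via_expansion = poly_diff(product)

# Route 2: product rule u'v + uv'.
via_product_rule = poly_add(poly_mul(poly_diff(u), v), poly_mul(u, poly_diff(v)))

if __name__ == "__main__":
    print(product)
    print(via_expansion)
    print(via_product_rule)
```

The derivative list has one entry fewer than the product list, which reflects the statement that differentiation lowers the degree of a polynomial by one.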


If non-rational, for example, transcendental functions occur as terms in a sum or as factors, then 
derivatives that will be derived later must be used. 


Example 2: $y = x^2 \cdot \sin x$, $y' = 2x \sin x + x^2 \cos x$.

Example 3: $y = x^2 \cdot \ln x$; $y' = 2x \ln x + x^2 \cdot (1/x) = x(2\ln x + 1)$;
$y'' = 1 \cdot (2\ln x + 1) + x \cdot (2/x) = 2\ln x + 3$.

Example 4: $y = x \sin x \cos x = uvw$;
$y' = u'vw + uv'w + uvw' = 1 \cdot \sin x \cos x + x \cos x \cos x + x \sin x\,(-\sin x)$;
$y' = \sin x \cos x + x \cos 2x$.


Derivative of a quotient. Under the assumption that a quotient $y = f(x) = u(x)/v(x)$, $v(x) \neq 0$, is differentiable, its derivative can be deduced from the product rule. From $y = u/v$ it follows that $yv = u$ or $u' = y'v + yv'$. Thus,

\[
y' = (1/v)\,(u' - yv') = (1/v)\,[u' - (u/v) \cdot v'] = [u'v - uv']/v^2 .
\]

It is not hard to prove, starting out from $\frac{\Delta y}{\Delta x}$ as in the case of a product, that a quotient of two differentiable functions is, in fact, differentiable.


From the quotient rule one can establish the validity of the formula found for the derivative of $y = x^n$ for negative integer exponents also, $n = -m$, $m$ positive. Since $y = x^{-m} = 1/x^m$, one puts $u = 1$, $v = x^m$ and obtains:

\[
y' = \frac{0 - m x^{m-1}}{x^{2m}} = -m x^{-m-1} = n x^{n-1} .
\]

Example 1: In the function $y = \frac{3x^2 - 5}{x^4 + 2}$, take $u = 3x^2 - 5$, $v = x^4 + 2$. Because $u' = 6x$ and $v' = 4x^3$ one obtains

\[
y' = \frac{6x(x^4 + 2) - 4x^3(3x^2 - 5)}{(x^4 + 2)^2} = \frac{2x(6 + 10x^2 - 3x^4)}{(x^4 + 2)^2} .
\]

Example 2: In the function $y = \frac{x^3}{x^2 - 1}$, take $u = x^3$ and $v = x^2 - 1$. Because $u' = 3x^2$ and $v' = 2x$ one obtains

\[
y' = \frac{3x^2(x^2 - 1) - 2x \cdot x^3}{(x^2 - 1)^2} = \frac{x^4 - 3x^2}{(x^2 - 1)^2} .
\]

For the second derivative one must take $u = x^4 - 3x^2$ and $v = (x^2 - 1)^2$. Because $u' = 4x^3 - 6x$ and $v' = 4x(x^2 - 1)$ one obtains

\[
y'' = \frac{(4x^3 - 6x)(x^2 - 1)^2 - 4x(x^2 - 1)(x^4 - 3x^2)}{(x^2 - 1)^4} = \frac{2x(x^2 + 3)}{(x^2 - 1)^3} .
\]

Example 3: The derivative of the function $y = \frac{1 + \tan x}{1 - \tan x}$, with $u' = 1/\cos^2 x$, $v' = -1/\cos^2 x$, is found to be

\[
y' = \frac{(1 - \tan x)/\cos^2 x + (1 + \tan x)/\cos^2 x}{(1 - \tan x)^2} = \frac{2}{\cos^2 x\,(1 - \tan x)^2} = \frac{2}{1 - \sin 2x} .
\]

Chain rule. As already described (see Chapter 5.), $y = f[\varphi(x)]$ is a composite function if the domain of definition of the function consists of values of $x$ for which the values $t$ of the function $t = \varphi(x)$ belong to the domain of definition of the function $y = f(t)$. The difference quotient may be replaced by $\frac{\Delta y}{\Delta x} = \frac{\Delta y}{\Delta t} \cdot \frac{\Delta t}{\Delta x}$. If the function $t = \varphi(x)$ is differentiable at the point $\xi$, so that $\frac{dt}{dx} = \varphi'(\xi)$ exists, and if further the function $y = f(t)$ has a derivative $\frac{dy}{dt} = f'(t)$ at the point $t = \varphi(\xi)$, then the composite function $y = f[\varphi(x)]$ is also differentiable at the point $\xi$. One obtains

\[
\frac{dy}{dx} = \frac{dy}{dt} \cdot \frac{dt}{dx} = \frac{df}{d\varphi} \cdot \frac{d\varphi}{dx}, \quad\text{or}\quad \{f[\varphi(x)]\}' = f'(t)\,\varphi'(x), \text{ where } t = \varphi(x) .
\]

This proof is valid only when $\Delta t \neq 0$.


Example 1: The function $y = (3x^2 + 5)^4$ is of the form $y = f(t) = t^4$, where $t = \varphi(x) = 3x^2 + 5$. It follows from the chain rule that

\[
\frac{dy}{dx} = \frac{dy}{dt} \cdot \frac{dt}{dx} = 4t^3 \cdot 6x = 4(3x^2 + 5)^3 \cdot 6x = 24x(3x^2 + 5)^3 .
\]

Example 2: In the function $y = \sqrt{5x^3 - 7x + 8}$, $y = f(t) = t^{1/2}$ and $t = \varphi(x) = 5x^3 - 7x + 8$. The chain rule gives

\[
y' = \frac{1}{2}\,t^{-1/2}\,(15x^2 - 7) = \frac{15x^2 - 7}{2\sqrt{5x^3 - 7x + 8}} .
\]

Example 3: In the function $y = \sin 2x$, $y = f(t) = \sin t$ and $t = \varphi(x) = 2x$. One finds that

\[
y' = \frac{dy}{dt} \cdot \frac{dt}{dx} = \cos t \cdot 2 = 2\cos 2x .
\]

Example 4: To differentiate the function $y = \ln \sin \sqrt{a + bx}$, the chain rule must be used several times. Putting $y = f(t) = \ln t$, $t = \varphi(u) = \sin u$, $u = \psi(v) = \sqrt{v}$ and $v = a + bx$, one obtains in succession

\[
y' = \frac{df}{dt} \cdot \frac{d\varphi}{du} \cdot \frac{d\psi}{dv} \cdot \frac{dv}{dx} = \frac{1}{t} \cdot \cos u \cdot \frac{1}{2\sqrt{v}} \cdot b
= \frac{1}{\sin\sqrt{a + bx}} \cdot \cos\sqrt{a + bx} \cdot \frac{b}{2\sqrt{a + bx}} = \frac{b \cot\sqrt{a + bx}}{2\sqrt{a + bx}} .
\]
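A result obtained by repeated use of the chain rule, such as Example 4, can be cross-checked against a symmetric difference quotient. In the sketch below the parameter values $a = 1$, $b = 2$, the point $x = 0.5$, the step size and the function names are all assumptions chosen for illustration; they are selected so that $\sin\sqrt{a + bx} > 0$ and the logarithm is defined.

```python
import math


def y(x, a, b):
    """y = ln sin sqrt(a + b x), defined where sin sqrt(a + b x) > 0."""
    return math.log(math.sin(math.sqrt(a + b * x)))


def y_prime(x, a, b):
    """Derivative from the chain rule: b * cot(sqrt(a + bx)) / (2 * sqrt(a + bx))."""
    root = math.sqrt(a + b * x)
    return b * math.cos(root) / (2.0 * root * math.sin(root))


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    a, b, x = 1.0, 2.0, 0.5  # illustrative values (assumptions)
    numeric = central_difference(lambda s: y(s, a, b), x)
    print(numeric, y_prime(x, a, b))
```

The symmetric quotient is used here only because it is usually more accurate than the one-sided difference quotient for the same step; it is a numerical check, not a proof.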
Logarithmic differentiation. In certain cases it is more advantageous to differentiate not the given function $y = f(x)$, but its natural logarithm $\ln f(x)$. The chain rule gives

\[
\frac{d}{dx}\ln f(x) = \frac{f'(x)}{f(x)} \quad\text{or}\quad f'(x) = f(x) \cdot \frac{d}{dx}\ln f(x) .
\]

Example 1: The natural logarithm of the function $y = x^x$ for positive values of $x$ is $\ln x^x = x \ln x$. The product rule gives $\frac{d}{dx}\ln x^x = 1 \cdot \ln x + x \cdot \frac{1}{x}$. Thus, the derivative of the given function is $y' = x^x(\ln x + 1)$.

Example 2: For positive values of $x$ the function $y = x^{1/x}$ has the derivative

\[
y' = x^{1/x} \cdot \frac{d}{dx}\left(\ln x^{1/x}\right) = x^{1/x} \cdot \frac{d}{dx}\left(\frac{1}{x} \cdot \ln x\right) = x^{1/x} \cdot \frac{1 - \ln x}{x^2} = x^{(1 - 2x)/x}(1 - \ln x) .
\]

Example 3: The function $y = x^n\,[\varphi(x)]^{1/m} \sin^2 x$ has three factors. Its natural logarithm is $\ln y = n \ln x + (1/m) \cdot \ln\varphi(x) + 2 \ln\sin x$, whose derivative is

\[
\frac{d}{dx}\ln y = \frac{n}{x} + \frac{\varphi'(x)}{m\,\varphi(x)} + \frac{2\cos x}{\sin x} .
\]

From this the derivative of the given function is obtained as

\[
y' = x^n\,[\varphi(x)]^{1/m} \sin^2 x \left[\frac{n}{x} + \frac{\varphi'(x)}{m\,\varphi(x)} + 2\cot x\right] .
\]

Example 4: For $y = (\sin x)^x$ one obtains
$y' = (\sin x)^x\,(\ln\sin x + x\cos x/\sin x) = (\sin x)^x\,(\ln\sin x + x\cot x)$.
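The identity $f'(x) = f(x) \cdot \frac{d}{dx}\ln f(x)$ also lends itself to a numerical check of Examples 1 and 2. The sketch below approximates the derivative of $\ln f$ by a symmetric difference quotient; the helper name `derivative_via_logarithm`, the step size and the test point $x = 2$ are assumptions made for illustration, and the method requires $f(x) > 0$.

```python
import math


def derivative_via_logarithm(f, x, h=1e-6):
    """Approximate f'(x) as f(x) * d/dx ln f(x); requires f > 0 near x."""
    d_log = (math.log(f(x + h)) - math.log(f(x - h))) / (2.0 * h)
    return f(x) * d_log


def x_to_x(x):
    return x ** x


def x_to_x_prime(x):
    # Example 1: y' = x^x (ln x + 1)
    return x ** x * (math.log(x) + 1.0)


def x_root_x(x):
    return x ** (1.0 / x)


def x_root_x_prime(x):
    # Example 2: y' = x^((1 - 2x)/x) (1 - ln x)
    return x ** ((1.0 - 2.0 * x) / x) * (1.0 - math.log(x))


if __name__ == "__main__":
    x = 2.0  # illustrative point (an assumption)
    print(derivative_via_logarithm(x_to_x, x), x_to_x_prime(x))
    print(derivative_via_logarithm(x_root_x, x), x_root_x_prime(x))
```

The derivative of Example 2 vanishes at $x = e$, where $1 - \ln x = 0$.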


Derivatives of mutually inverse functions. If a function $y = f(x)$ is monotonic and continuous in an interval $a < x < b$ and has a finite and non-zero derivative $f'(x)$ for every $x$ in the interval, then the function $x = \varphi(y)$ inverse to $y = f(x)$ is also differentiable in the corresponding y-interval, and $f'(x) \cdot \varphi'(y) = 1$.

By the assumptions made, $\Delta x$ and $\Delta y$ in each of the two difference quotients $\Delta x/\Delta y$ and $\Delta y/\Delta x$ have the same values, so that $\frac{\Delta y}{\Delta x} \cdot \frac{\Delta x}{\Delta y} = 1$. By the assumption that $\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = f'(x)$ exists and is different from zero, the limit $\lim_{\Delta y \to 0} \frac{\Delta x}{\Delta y}$ also exists and has the value $\frac{1}{f'(x)}$. For the geometrical interpretation one interchanges the variables in $x = \varphi(y)$. The curve $y = \varphi(x)$ is then obtained by taking the mirror image of the curve of $y = f(x)$ in the line of symmetry $x = y$ of the coordinate system. If a tangent to the curve of $y = f(x)$ makes an angle $\alpha$ with the positive x-axis, then the corresponding tangent to the curve of $y = \varphi(x)$ makes the same angle $\alpha$ with the positive y-axis, that is, the angle $\beta = \pi/2 - \alpha$ with the positive x-axis. But for these complementary angles $\tan\alpha \cdot \tan\beta = 1$, or $f'(x) \cdot \varphi'(y) = 1$ (Fig.).

19.2-1 Slope of the curves of mutually inverse functions $y = f(x)$ and $y = \varphi(x)$ or $x = \varphi(y)$


19.2-2 Graphs of the functions y = x* and y = x, 
a whose inverse functions are not differentiable for 
| x=0 


If $f'(x) = 0$ in an interval, then the function $f(x)$ certainly does not have a unique inverse there, because in that case a single value of $y$ corresponds to all the values of $x$ in the interval. But if $f'(x) = 0$ only for individual points $x_i$ in the interval in which the inverse function $x = \varphi(y)$ corresponding to the function $f(x)$ is determined, then because $f(x)$ is monotonic, $f'(x)$ cannot change its sign in passing through these points $x_i$. If, on the other hand, $f'(x_i) = 0$ and $f'(x)$ always has the same sign in a neighbourhood of the point $x_i$, then $f(x)$ certainly has an inverse in this neighbourhood, but the inverse is not differentiable at the point corresponding to $x_i$: for example, the function $y = x^3$ at the point $x_i = 0$ (Fig.).

The inverse function rule for differentiation is used to find the derivative of a function when the 
derivative of its inverse function is already known; for example, those of the logarithm function, 
of the circular functions, and of the inverse hyperbolic functions. 


Differentiation of functions in parametric form. A parametric representation of a function $y = f(x)$ is given by $x = \varphi(t)$ and $y = \psi(t)$. One can then also express $y$ as a composite function $y = f[\varphi(t)]$ of the parameter $t$, and the chain rule for differentiation yields $\frac{dy}{dt} = \frac{dy}{dx} \cdot \frac{dx}{dt}$. In this calculation it is assumed that the functions $\varphi(t)$ and $\psi(t)$ are differentiable with respect to the parameter $t$ and that $\varphi'(t) \neq 0$. Hence

\[
\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{\psi'(t)}{\varphi'(t)} .
\]

Example 1: The ellipse with the equation $x^2/a^2 + y^2/b^2 = 1$ has the parametric representation $x = a\cos t$ and $y = b\sin t$. From the derivatives with respect to the parameter, $\frac{dx}{dt} = -a\sin t$ and $\frac{dy}{dt} = b\cos t$, the derivative $\frac{dy}{dx} = -\frac{b\cos t}{a\sin t} = (-b/a)\cot t$ is found. Because $\cos t = x/a$ and $\sin t = y/b$, one obtains $\frac{dy}{dx} = -b^2x/(a^2y)$ as the slope of the tangent at the point $P(x, y)$ to the given ellipse.

Example 2: The common cycloid has the parametric representation $x = a(t - \sin t)$ and $y = a(1 - \cos t)$. Because $\frac{dx}{dt} = a(1 - \cos t) = 2a\sin^2(t/2)$ and $\frac{dy}{dt} = a\sin t = 2a\sin(t/2)\cos(t/2)$, its derivative $\frac{dy}{dx}$ is given by $\frac{dy}{dx} = \cot(t/2)$. It follows from this result that at each of the points with $t = 2k\pi$ $(k = 0, \pm 1, \pm 2, \ldots)$ at which it meets the x-axis the common cycloid has a cusp with a tangent perpendicular to the x-axis.




Differentiation of functions in polar coordinates. If $r = r(\vartheta)$ is the representation of a function in polar coordinates, then by means of the relations $x = r\cos\vartheta$ and $y = r\sin\vartheta$ between the polar coordinates and the Cartesian coordinates one can go over to a parametric representation of the function with the parameter $\vartheta$: $x = r(\vartheta)\cos\vartheta$ and $y = r(\vartheta)\sin\vartheta$. Thus, its derivative is given by $\frac{dy}{dx} = \frac{dy/d\vartheta}{dx/d\vartheta}$. If a dot denotes differentiation with respect to the parameter, $\frac{dr}{d\vartheta} = \dot{r}$, then $\frac{dy}{d\vartheta} = \dot{r}\sin\vartheta + r\cos\vartheta$ and $\frac{dx}{d\vartheta} = \dot{r}\cos\vartheta - r\sin\vartheta$, so that

\[
\frac{dy}{dx} = \frac{\dot{r}\sin\vartheta + r\cos\vartheta}{\dot{r}\cos\vartheta - r\sin\vartheta} .
\]

Example: The logarithmic spiral has the equation $r = a e^{k\vartheta}$. By the above rule its derivative is:

\[
\frac{dy}{dx} = \frac{ak e^{k\vartheta}\sin\vartheta + a e^{k\vartheta}\cos\vartheta}{ak e^{k\vartheta}\cos\vartheta - a e^{k\vartheta}\sin\vartheta} = \frac{k\sin\vartheta + \cos\vartheta}{k\cos\vartheta - \sin\vartheta} .
\]

This result shows that the direction of the tangent depends only on $\vartheta$, so that an arbitrary position vector making an angle $\vartheta_0$ with the positive x-axis cuts the spiral at a constant angle $\varphi_0$.

In order to calculate the angle $\varphi$ between the tangent and the position vector $OP$ in the general case, one takes the relation $\varphi = \alpha - \vartheta$ (Fig.) and deduces that

\[
\tan\varphi = \tan(\alpha - \vartheta) = \frac{\tan\alpha - \tan\vartheta}{1 + \tan\alpha\tan\vartheta} = \frac{\frac{dy}{dx} - \tan\vartheta}{1 + \frac{dy}{dx}\tan\vartheta} = \frac{y'\cos\vartheta - \sin\vartheta}{y'\sin\vartheta + \cos\vartheta} = \frac{r}{\dot{r}},
\]

where the last expression is obtained by applying the rule for the derivative of a function in polar coordinates and then solving for $r/\dot{r}$.

19.2-3 Angle $\varphi$ between the tangent to a curve and the position vector $OP$

If one applies this result to the logarithmic spiral (see 19.5. - Special curves), one obtains

\[
\tan\varphi = (a e^{k\vartheta})/(ak e^{k\vartheta}) = 1/k .
\]

This means that the logarithmic spiral cuts all radii vectors at the same angle $\varphi = \arctan(1/k)$. For this reason the cutting edges of the knife discs of certain cutting machines have the form of a logarithmic (or equiangular) spiral, to ensure a constant cutting angle.
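The constancy of the cutting angle can be illustrated numerically by computing the tangent direction from the Cartesian components and subtracting the polar angle. The following sketch is an assumption-laden illustration: the function name `spiral_tan_phi` and the parameter values $a = 1$, $k = 0.5$ are chosen here and do not appear in the text.

```python
import math


def spiral_tan_phi(a, k, theta):
    """tan of the angle between tangent and position vector for r = a * exp(k * theta)."""
    r = a * math.exp(k * theta)
    r_dot = a * k * math.exp(k * theta)  # dr/dtheta
    x_dot = r_dot * math.cos(theta) - r * math.sin(theta)
    y_dot = r_dot * math.sin(theta) + r * math.cos(theta)
    alpha = math.atan2(y_dot, x_dot)     # direction angle of the tangent
    return math.tan(alpha - theta)       # phi = alpha - theta


if __name__ == "__main__":
    a, k = 1.0, 0.5  # illustrative parameters (assumptions)
    for theta in (0.0, 1.0, 2.5, -1.0):
        # Each value should be close to 1 / k = 2.
        print(theta, spiral_tan_phi(a, k, theta))
```

Since the tangent function has period $\pi$, the branch of the angle returned by `atan2` does not affect the result.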


Differentiation of implicit functions. It is often necessary to differentiate a function defined implicitly by $F(x, y) = 0$. For this purpose the expression $F(x, y)$, as a function of two variables, must be differentiable with respect to $y$ for fixed $x$, and with respect to $x$ for fixed $y$. Furthermore, if there exists a continuous explicit form $y = f(x)$ for the given function, then for $\frac{\partial F(x, y)}{\partial y} \neq 0$, $y = f(x)$ is also differentiable and its derivative $y' = f'(x)$ can be obtained by means of the following formula (see Derivatives of functions of several variables) without first finding the explicit form $y = f(x)$:

\[
y' = f'(x) = -\frac{\partial F/\partial x}{\partial F/\partial y} .
\]

The round letters $\partial$ are used to indicate that the partial derivatives of the function $F(x, y)$ are intended.

Example: The slope of the hyperbola given by $F(x, y) = 2x^2 - y^2 + 12x - 2y + 3 = 0$ at the point $P_0(2, 5)$ is given by the derivative of the function $y = f(x)$ at the point $x_0 = 2$. From $\frac{\partial F}{\partial x} = 4x + 12$, $\frac{\partial F}{\partial y} = -2y - 2$, it follows that $f'(x) = -(4x + 12)/(-2y - 2) = (2x + 6)/(y + 1)$ and $f'(2) = 5/3$.
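The hyperbola example can be verified in two independent ways: by evaluating the formula with the partial derivatives, and by differentiating an explicit branch numerically. In the sketch below the function names are hypothetical, and the explicit branch $y = -1 + \sqrt{2x^2 + 12x + 4}$ is obtained here by completing the square in $y$; it is an added step that does not appear in the text.

```python
import math


def F(x, y):
    return 2 * x**2 - y**2 + 12 * x - 2 * y + 3


def F_x(x, y):
    return 4 * x + 12   # partial derivative with respect to x


def F_y(x, y):
    return -2 * y - 2   # partial derivative with respect to y


def implicit_slope(x, y):
    """y' = -F_x / F_y, valid only where F_y is non-zero."""
    fy = F_y(x, y)
    if fy == 0:
        raise ZeroDivisionError("F_y vanishes; the formula does not apply")
    return -F_x(x, y) / fy


def explicit_branch(x):
    # Branch through P0(2, 5): (y + 1)^2 = 2x^2 + 12x + 4 (assumption: completed square)
    return -1.0 + math.sqrt(2.0 * x * x + 12.0 * x + 4.0)


if __name__ == "__main__":
    print(F(2, 5))                # the point lies on the curve
    print(implicit_slope(2, 5))   # slope from the implicit formula
    h = 1e-6
    print((explicit_branch(2 + h) - explicit_branch(2 - h)) / (2 * h))
```

The implicit formula gives the slope without solving for $y$, which is its main advantage when no convenient explicit form exists.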


Derivatives of special functions 


Derivative of the constants and of the power functions. Because every difference quotient of a constant function vanishes, its derivative is also equal to zero. For the derivative of the power function $y = x^n$, the product rule gives $y' = nx^{n-1}$ if $n$ is a positive integer. Combining this with the quotient rule, this result can be extended to negative integer exponents. The exponent $n$ of the power function may also be a rational number $p/q$, or in general a real number $\alpha$ (see Chapter 2.). One then defines $y = x^{p/q} = (x^p)^{1/q}$ or $y = x^\alpha = e^{\alpha\ln x}$, where the variable $x$ is restricted to positive values. Using the chain rule, the derivative of the power function for arbitrary $\alpha$ is found to be

\[
\frac{dy}{dx} = e^{\alpha\ln x} \cdot \alpha \cdot (1/x) = \alpha \cdot (x^\alpha/x) = \alpha x^{\alpha - 1} .
\]

Independently of this result, the derivative of a power with rational exponent $p/q$, in which the integers $p$ and $q$ have no common factor, can be obtained directly in the following steps, using the inverse function rule:

\[
y_1 = x^{1/q}, \quad x = y_1^q, \quad \frac{dy_1}{dx} = \frac{1}{dx/dy_1} = \frac{1}{q\,y_1^{q-1}};
\]
\[
y = x^{p/q} = y_1^p, \quad \frac{dy}{dx} = p\,y_1^{p-1} \cdot \frac{dy_1}{dx} = (p/q) \cdot y_1^{p-q} = (p/q)\,x^{(p-q)/q} = (p/q)\,x^{p/q - 1} .
\]


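Examples:

function              derivative              function              derivative
$y = x$               $y' = 1$                $y = x^{-4}$          $y' = -4x^{-5}$
$y = 2x - 1$          $y' = 2$                $y = x^{1/3}$         $y' = 1/(3x^{2/3})$
$y = -x/2 + 2$        $y' = -1/2$             $y = x^{\sqrt{2}}$    $y' = \sqrt{2}\,x^{\sqrt{2}-1}$
$y = x^{15}$          $y' = 15x^{14}$         $y = x^{\pi}$         $y' = \pi x^{\pi-1}$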


Derivative of the exponential function. The limit, as x tends to zero, of $(e^x - 1)/x$ was found to be 1 (see Chapter 18.). But this expression occurs in the difference quotient of the exponential function $y = e^x$:

$$\frac{\Delta y}{\Delta x} = \frac{e^{x+\Delta x} - e^x}{\Delta x} = \frac{e^x e^{\Delta x} - e^x}{\Delta x} = e^x \cdot \frac{e^{\Delta x} - 1}{\Delta x}.$$

Here $\Delta x$ tends to zero, but $e^x$ is a constant for each arbitrary, fixed value of x. The difference quotient therefore tends to the limit $e^x$. Apart from constant multiples, it is the only function that is equal to its own derivative. For this reason the exponential function $y = e^x$ is appropriate for the description of natural events, for which the variation y' of the given variable y is equal or proportional to y, for example, in the decay of radioactive material. From the chain rule it follows that $\frac{d}{dx} e^{kx} = k e^{kx}$ (k is the factor of proportionality).

The derivative of the general exponential function $y = a^x = e^{x \ln a}$ is given by the chain rule:

$$\frac{dy}{dx} = e^{x \ln a} \cdot \ln a = a^x \ln a.$$
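The passage to the limit can be watched numerically. The sketch below is an illustration only: the function name `difference_quotient` and the chosen step sizes are assumptions, and floating-point rounding limits how small the step can usefully be made.

```python
import math


def difference_quotient(f, x, dx):
    """Forward difference quotient (f(x + dx) - f(x)) / dx."""
    return (f(x + dx) - f(x)) / dx


x = 1.0
for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    q = difference_quotient(math.exp, x, dx)
    # The gap to e**x shrinks roughly in proportion to dx.
    print(dx, q, q - math.exp(x))


def general_exponential(t, a=2.0):
    """General exponential function a**t; its derivative is a**t * ln(a)."""
    return a**t


print(difference_quotient(general_exponential, 3.0, 1e-6), 8.0 * math.log(2.0))
```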


Derivative of the logarithmic function. The inverse function of the logarithmic function $y = \log_a x$ is the function $x = a^y$, whose derivative is $dx/dy = a^y \ln a$. The function $x = a^y$ is monotonic and its derivative is never zero for finite values of the variable. It follows that the reciprocal of its derivative is the derivative of the logarithm (see Chapter 2.):

$$\frac{dy}{dx} = 1 \Big/ \frac{dx}{dy} = 1/(a^y \ln a) = 1/(x \ln a) = (1/x) \log_a e.$$

$$\frac{d}{dx} \log_a x = \frac{1}{x} \log_a e, \qquad \frac{d}{dx} \ln x = \frac{1}{x}.$$


Derivatives of the trigonometric functions. For $y = \sin x$ the difference quotient is formed and rearranged in preparation for the passage to the limit:

$$\frac{\Delta y}{\Delta x} = \frac{\sin(x + \Delta x) - \sin x}{\Delta x} = \frac{2\cos(x + \Delta x/2)\,\sin(\Delta x/2)}{\Delta x} = \cos(x + \Delta x/2) \cdot \frac{\sin(\Delta x/2)}{\Delta x/2}.$$

Here the formula of trigonometry

$$\sin\alpha - \sin\beta = 2\cos[(\alpha + \beta)/2]\,\sin[(\alpha - \beta)/2]$$

is used, with $x + \Delta x = \alpha$ and $x = \beta$.
As $\Delta x \to 0$, the quotient $\sin(\Delta x/2)/(\Delta x/2)$ tends to the limit 1. Thus, $\Delta y/\Delta x$ tends to the limit cos x, as $\Delta x \to 0$, since the cosine function is continuous.




The derivative of the function y = cos x is obtained by a corresponding rearrangement. Because y = tan x = sin x/cos x and y = cot x = cos x/sin x, the derivatives of these functions can be found using the quotient rule. The derivatives are valid, however, only for values of x for which cos x or sin x, respectively, is different from zero, and hence not for the values $x = (2k + 1)\pi/2$ or $x = 2k \cdot \pi/2 = k\pi$, where k can be any integer.


Derivatives of the circular functions. The function y = Arcsin x with $-1 \le x \le +1$ and $-\pi/2 \le y \le +\pi/2$ is the inverse of the function x = sin y, which is continuous and monotonic in the given interval. Hence its derivative is given by

$$\frac{dy}{dx} = 1 \Big/ \frac{dx}{dy} = 1/\cos y = 1/\sqrt{1 - x^2}.$$

Because of the condition $dx/dy \neq 0$ it is necessary to restrict the result to the open interval $-1 < x < +1$.

If the function x = sin y is inverted in another of its intervals of monotonicity, for example, in $-\pi/2 + k\pi \le y \le +\pi/2 + k\pi$ with k an integer, then $y = (-1)^k \operatorname{Arcsin} x + k\pi$ is the inverse function and its derivative is $\frac{dy}{dx} = \frac{(-1)^k}{\sqrt{1 - x^2}}$. Similarly the function y = Arccos x with $-1 \le x \le +1$ and $0 \le y \le \pi$ is the inverse of the function x = cos y, and its derivative in the interval $-1 < x < +1$ is $\frac{dy}{dx} = -\frac{1}{\sqrt{1 - x^2}}$. The inverse of the function x = cos y in the interval $k\pi \le y \le (k + 1)\pi$, where k is an integer, in which it is monotonic, is $y = \operatorname{Arccos} x + k\pi$ for even k and $y = -\operatorname{Arccos} x + (k + 1)\pi$ for odd k. In both cases it has the derivative $\frac{dy}{dx} = \frac{(-1)^{k+1}}{\sqrt{1 - x^2}}$. In a similar manner the derivatives of y = Arctan x for $-\pi/2 < y < +\pi/2$ and y = Arccot x for $0 < y < \pi$ can be obtained, and the method is also valid for the intervals $-\pi/2 + k\pi < y < +\pi/2 + k\pi$ and $k\pi < y < (k + 1)\pi$, respectively.


Derivatives of the hyperbolic functions. These functions are defined as rational functions of the exponential function, and can therefore be differentiated using the sum and quotient rules. For example,

$$\frac{d \sinh x}{dx} = \frac{d}{dx}\,(e^x - e^{-x})/2 = (e^x + e^{-x})/2 = \cosh x.$$

From y = tanh x = sinh x/cosh x,

$$\frac{d}{dx}\,\frac{\sinh x}{\cosh x} = \frac{\cosh^2 x - \sinh^2 x}{\cosh^2 x} = \frac{1}{\cosh^2 x}, \quad \text{where } \cosh^2 x - \sinh^2 x = 1.$$


Derivatives of the inverse hyperbolic functions. By the inverse function rule, the derivatives of the inverse hyperbolic functions can be found from the relation $\frac{dy}{dx} = 1 \Big/ \frac{dx}{dy}$; for example,

$$\frac{d \sinh^{-1} x}{dx} = 1 \Big/ \frac{d \sinh y}{dy} = 1/\cosh y = 1/\sqrt{1 + x^2}.$$

A similar procedure is used for $y = \tanh^{-1} x$ in the domain of definition $|x| < 1$, and for $y = \coth^{-1} x$ in $|x| > 1$. One obtains the derivatives

$$\frac{d \tanh^{-1} x}{dx} = \frac{1}{1 - x^2} \ (|x| < 1), \qquad \frac{d \coth^{-1} x}{dx} = \frac{1}{1 - x^2} \ (|x| > 1),$$

which represent different functions in spite of their formal equality, because they have different domains of definition. The function $y = \cosh^{-1} x$ is the inverse function of x = cosh y in the interval of monotonicity $0 \le y < +\infty$; hence

$$\frac{d \cosh^{-1} x}{dx} = 1 \Big/ \frac{d \cosh y}{dy} = 1/\sinh y = 1/\sqrt{x^2 - 1}.$$

The function is differentiable for all x in the domain of definition $x \ge 1$ with the exception of x = 1. In the interval of monotonicity $-\infty < y \le 0$, x = cosh y has the inverse function $y = -\cosh^{-1} x$, and consequently its derivative is $\frac{dy}{dx} = -\frac{1}{\sqrt{x^2 - 1}}$.


Derivative of an integral with respect to a limit. In the integral $\int_a^x f(\xi)\,d\xi$, the number a is fixed and the upper limit x is variable; consequently the integral is a function $\Phi(x)$ of its upper limit. The derivative of the integral with respect to this variable upper limit is equal to the function value f(x) of the integrand at this upper limit (see Chapter 20.):

$$\frac{d}{dx} \int_a^x f(\xi)\,d\xi = f(x).$$


19.3. Derivatives of functions of several variables 


Partial derivatives of a function 


In the function $z = f(x_1, x_2, \ldots, x_n)$, $x_1, x_2, \ldots, x_n$ denote variables that are independent of one another; for example, in z = f(x, y) the variables are $x_1 = x$ and $x_2 = y$, and in z = f(u, v, w), $x_1 = u$, $x_2 = v$, $x_3 = w$. If one regards all the variables except one, $x_i$ say, as constants $x_{1,0}, x_{2,0}, \ldots, x_{i-1,0}, x_{i+1,0}, \ldots, x_{n,0}$, then the function becomes a function of one variable. If this function is differentiable, one can form a partial derivative, denoted by

$$\frac{\partial z}{\partial x_i} = \frac{\partial}{\partial x_i}\,f(x_{1,0}, \ldots, x_i, \ldots, x_{n,0}) = f_{x_i};$$

the round letters $\partial$ indicate the partial derivative (read partial d f by d $x_i$). For z = f(x, y), for example,

$$\frac{\partial}{\partial x} f(x, y_0) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y_0) - f(x, y_0)}{\Delta x}, \qquad \frac{\partial}{\partial y} f(x_0, y) = \lim_{\Delta y \to 0} \frac{f(x_0, y + \Delta y) - f(x_0, y)}{\Delta y},$$

whenever the two limits exist. The rules for differentiation of a function of one variable hold with respect to the single non-fixed variable $x_i$.


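Example 1: $z = f(x, y) = x^3 + 7x^2y + 3xy^5 - 5y^6$;

$$\frac{\partial z}{\partial x} = f_x(x, y) = 3x^2 + 14xy + 3y^5, \qquad \frac{\partial z}{\partial y} = f_y(x, y) = 7x^2 + 15xy^4 - 30y^5.$$

Example 2: $z = f(x, y) = \arctan(x/y)$;

$$\frac{\partial z}{\partial x} = \frac{1/y}{1 + (x/y)^2} = \frac{y}{x^2 + y^2}, \qquad \frac{\partial z}{\partial y} = \frac{-x/y^2}{1 + (x/y)^2} = -\frac{x}{x^2 + y^2}.$$

Example 3: $w = f(x, y, z) = \sqrt{x^2 + y^2 + z^2}$;

$$\frac{\partial w}{\partial x} = \frac{x}{\sqrt{x^2 + y^2 + z^2}}, \qquad \frac{\partial w}{\partial y} = \frac{y}{\sqrt{x^2 + y^2 + z^2}}, \qquad \frac{\partial w}{\partial z} = \frac{z}{\sqrt{x^2 + y^2 + z^2}}.$$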
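The definition of the partial derivative as a limit of difference quotients, with the other variables held fixed, can be sketched in a few lines. This is an illustration only: the helper `partial` and its step size are assumptions, and the comparison point (1, 2) is chosen arbitrarily.

```python
def f(x, y):
    """Function from Example 1."""
    return x**3 + 7 * x**2 * y + 3 * x * y**5 - 5 * y**6


def partial(func, point, i, h=1e-5):
    """Central difference quotient in the i-th variable, all others fixed.

    Hypothetical helper for illustration; h is an assumed step size.
    """
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2 * h)


x0, y0 = 1.0, 2.0
fx_exact = 3 * x0**2 + 14 * x0 * y0 + 3 * y0**5           # f_x from Example 1
fy_exact = 7 * x0**2 + 15 * x0 * y0**4 - 30 * y0**5       # f_y from Example 1
print(partial(f, (x0, y0), 0), fx_exact)
print(partial(f, (x0, y0), 1), fy_exact)
```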

Geometrical significance of the partial derivatives of a function of two variables. A function z = f(x, y) of two variables can, in general, be represented by a surface in space. The assumption $y = y_0 = \text{const}$ selects the points of this surface that lie at the same time in the plane $y = y_0$ parallel to the x, z-plane. They form a plane curve, and $z_x = \frac{\partial}{\partial x} f(x, y_0)$ is the slope of the tangent $t_1$ to this curve at the point $(x, y_0)$: $z_x = \tan \varphi_{x,y_0}$. The angle $\varphi_{x,y_0}$ describes the slope of $t_1$ to the +x-axis (Fig.). Similarly $z_y = \frac{\partial}{\partial y} f(x_0, y) = \tan \psi_{x_0,y}$ is the slope with respect to the +y-axis of the tangent $t_2$ to the curve in which the surface z = f(x, y) is cut by the plane $x = x_0$ parallel to the y, z-plane. At each point P of the surface both a tangent $t_1$ determined by $z_x$ and a tangent $t_2$ determined by $z_y$ are defined. Under certain assumptions, which are usually satisfied in practice, the two tangents span a tangent plane to the surface at the point P.

19.3-1 Geometrical significance of the partial derivatives of a function of two variables


Partial derivatives of higher order. Each partial derivative is again a function of the same variables, and can itself have partial derivatives if the limits of the corresponding difference quotients exist. These are called partial derivatives of higher order, for example, of the second, third, ..., nth order. Partial derivatives with respect to different variables are called mixed derivatives. From z = f(x, y), for example, by differentiating $\frac{\partial z}{\partial x} = f_x$ and $\frac{\partial z}{\partial y} = f_y$ one obtains four derivatives of the second order:

$$f_{xx} = \frac{\partial f_x}{\partial x} = \frac{\partial^2 z}{\partial x^2}, \quad f_{xy} = \frac{\partial f_x}{\partial y} = \frac{\partial^2 z}{\partial x\,\partial y}, \quad f_{yx} = \frac{\partial f_y}{\partial x} = \frac{\partial^2 z}{\partial y\,\partial x}, \quad f_{yy} = \frac{\partial f_y}{\partial y} = \frac{\partial^2 z}{\partial y^2}.$$

From the way in which they are formed, the functions $f_{xy}$ and $f_{yx}$ are different from one another. But Leonhard EULER (1707-1783) already knew conditions under which they are equal; Hermann Amandus SCHWARZ (1843-1921) proved the theorem named after him.


Theorem of Schwarz: If the mixed partial derivatives of the second order $f_{xy}$ and $f_{yx}$ of a function f(x, y) are continuous functions of x and y in a domain D, then they are equal to one another in the interior of this domain: $f_{xy} = f_{yx}$.


The continuity of a function u = f(x, y) at the point $(x_0, y_0)$ means that corresponding to an arbitrary prescribed $\varepsilon > 0$ there always exists a positive number $\delta = \delta(\varepsilon)$ such that $|f(x, y) - f(x_0, y_0)| < \varepsilon$ for all pairs of numbers (x, y) satisfying $(x - x_0)^2 + (y - y_0)^2 < \delta^2$. Geometrically this means that the values of the function f(x, y) differ by an arbitrarily small amount from $f(x_0, y_0)$ provided that the argument (x, y) is chosen within a sufficiently small circle with centre at $(x_0, y_0)$. If f(x, y) is continuous at every point of a domain D, then the function is continuous in D. The theorem of Schwarz holds also for partial derivatives of higher order and also for functions of more than two variables; for example, for z = f(x, y), $f_{xxy} = f_{xyx} = f_{yxx}$ and $f_{xyy} = f_{yxy} = f_{yyx}$.


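Example 1:
$z = f(x, y) = x^3 + 7x^2y + 3xy^5 - 5y^6$; $f_x = 3x^2 + 14xy + 3y^5$;
$f_y = 7x^2 + 15xy^4 - 30y^5$; $f_{xx} = 6x + 14y$; $f_{xy} = 14x + 15y^4 = f_{yx}$;
$f_{yy} = 60xy^3 - 150y^4$; $f_{xxx} = 6$; $f_{xxy} = f_{xyx} = f_{yxx} = 14$;
$f_{xyy} = f_{yxy} = f_{yyx} = 60y^3$; $f_{yyy} = 180xy^2 - 600y^3$.


Example 2:
$z = f(x, y) = \arctan(x/y)$; $f_x = \frac{y}{x^2 + y^2}$; $f_y = -\frac{x}{x^2 + y^2}$;

$$f_{xx} = -\frac{2xy}{(x^2 + y^2)^2}; \qquad f_{xy} = f_{yx} = \frac{x^2 - y^2}{(x^2 + y^2)^2}; \qquad f_{yy} = \frac{2xy}{(x^2 + y^2)^2}.$$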


Total differential 


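Total differential of the first order. If z = f(x, y) is a function of the two variables x and y and if its partial derivatives of the first order $f_x$ and $f_y$ exist, then these are the limits of the difference quotients $\frac{\Delta_x z}{\Delta x}$ and $\frac{\Delta_y z}{\Delta y}$. Thus, $\frac{\Delta_x z}{\Delta x} = \frac{\partial z}{\partial x} + \varphi_1(\Delta x)$ and $\frac{\Delta_y z}{\Delta y} = \frac{\partial z}{\partial y} + \varphi_2(\Delta y)$, where $\varphi_1(\Delta x)$ and $\varphi_2(\Delta y)$ denote the differences between the corresponding difference quotients and the partial derivatives, as in the case of the differential of a function f(x) of one variable. If these equations are solved in the same way for the increments of the function z = f(x, y), the equations $\Delta_x z = \frac{\partial z}{\partial x}\,\Delta x + \varphi_1(\Delta x) \cdot \Delta x$ and $\Delta_y z = \frac{\partial z}{\partial y}\,\Delta y + \varphi_2(\Delta y) \cdot \Delta y$ are obtained. In these expressions the terms $\varphi_1(\Delta x) \cdot \Delta x$ and $\varphi_2(\Delta y) \cdot \Delta y$ are small 'to a higher order' than the partial differentials $\frac{\partial z}{\partial x}\,dx$ and $\frac{\partial z}{\partial y}\,dy$. Consequently, from the total increment $\Delta z$ of the function z = f(x, y), if $\frac{\partial z}{\partial x}$ and $\frac{\partial z}{\partial y}$ are continuous and terms of higher order of smallness are neglected, one obtains the total differential dz:

$$\Delta z = f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y) + f(x, y + \Delta y) - f(x, y),$$

$$\Delta z = \frac{f(x + \Delta x, y + \Delta y) - f(x, y + \Delta y)}{\Delta x}\,\Delta x + \frac{f(x, y + \Delta y) - f(x, y)}{\Delta y}\,\Delta y$$

$$= \frac{\partial z}{\partial x}\,\Delta x + \varphi_1(\Delta x) \cdot \Delta x + \frac{\partial z}{\partial y}\,\Delta y + \varphi_2(\Delta y) \cdot \Delta y,$$

$$dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy.$$

The larger the chosen values of $\Delta x$ and $\Delta y$, the larger is the difference between the total differential dz and the total increment $\Delta z = f(x + \Delta x, y + \Delta y) - f(x, y)$. For the function $z = x^2 - y^2$ one has dz = 2(x dx - y dy). The table shows the difference $\Delta z - dz$ in the neighbourhood of the point x = 2, y = 1 for each of the pairs of values 2, 1 and 0.2, 0.1 for $\Delta x$ and $\Delta y$.

                           $\Delta x = 2$, $\Delta y = 1$     $\Delta x = 0.2$, $\Delta y = 0.1$
$(x + \Delta x)^2$                   16                       4.84
$(y + \Delta y)^2$                    4                       1.21
$f(x, y) + \Delta z$                 12                       3.63
$\Delta z$                            9                       0.63
$dz$                                  6                       0.60
$\Delta z - dz$                       3                       0.03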
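The comparison of the total increment with the total differential for $z = x^2 - y^2$ can be reproduced with a short sketch. The function names below are assumptions made for illustration; the numbers are those of the point x = 2, y = 1 discussed above.

```python
def f(x, y):
    """The function z = x**2 - y**2."""
    return x**2 - y**2


def total_differential(x, y, dx, dy):
    """dz = 2 (x dx - y dy), built from the partial derivatives 2x and -2y."""
    return 2 * (x * dx - y * dy)


def total_increment(x, y, dx, dy):
    """Exact change of the function value, f(x + dx, y + dy) - f(x, y)."""
    return f(x + dx, y + dy) - f(x, y)


for dx, dy in ((2.0, 1.0), (0.2, 0.1)):
    dz = total_differential(2.0, 1.0, dx, dy)
    delta_z = total_increment(2.0, 1.0, dx, dy)
    # For this function the gap equals dx**2 - dy**2, so it shrinks
    # quadratically as the increments are reduced.
    print(dx, dy, delta_z, dz, delta_z - dz)
```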


Geometrical significance of the differential of a function of two variables. The partial differential $d_x z$ represents the increment in the ordinate of the tangent to the curve $z = f(x, y_0)$ and the partial differential $d_y z$ the increment in the ordinate of the tangent to the curve $z = f(x_0, y)$. The total differential $dz = \frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy$ is a function of four variables (x, y, dx, dy) and represents geometrically the increase in the ordinate of the tangent plane, measured from its point of contact with the surface z = f(x, y), if x is increased by $\Delta x = h = dx$ and y by $\Delta y = k = dy$ (Fig.).


Example: For the function $z = f(x, y) = x^3 + 7x^2y + 3xy^5 - 5y^6$, $\frac{\partial z}{\partial x} = 3x^2 + 14xy + 3y^5$ and $\frac{\partial z}{\partial y} = 7x^2 + 15xy^4 - 30y^5$. It has the total differential

$$dz = (3x^2 + 14xy + 3y^5)\,dx + (7x^2 + 15xy^4 - 30y^5)\,dy.$$

If z = f(x, y) = 0, then the function can be regarded as an implicit form of a function of one variable. Because dz = 0, the derivative $\frac{dy}{dx}$ of the implicit function of one variable is given by $\frac{\partial f}{\partial x}\,dx + \frac{\partial f}{\partial y}\,dy = 0$, $\frac{dy}{dx} = -\frac{f_x}{f_y}$ (see p. 417).


Differentials of higher order. If the partial derivatives of a function are themselves continuous and differentiable, then again a total differential $d^2z$ of the total differential can be formed. It is called the total differential of the second order. In the differentiation the finite, arbitrarily chosen quantities dx, dy are treated as constants. One obtains

$$d^2z = \frac{\partial}{\partial x}\left(\frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy\right) dx + \frac{\partial}{\partial y}\left(\frac{\partial z}{\partial x}\,dx + \frac{\partial z}{\partial y}\,dy\right) dy$$

$$= \frac{\partial^2 z}{\partial x^2}\,dx^2 + \frac{\partial^2 z}{\partial y\,\partial x}\,dy\,dx + \frac{\partial^2 z}{\partial x\,\partial y}\,dx\,dy + \frac{\partial^2 z}{\partial y^2}\,dy^2 = \frac{\partial^2 z}{\partial x^2}\,dx^2 + 2\,\frac{\partial^2 z}{\partial x\,\partial y}\,dx\,dy + \frac{\partial^2 z}{\partial y^2}\,dy^2.$$

Since $z_{xy} = z_{yx}$ by the theorem of Schwarz, the differential assumes a form in which the coefficients and the products of dx and dy are formally given by the binomial theorem.

For the total differential of the second order, for example, one obtains $d^2z = \left(\frac{\partial}{\partial x}\,dx + \frac{\partial}{\partial y}\,dy\right)^{(2)} z$. The expression inside the brackets is multiplied by z in the sense of an operator. It can be shown that the total differentials of higher order, for example, of the nth order, are given by this formal relation.


Example: The function $z = f(x, y) = x^3 + 7x^2y + 3xy^5 - 5y^6$ has the partial derivatives $z_x = 3x^2 + 14xy + 3y^5$; $z_y = 7x^2 + 15xy^4 - 30y^5$; $z_{xx} = 6x + 14y$; $z_{xy} = 14x + 15y^4$; $z_{yy} = 60xy^3 - 150y^4$. Consequently its total differential of the second order is

$$d^2z = (6x + 14y)\,dx^2 + 2(14x + 15y^4)\,dx\,dy + (60xy^3 - 150y^4)\,dy^2.$$


Solution of equations in several variables. The equation 3x — 4y + 5 = 0 can be regarded as an 
implicit form of the equation of a function whose explicit form y = 3x/4 + 5/4 is easily obtained. 
In general, from a given equation F(x, y) = 0 it is required to find a function y = y(x) of one variable for which the equation F[x, y(x)] = 0 is satisfied identically. This solution may be possible by means of elementary functions or by the application of limiting processes, such as infinite series. For $x^2 + y^2 + 1 = 0$, for example, it is not possible. It may also happen that in the neighbourhood of different points $(x_0, y_0)$ with $F(x_0, y_0) = 0$ different solutions for y exist. For example, the equation $F(x, y) = 5x^2 + y^2 - 9 = 0$ has the solution $y = \sqrt{9 - 5x^2}$ in the neighbourhood of the point (1, 2), and the solution $y = -\sqrt{9 - 5x^2}$ in the neighbourhood of (0, -3).


The equation F(x, y) = 0 determines in a neighbourhood $U(x_0, y_0)$ of the point $(x_0, y_0)$ with $F(x_0, y_0) = 0$ exactly one continuous function y = y(x) with the properties $y_0 = y(x_0)$ and F[x, y(x)] = 0 for all $x \in U$ if the following conditions are satisfied: 1. the function F(x, y) is continuous in $U(x_0, y_0)$; 2. the partial derivatives $F_x$ and $F_y$ exist and are continuous; 3. $F_y(x_0, y_0) \neq 0$. The function y = y(x) is then also differentiable and $y' = y'(x) = -F_x/F_y$. If the given function F(x, y) has continuous partial derivatives up to the kth order, then y = y(x) is also k times continuously differentiable.


These results can be extended at once to functions of more than two variables. If $F(x_1, x_2, \ldots, x_n)$ is a continuous function with continuous partial derivatives $F_{x_i}$ in a neighbourhood of $(x_1^0, x_2^0, \ldots, x_n^0)$ and $F(x_1^0, x_2^0, \ldots, x_n^0) = 0$ with $F_{x_j}(x_1^0, x_2^0, \ldots, x_n^0) \neq 0$ for a fixed j, then there exists in a neighbourhood of this point a continuous function $x_j = f(x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n)$ with $x_j^0 = f(x_1^0, \ldots, x_{j-1}^0, x_{j+1}^0, \ldots, x_n^0)$ and $F(x_1, \ldots, x_{j-1}, f, x_{j+1}, \ldots, x_n) = 0$.


Example 1: The equation $F(x, y) = e^y - e^{-y} - 2x = 0$ can be solved for y in a neighbourhood of (0, 0), because F(x, y) is a continuous function with the continuous derivatives $F_x = -2$, $F_y = e^y + e^{-y}$ and F(0, 0) = 0, $F_y(0, 0) = 2 \neq 0$. The solution is $y = \ln(x + \sqrt{x^2 + 1}) = \sinh^{-1} x$.

Example 2: For the equation $F(x, y) = x^3 + y^3 - 3axy = 0$ of the folium of Descartes the conditions for solving for y in the neighbourhood of (0, 0) are not satisfied, because $F_y = 3y^2 - 3ax$, and hence $F_y(0, 0) = 0$. That no unique solution exists there can be gathered intuitively from the graph of the function (see Fig. 19.5-6).
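Example 1 can be followed numerically. The sketch below solves F(x, y) = 0 for y by Newton's method, which is a choice made here for illustration and is not the method of the text; the names `solve_for_y`, `tol` and `max_iter` are assumptions. The result is compared with the closed-form solution $\sinh^{-1} x$, and the slope $-F_x/F_y$ with $1/\sqrt{1 + x^2}$.

```python
import math


def F(x, y):
    """Left-hand side of e**y - e**(-y) - 2x = 0."""
    return math.exp(y) - math.exp(-y) - 2 * x


def F_y(x, y):
    """Partial derivative of F with respect to y; never zero here."""
    return math.exp(y) + math.exp(-y)


def solve_for_y(x, y_start=0.0, tol=1e-12, max_iter=50):
    """Solve F(x, y) = 0 for y by Newton iteration starting near y0 = 0."""
    y = y_start
    for _ in range(max_iter):
        step = F(x, y) / F_y(x, y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")


for x in (-0.5, 0.0, 0.5):
    y = solve_for_y(x)
    slope = 2.0 / F_y(x, y)  # y' = -F_x / F_y with F_x = -2
    print(x, y, math.asinh(x), slope, 1.0 / math.sqrt(1.0 + x * x))
```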


The following theorem gives conditions under which a system of m equations $F_i(x_1, \ldots, x_n; y_1, \ldots, y_m) = 0$, i = 1, 2, ..., m, can be solved for the m functions $y_1, y_2, \ldots, y_m$ in a neighbourhood of the point $(x_1^0, \ldots, x_n^0; y_1^0, \ldots, y_m^0)$:

If the functions $F_i(x_1, \ldots, x_n; y_1, \ldots, y_m)$ for i = 1, 2, ..., m are continuous in a neighbourhood U of the point $(x_1^0, \ldots, x_n^0; y_1^0, \ldots, y_m^0)$, vanish at that point and have continuous partial derivatives there, and if the functional determinant (or Jacobian) $\mathrm{Det}\left(\frac{\partial F_i}{\partial y_k}\right)$, i, k = 1, 2, ..., m, formed from the partial derivatives $\frac{\partial F_i}{\partial y_k}$ at the point $(x_1^0, \ldots, x_n^0; y_1^0, \ldots, y_m^0)$ is different from zero, then there exists in U exactly one system of m differentiable functions $y_i = y_i(x_1, \ldots, x_n)$ with the properties that $y_i^0 = y_i(x_1^0, \ldots, x_n^0)$ and

$$F_i[x_1, \ldots, x_n;\ y_1(x_1, \ldots, x_n), \ldots, y_m(x_1, \ldots, x_n)] = 0.$$


Example 1: The system of three equations

$$x = r \sin\vartheta \cos\varphi, \qquad y = r \sin\vartheta \sin\varphi, \qquad z = r \cos\vartheta$$

represents the connection between the Cartesian coordinates (x, y, z) of a point and its spherical polar coordinates $(r, \vartheta, \varphi)$. The Jacobian is given by

$$D = \begin{vmatrix} \sin\vartheta \cos\varphi & \sin\vartheta \sin\varphi & \cos\vartheta \\ r \cos\vartheta \cos\varphi & r \cos\vartheta \sin\varphi & -r \sin\vartheta \\ -r \sin\vartheta \sin\varphi & r \sin\vartheta \cos\varphi & 0 \end{vmatrix} = r^2 \sin\vartheta,$$

and $D \neq 0$ for all points that do not lie on the z-axis. For these points the system can be solved for r, $\vartheta$, $\varphi$, giving

$$r = \sqrt{x^2 + y^2 + z^2}; \qquad \varphi = \arccos \frac{x}{\sqrt{x^2 + y^2}} \ \ (\text{for } y \ge 0;\ \text{with the opposite sign for } y < 0); \qquad \vartheta = \arccos \frac{z}{\sqrt{x^2 + y^2 + z^2}}.$$
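The value of the Jacobian can be confirmed numerically. The following sketch builds the determinant from the partial derivatives of x, y, z with respect to r, $\vartheta$, $\varphi$ and expands it along the first row; the function names `jacobian_rows` and `det3` are assumptions made for illustration.

```python
import math


def jacobian_rows(r, theta, phi):
    """Rows hold the partial derivatives of (x, y, z) w.r.t. r, theta, phi."""
    st, ct = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return [
        [st * cp, st * sp, ct],                # derivatives with respect to r
        [r * ct * cp, r * ct * sp, -r * st],   # derivatives with respect to theta
        [-r * st * sp, r * st * cp, 0.0],      # derivatives with respect to phi
    ]


def det3(m):
    """Determinant of a 3x3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


r, theta, phi = 2.0, 0.7, 1.1
print(det3(jacobian_rows(r, theta, phi)), r**2 * math.sin(theta))
# On the z-axis (theta = 0) the determinant vanishes.
print(det3(jacobian_rows(r, 0.0, phi)))
```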


Example 2: Generalization of 1. If the functions $y_k = f_k(x_1, \ldots, x_n)$ for k = 1, 2, ..., n and their partial derivatives of the first order are continuous in a neighbourhood of the point $(x_1^0, \ldots, x_n^0)$ and the Jacobian $\mathrm{Det}\left(\frac{\partial f_k}{\partial x_i}\right) \neq 0$ at that point, then there exist continuous functions $x_k = x_k(y_1, \ldots, y_n)$ with $x_k^0 = x_k(y_1^0, \ldots, y_n^0)$ and $f_k(x_1(y_1, \ldots, y_n), x_2(y_1, \ldots, y_n), \ldots, x_n(y_1, \ldots, y_n)) = y_k$ for k = 1, 2, ..., n. Thus, under the conditions stated, a system of equations has (speaking intuitively) an 'inverse system'.


19.4. Extreme values of functions


Extreme values of functions of one variable


From the graph (Fig.) of the function $y = f(x) = (1/6)(x^3 - 3x^2 - 9x + 17)$ one recognizes that in the range between $-\infty$ and -1 the ordinate value increases as the abscissa increases. On the other hand, from x = -1 to x = +3 the ordinates decrease steadily, and for x > 3 they again increase steadily. In a suitably chosen neighbourhood of the point $x_{max} = -1$, the value f(x) of the function is less than $f(x_{max})$ for all values of the abscissa x different from $x_{max}$. The function is said to have a local maximum at the point $x_{max}$. At the point $x_{min} = +3$, the function is said to have a local minimum, because the value f(x) of the function is greater than $f(x_{min})$ for all values of x different from $x_{min}$ in a suitably chosen neighbourhood of this point. Both values, maximum and minimum, are called local extreme values (or extrema); local, because there are places at which the function assumes greater values than f(-1) = +11/3 and smaller values than f(+3) = -5/3. In the closed interval $-2 \le x \le 4$, however, they are the absolute or global extreme values.

19.4-1 Extrema and points of inflection of the graph of the function $y = f(x) = (1/6)(x^3 - 3x^2 - 9x + 17)$


By the theorem of Weierstrass, a function that is continuous in a closed interval assumes its supremum and its infimum in that interval. These absolute extrema, however, can occur at the end points of the interval, for example, for the function considered above in the interval $-5 \le x \le 10$.
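The difference between the two intervals can be made concrete by sampling the function. This is a rough sketch under stated assumptions: the helper `extrema_on_interval` and its grid size are hypothetical, and a grid search only approximates the positions of the extrema.

```python
def f(x):
    """The cubic considered above."""
    return (x**3 - 3 * x**2 - 9 * x + 17) / 6


def extrema_on_interval(func, a, b, steps=6000):
    """Return ((x_min, f_min), (x_max, f_max)) found on an equally spaced grid."""
    xs = [a + (b - a) * k / steps for k in range(steps + 1)]
    x_min = min(xs, key=func)
    x_max = max(xs, key=func)
    return (x_min, func(x_min)), (x_max, func(x_max))


# On [-2, 4] the absolute extrema coincide with the local ones at x = 3 and x = -1.
print(extrema_on_interval(f, -2.0, 4.0))
# On [-5, 10] both absolute extrema sit at the end points of the interval.
print(extrema_on_interval(f, -5.0, 10.0))
```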




Conditions for the occurrence of local extrema. As the argument x of a differentiable function f(x) passes through the point $x_m$ of a local extremum, the sign of its derivative f'(x) changes. If $x_m = x_{max}$ is the position of a local maximum, then f'(x) > 0 for $x < x_{max}$, because for these arguments the function f(x) is increasing, and f'(x) < 0 for $x_{max} < x$, because f(x) is decreasing. If the derivative is continuous, then $f'(x_{max}) = 0$ must hold. Similarly for the position $x_{min}$ of a local minimum, f'(x) < 0 for $x < x_{min}$ and f'(x) > 0 for $x_{min} < x$, and consequently for a continuous derivative $f'(x_{min}) = 0$.

It follows that a necessary condition for a local extremum of a differentiable function is that the derivative f'(x) vanishes. Only the sign change of the first derivative f'(x) discussed above, expressed analytically by $f''(x_{max}) < 0$ or $f''(x_{min}) > 0$, guarantees the existence of a local maximum or minimum. Thus, it is sufficient for their existence that $f'(x_{max}) = 0$ and $f''(x_{max}) < 0$, or $f'(x_{min}) = 0$ and $f''(x_{min}) > 0$. In the case when $f''(x_m) = 0$, the higher derivatives of the function f(x) can be taken into consideration.


Extrema at zeros of the second and higher derivatives. If f'''(x) is the first non-vanishing derivative for $x = x_m$, then it follows from the discussion above that the function f'(x) has a minimum at this point if $f'''(x_m) > 0$, or a maximum if $f'''(x_m) < 0$. In both cases, since $f'(x_m) = 0$, the curve of f'(x) touches the x-axis at the point $x = x_m$. For the graph of the function f(x) this means: when the argument x passes through the point $x_m$ in the sense of x increasing, then if $f'''(x_m) > 0$, the slope $\tan\varphi$ of the tangent decreases from positive values to 0 for $x = x_m$, and then increases again, and if $f'''(x_m) < 0$, $\tan\varphi$ increases from negative values to 0 for $x = x_m$ and then decreases again (Fig.). Such points are called horizontal points of inflection.


A function f(x) that is at least n times differentiable ($n \ge 2$) at the point $\xi$ has a local extremum at that point if n is even and $f'(\xi) = f''(\xi) = \cdots = f^{(n-1)}(\xi) = 0$, but $f^{(n)}(\xi) \neq 0$; if $f^{(n)}(\xi) < 0$, a local maximum occurs, and if $f^{(n)}(\xi) > 0$, a local minimum.


If one expands f(x) about the point $\xi$ using Taylor's theorem, then since $f'(\xi) = f''(\xi) = \cdots = f^{(n-2)}(\xi) = 0$,

$$f(x) - f(\xi) = \frac{h^{n-1}}{(n-1)!} \cdot f^{(n-1)}(\xi + \vartheta h) \quad \text{with } 0 < \vartheta < 1,$$

where $h = x - \xi$. Because $f^{(n)}(x)$ is the derivative of $f^{(n-1)}(x)$, for $f^{(n)}(\xi) < 0$, $f^{(n-1)}(x)$ decreases monotonically through the value $f^{(n-1)}(\xi) = 0$, but for $f^{(n)}(\xi) > 0$ it increases monotonically. The sign of h, on the other hand, is always negative to the left of $\xi$ and positive to the right; the same holds for $h^{n-1}$, since n - 1 is odd. Consequently, as the following table shows, the remainder term, and hence the difference $f(x) - f(\xi)$, is negative on both sides of the point $x = \xi$ for $f^{(n)}(\xi) < 0$, so that $f(\xi) > f(x)$, and the function has a local maximum, whilst $f(\xi) < f(x)$ on both sides of $\xi$ if $f^{(n)}(\xi) > 0$, and the function has a local minimum.

                                                          $x < \xi$     $x > \xi$
$h^{n-1}$                                                     -             +
$f^{(n)}(\xi) < 0$:  $f^{(n-1)}(\xi + \vartheta h)$           +             -
                     $f(x) - f(\xi)$                          -             -
$f^{(n)}(\xi) > 0$:  $f^{(n-1)}(\xi + \vartheta h)$           -             +
                     $f(x) - f(\xi)$                          +             +

19.4-2 Schematic representation for $f'(x) = 0$; $f''(x) = 0$; $f'''(x) \neq 0$


Points of inflection. The tangent to the curve of the


function $y = (1/6)(x^3 - 3x^2 - 9x + 17)$ (see Fig. 19.4-1) at the point (-3, -5/3) has the slope f'(-3) = +6. The slope decreases to the value f'(-1) = 0 at the maximum (-1, +11/3), and on passing through this point decreases still further as far as the point $x_w = +1$, where it has the value f'(1) = -2. From there on the slope increases monotonically. The function f'(x) has a local minimum at $x_w = 1$. In the interval $-\infty < x < +1$ the direction of a tangent moves into the direction of a following tangent, in the sense of increasing values of x, by a rotation to the right, in the mathematically negative sense. In the interval $+1 < x < +\infty$ there is a corresponding rotation to the left, in the mathematically positive sense. At the point of inflection the sense of rotation changes from right to left.

If one imagines driving a vehicle along the curve, then the road lies to the right of the tangent as far as the point of inflection, but after one has passed through this point it lies to the left of the tangent. The curvature of the curve before the point of inflection is opposite in sign to that after it. If a portion of a curve through three points situated sufficiently close to one another is replaced by a circular arc, then the centre of this circle of curvature lies to the right of the curve before the point of inflection and to the left of it afterwards. The tangent at the point of inflection or inflectional tangent thus separates portions of the curve whose curvatures are in opposite senses.

From these considerations it follows that the function has a point of inflection where its first derivative assumes an extreme value. If in addition $f'(x_w) = 0$, then the inflectional tangent is horizontal; one speaks of a horizontal point of inflection (Fig.).

The criteria for extrema can be applied to the first derivative f'(x), regarded as a function $\varphi(x)$. Consequently a sufficient condition for a local maximum or minimum of f'(x) at the point $x_w$ is $\varphi'(x_w) = f''(x_w) = 0$ and $\varphi''(x_w) = f'''(x_w) < 0$ or $f'''(x_w) > 0$. If $f'''(x_w) = 0$, then the last theorem above holds, since a derivative of odd order of f(x) is one of even order of $\varphi(x)$.


19.4-3 Graph of a function with a horizontal point of inflection (inflectional tangent)


19.4-4 Points of inflection $W_1$, $W_2$ and inflectional tangents $t_1$, $t_2$ of the graph of the function $f(x) = (x^4 - 2x^3 - 12x^2 + 8x + 20)/10$


The function y = f(x) has a point of inflection at a point $\xi$ at which $f''(\xi) = 0$ if the first non-vanishing derivative $f^{(n)}(\xi)$ (n > 2) is of odd order.


Example 1: The function $y = f(x) = 0.1(x^4 - 2x^3 - 12x^2 + 8x + 20)$ has the derivatives $y' = 0.1(4x^3 - 6x^2 - 24x + 8)$, $y'' = 1.2x^2 - 1.2x - 2.4$ and $y''' = 2.4x - 1.2$. From y'' = 0, that is, $x^2 - x - 2 = 0$, one obtains $x_1 = -1$ and $x_2 = +2$. Because $f'''(x_1) = -3.6 \neq 0$ and $f'''(x_2) = +3.6 \neq 0$, $x_1 = -1$ and $x_2 = +2$ are abscissae of points of inflection (Fig.). The corresponding ordinates are $f(x_1) = +0.3$ and $f(x_2) = -1.2$.

The inflectional tangents $t_1$ and $t_2$ at the points of inflection $W_1(-1, +0.3)$ and $W_2(+2, -1.2)$ have the slopes $f'(x_1) = +2.2$ and $f'(x_2) = -3.2$ and hence have the equations (y - 0.3) = 2.2(x + 1), or y = 2.2x + 2.5 for $t_1$, and (y + 1.2) = -3.2(x - 2), or y = -3.2x + 5.2 for $t_2$.

Example 2: The function $y = f(x) = (x^2 - 4)/x$ has no point of inflection, because f''(x) = 0 is a necessary condition for a point of inflection, but $y'' = -8/x^3$ cannot have the value zero for any finite value of x.
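The points of inflection and inflectional tangents of Example 1 can be verified with a short sketch. The derivative functions are typed in from the example; the helper name `inflectional_tangent` is an assumption made for illustration.

```python
def f(x):
    return 0.1 * (x**4 - 2 * x**3 - 12 * x**2 + 8 * x + 20)


def f1(x):
    """First derivative y'."""
    return 0.1 * (4 * x**3 - 6 * x**2 - 24 * x + 8)


def f2(x):
    """Second derivative y''."""
    return 1.2 * x**2 - 1.2 * x - 2.4


def f3(x):
    """Third derivative y'''."""
    return 2.4 * x - 1.2


def inflectional_tangent(x0):
    """Slope m and intercept c of the tangent y = m*x + c at x0."""
    m = f1(x0)
    return m, f(x0) - m * x0


for x0 in (-1.0, 2.0):
    # y'' vanishes and y''' does not, so x0 is a point of inflection.
    print(x0, f2(x0), f3(x0), f(x0), inflectional_tangent(x0))
```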


Applications. If one succeeds in expressing a variable f as a continuous and differentiable function of a variable x, then one can calculate for which value of x the variable f has an extreme value. From the given conditions it can be determined whether this value is a local maximum or minimum.

In applications, however, it is usually required to find absolute extrema. If the function f(x) is continuous in the closed interval $a \le x \le b$ and differentiable in the open interval a < x < b, then its absolute minimum (or maximum) is either the smallest local minimum (or the greatest local maximum) or one of the boundary values f(a) or f(b).

19.4-5 Box of greatest volume made out of a square cardboard


Example 1: A box is constructed from a square of cardboard of side a by cutting away four squares from its corners and folding the resulting rectangles (Fig.). The four shaded equal squares can be used to stick the carton together. How big must these squares be for the volume V of the carton to be as large as possible?

By the formula for the volume of a rectangular box V is given by

$$V = y = f(x) = x(a - 2x)^2 = 4x^3 - 4ax^2 + a^2x.$$

This function can have extreme values only for $y' = 12x^2 - 8ax + a^2 = 0$, that is, $x^2 - 2ax/3 + a^2/12 = 0$; thus, $x_1 = a/6$, because for $x_2 = a/2$ the cardboard would fall apart. From $y'' = 24x - 8a$ it follows that $f''(x_1) = -4a < 0$, and thus $x_1 = a/6$ is the value of the abscissa of a local maximum. This is also the absolute maximum; the cut must be 1/6 of the length of side a.
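The result $x_1 = a/6$ can be cross-checked by a direct search over the admissible cuts 0 < x < a/2. This is a sketch only: the function name `best_cut`, the grid size and the sample side length a = 6 are assumptions, and a grid search merely approximates the maximizer.

```python
def box_volume(x, a):
    """Volume of the box with corner cut x from a square of side a."""
    return x * (a - 2 * x) ** 2


def best_cut(a, steps=6000):
    """Cut length on an equally spaced grid over [0, a/2] with largest volume."""
    xs = [0.5 * a * k / steps for k in range(steps + 1)]
    return max(xs, key=lambda x: box_volume(x, a))


a = 6.0
x_best = best_cut(a)
# Expected from the derivation: x = a/6, with volume 2 a**3 / 27.
print(x_best, a / 6, box_volume(x_best, a), 2 * a**3 / 27)
```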

Example 2: What must be the dimensions of a cylindrical preserve tin so that for a given content V the smallest possible amount of sheet metal is required for its construction? A right circular cylinder is determined by the radius r of its circular base and its height h. Its surface area S must be expressed as a function of one variable. The second variable in $S = 2\pi r^2 + 2\pi rh$ can be eliminated using the given additional condition $V = \pi r^2 h$. With $h = V/(\pi r^2)$ one obtains:

$$S = y = f(r) = 2\pi r^2 + 2V \cdot (1/r), \qquad y' = 4\pi r - 2V/r^2, \qquad y'' = 4\pi + 4V/r^3.$$

A local extremum can occur only for $4\pi r_1 = 2V/r_1^2$ or $r_1 = (V/2\pi)^{1/3}$.

Because $f''(r_1) = 12\pi > 0$, the surface area has a local minimum at the point $r_1$; this is, moreover, the absolute minimum. The height $h_1$ of the cylinder is given by $h_1 = V/(\pi r_1^2) = 2r_1$.

If one substitutes for V a given value, say V = 50 cubic inches, one obtains $r_1 \approx 2''$ and $h_1 \approx 4''$. Of all cylindrical tins with the same volume, the one whose surface area is the smallest is that whose diameter $2r_1$ is equal to its height $h_1$.
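The numbers for V = 50 cubic inches follow directly from the formulas of the example. In the sketch below the function names `optimal_tin` and `surface` are assumptions made for illustration.

```python
import math


def optimal_tin(volume):
    """Radius and height of the tin of least surface area for a given volume."""
    r = (volume / (2 * math.pi)) ** (1 / 3)
    h = volume / (math.pi * r**2)
    return r, h


def surface(r, volume):
    """Surface area S = 2 pi r^2 + 2 V / r after eliminating the height."""
    return 2 * math.pi * r**2 + 2 * volume / r


r1, h1 = optimal_tin(50.0)
# The height equals the diameter, and nearby radii need more sheet metal.
print(r1, h1, surface(r1, 50.0), surface(0.9 * r1, 50.0), surface(1.1 * r1, 50.0))
```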

Example 3: A beam of rectangular cross section is to be cut from a log, whose cross section may be assumed to be circular with diameter d (Fig.). For what measurements does its load carrying capacity T reach a maximum, if T is proportional to the breadth b and the square of the height h; $T = cbh^2$ (c = const)? The theorem of Pythagoras yields the additional condition $h^2 = d^2 - b^2$, and hence one obtains T as a function of one variable $T = f(b) = cd^2b - cb^3$. Thus, $f'(b) = cd^2 - 3cb^2$, $f''(b) = -6cb$. A local extremum can occur only for $f'(b_1) = 0 = cd^2 - 3cb_1^2$, that is, $b_1 = (d/3)\sqrt{3}$. Because $f''(b_1) = -2cd\sqrt{3} < 0$, this extremum is a maximum of the load carrying capacity. Finally from $h^2 = d^2 - b^2$, $h_1 = (d/3)\sqrt{6}$. The ratio $h/b = \sqrt{2}/1$ is independent of the diameter of the log.


19.4-6 Section of a beam cut from a log


If the sides of a rectangle are denoted by a and b, its perimeter by P and its area by A, then the following table shows the validity of two theorems:

Of all rectangles with given perimeter the square has the greatest area.
Of all rectangles with given area the square has the smallest perimeter.

                   Given perimeter P                     Given area A
Given              $P = 2(a + b)$, $b = P/2 - a$         $A = ab$, $b = A/a$
Required           $A = ab = f(a)$                       $P = 2(a + b) = f(a)$
1st derivative     $f'(a_1) = P/2 - 2a_1 = 0$            $f'(a_1) = 2 - 2A/a_1^2 = 0$
2nd derivative     $f''(a_1) = -2 < 0$, maximum          $f''(a_1) = +4/\sqrt{A} > 0$, minimum
Solution           $a_1 = b_1 = P/4$, square             $a_1 = b_1 = \sqrt{A}$, square
Example 4: A sector is to be cut out of a circular piece of sheet metal of radius R, and the remainder bent together to form a conical funnel (Fig.). For what angle $\varepsilon$ at the centre does the funnel have the greatest capacity? The formula $V = (\pi/3)\,r^2h$ for the volume of a cone, together with the additional condition $r^2 = R^2 - h^2$, gives the equation $V = f(h) = (\pi/3)(R^2h - h^3)$. For extrema, $f'(h_1) = \pi(R^2 - 3h_1^2)/3 = 0$; $h_1 = (R/3)\sqrt{3}$, $f''(h) = -2\pi h$, $f''(h_1) = -(2/3)\,\pi R\sqrt{3} < 0$. Thus, $h_1$ gives a maximum. From the additional condition, $r_1 = (R/3)\sqrt{6}$. In bending the sheet the circular arc of length $b = \varepsilon R$ becomes the circumference $2\pi r$ of the circular base. From $R\varepsilon = 2\pi r_1$ it follows that $\varepsilon = (2/3)\,\pi\sqrt{6}$, or $\varepsilon \approx 294°$.

19.4-7 Funnel made from a circular sector


Example 5: Of all cylinders that can be inscribed in a right circular cone of radius $R$ and height
$H$, the one with the greatest volume is required (Fig.).

The volume $V$ of the cylinder is given by $V = \pi r^2 h$. The additional condition comes in this
case from the intercept theorem: $h/(R - r) = H/R$, $h = (H/R)(R - r)$. Hence $V = f(r)
= \pi(H/R)(Rr^2 - r^3)$; $f'(r_1) = \pi(H/R)(2Rr_1 - 3r_1^2) = 0$ gives $r_1 = 2R/3$, because the solution
$r_2 = 0$ corresponds to the volume $V = 0$. Because $f''(r_1) = \pi(H/R)(2R - 6r_1) = -2\pi H < 0$,
the volume is a maximum for $r = r_1$.


19.4-8 Cylinder inscribed in a right circular cone
19.4-9 Snell's law of refraction


The following physical problem leads to Snell's law of refraction. A plane $E_1$ is the common
boundary of two media MI and MII, in which the velocities of propagation of a body or of a
process are different, $v_1$ in MI and $v_2$ in MII. Under what conditions is the time required for the
motion from the point $A_1$ in MI to $A_2$ in MII the smallest possible (Fig.)?

It is clear that this motion takes place in a plane $E_2$ passing through $A_1$ and $A_2$ and perpendicular
to $E_1$. If $L_1$ and $L_2$, respectively, are the feet of the perpendiculars $|A_1L_1| = a_1$ and $|A_2L_2| = a_2$
from $A_1$ and $A_2$ to the line of intersection of the two planes $E_1$ and $E_2$, and if $|L_1L_2| = b$, then the
position of the points $A_1$ and $A_2$ is fixed. If the path of the motion cuts the boundary line at $P$,
where $|L_1P| = x$, then the length of the path $s_1$ from $A_1$ to $P$ is given by $s_1 = \sqrt{a_1^2 + x^2}$, and
$|PA_2| = s_2 = \sqrt{a_2^2 + (b - x)^2}$. The time $t$ required to describe the whole path $A_1PA_2$ is the sum
of the individual times $t_1 = s_1/v_1$ and $t_2 = s_2/v_2$. This gives the function $t(x)$, and the condition
for the extremum is obtained from this:


$t = t(x) = t_1 + t_2 = (1/v_1)\sqrt{a_1^2 + x^2} + (1/v_2)\sqrt{a_2^2 + (b - x)^2}$,
$t'(x) = (1/v_1)\,x/\sqrt{a_1^2 + x^2} - (1/v_2)\,(b - x)/\sqrt{a_2^2 + (b - x)^2} = 0 = x/(v_1 s_1) - (b - x)/(v_2 s_2)$.


This means geometrically that $\sin\varepsilon_1/\sin\varepsilon_2 = v_1/v_2$, because $x/s_1 = \sin\varepsilon_1$ and $(b - x)/s_2 = \sin\varepsilon_2$.
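
The condition for the extremum can be checked numerically. The following Python sketch is only an illustration: the numerical values chosen for the two perpendiculars, the distance between their feet and the two velocities are assumptions, as are the function names. It locates the zero of t'(x) by bisection and compares the ratio of the sines with the ratio of the velocities.

```python
import math

def travel_time(x, a1, a2, b, v1, v2):
    """Total time t(x) along the broken path A1 -> P -> A2."""
    return math.hypot(a1, x) / v1 + math.hypot(a2, b - x) / v2

def time_derivative(x, a1, a2, b, v1, v2):
    """t'(x) = x/(v1*s1) - (b - x)/(v2*s2)."""
    s1 = math.hypot(a1, x)
    s2 = math.hypot(a2, b - x)
    return x / (v1 * s1) - (b - x) / (v2 * s2)

def fastest_crossing(a1, a2, b, v1, v2, tol=1e-12):
    """Bisection for the zero of t'(x) on [0, b]; t'(0) < 0 and t'(b) > 0."""
    lo, hi = 0.0, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if time_derivative(mid, a1, a2, b, v1, v2) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative values (assumed, not taken from the text)
a1, a2, b, v1, v2 = 3.0, 2.0, 5.0, 1.5, 1.0
x_min = fastest_crossing(a1, a2, b, v1, v2)
sin_e1 = x_min / math.hypot(a1, x_min)
sin_e2 = (b - x_min) / math.hypot(a2, b - x_min)
ratio_of_sines = sin_e1 / sin_e2
print(x_min, ratio_of_sines, v1 / v2)
```

Bisection is used here only because t'(x) changes sign exactly once on the interval; any other root-finding method would serve equally well.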


Example 6: What maximum speed may an express train have if braking produces a uniform
retardation of $b = 2.2$ ft/s$^2$, and the braking distance may not exceed $s_1 = 3000$ ft?

Substituting the braking distance $s_1$ into the distance-time equation $s = vt - (b/2)t^2$ of uniformly
retarded motion gives the braking time $t_1$ as a function of the maximum speed $v$. The speed
$s' = v - bt$ must be zero after the braking time $t_1$. In both formulae $v$ denotes the speed when
the brakes are first applied, and hence the maximum speed. Elimination of $t_1$ from the equations
$3000 = vt_1 - 1.1t_1^2$ and $0 = v - 2.2t_1$ gives $v = \sqrt{2bs_1} = 10\sqrt{132} \approx 115$ ft/s. Consequently
the speed of the express train may not exceed $v \approx 78$ miles per hour.


Example 7: A water main is to be laid from a water tower $W$ to the main buildings $H$ (Fig.).
In addition, some buildings $S$ at some distance from the main are to be supplied with water by
means of a subsidiary main. $S$ is at a distance of 1 mile from the principal main and the foot of
the perpendicular from $S$ to the main is at a distance of 2 miles from the buildings $H$. The
distance from $H$ to the water tower is 6 miles. The cost of one
mile of water main is estimated as follows: principal main 30 units, reduced load main 22 units,
subsidiary main 12 units. All mains are laid in straight lines. At what distance from the water
tower must the subsidiary main branch off from the principal main, in order that the cost of laying
them shall be as small as possible?

19.4-10 Sketch for the laying of a water main (distances marked: $x$ miles, $(4 - x)$ miles, 2 miles; total 6 miles)

Introducing the variable $x$ for the distance between the water tower $W$ and the branch point $A$,
the length of the subsidiary main is given by $|AS| = \sqrt{1 + (4 - x)^2}$. The total cost $C$ is then
made up as follows: $C = 30x + 22(6 - x) + 12\sqrt{1 + (4 - x)^2}$. An extreme value for the
function $C = f(x) = 132 + 8x + 12\sqrt{17 - 8x + x^2}$ must now be calculated. $f'(x) = 8
+ (12x - 48)/\sqrt{17 - 8x + x^2}$; hence the necessary condition $f'(x) = 0$ leads to the quadratic
equation $x^2 - 8x + 76/5 = 0$ with the solutions $x_{1,2} = 4 \pm (2/5)\sqrt{5}$. The value $x_1 = 4 + (2/5)\sqrt{5}$
does not satisfy the equation in the form containing the square root (check). The second derivative
$f''(x) = 12/(17 - 8x + x^2)^{3/2}$ is greater than zero for $x_2$ and hence indicates a local minimum
that is at the same time an absolute minimum. The subsidiary main must therefore branch off from the
principal main at a distance of 3.11 miles from the water tower.
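
The check recommended for the two roots of the quadratic equation can be carried out in a few lines. The following Python sketch (function and variable names are illustrative assumptions) evaluates the derivative in its original form, containing the square root, at both roots; only the root at which it vanishes is a genuine stationary point.

```python
import math

def cost(x):
    """Total cost C(x) for a branch point x miles from the water tower."""
    return 30 * x + 22 * (6 - x) + 12 * math.sqrt(1 + (4 - x) ** 2)

def cost_derivative(x):
    """f'(x) = 8 + (12x - 48)/sqrt(17 - 8x + x^2)."""
    return 8 + (12 * x - 48) / math.sqrt(17 - 8 * x + x * x)

x1 = 4 + 2 * math.sqrt(5) / 5   # root introduced by squaring
x2 = 4 - 2 * math.sqrt(5) / 5   # candidate for the minimum

print(round(x2, 2), cost_derivative(x1), cost_derivative(x2))
```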


Extreme values of functions of several variables 


The $k$ variables $\xi_1, \xi_2, \ldots, \xi_i, \ldots, \xi_k$ of the function $y = f(\xi_1, \ldots, \xi_k)$ can be regarded as an ordered
$k$-tuple $x = (\xi_1, \xi_2, \ldots, \xi_k)$ of real numbers in a $k$-dimensional Euclidean space (see Chapter 40.).
The element $x$ lies in a neighbourhood of the element $x_m = (\xi_1^{(m)}, \ldots, \xi_k^{(m)})$ if positive numbers
$h_i$ can be found such that each of the variables $\xi_i$ ($i = 1, 2, \ldots, k$) lies in an interval
$\xi_i^{(m)} - h_i < \xi_i < \xi_i^{(m)} + h_i$. Each $k$-tuple $x$ corresponds uniquely to a function value $f(x) = y$,
and a local maximum of the function $f(x) = y$ at the point $x_m$ can occur only if the ordinate values
$f(x)$ for all elements different from $x_m$ in a neighbourhood of $x_m$ are less than $f(x_m)$. Similarly
$f(x) > f(x_m)$ must hold in a neighbourhood of a local minimum.


Necessary condition for the occurrence of local extrema. For functions of two variables one puts
$\xi_1 = x$, $\xi_2 = y$ and $f(x, y) = z$. The graph of the function is a surface in three-dimensional space,
and the conditions for a local extremum have an intuitive meaning: $f(x) < f(x_m)$ means that in a
neighbourhood of the maximum $P_{max} = (x_{max}, y_{max}, z_{max})$ all other points of the surface lie below
a horizontal plane through this point. Similarly $f(x) > f(x_m)$ means that at the point $P_{min} =
(x_{min}, y_{min}, z_{min})$ all points of the surface in a neighbourhood lie above a horizontal plane. These
planes are tangent planes and are spanned by the two tangents determined by
$z_x = \partial f(x, y)/\partial x$ and $z_y = \partial f(x, y)/\partial y$ (see 19.3. Partial derivatives of a function),
which are parallel to the $x, y$-plane only if $z_x = 0$ and $z_y = 0$.


This condition is necessary; a saddle point (Fig.) shows, however, that it is not sufficient. Although 
both tangents are horizontal there, no matter how small a neighbourhood of the saddle point is 
chosen, two points of the surface can always 
be found within it and on opposite sides of 
the tangent plane at the point. 

The necessary condition already found 
can be generalized to differentiable functions 
of k variables. 


A function $y = f(\xi_1, \xi_2, \ldots, \xi_k) = f(x)$
can have a local extremum at the point
$x = x_m$ only if each partial derivative of
the first order vanishes at $x_m$.


19.4-11 Saddle point at the point S

Sufficient condition for the occurrence of a local extremum. If one expands the
function $z = f(x, y)$ in the neighbourhood $x_m - h_1 < x < x_m + h_1$, $y_m - h_2 < y < y_m + h_2$
of a local extreme value at the point $(x_m, y_m)$ by Taylor's theorem and
breaks off the expansion after the term $n = 1$,
then because $f_x = 0$ and $f_y = 0$, one obtains:

$2!\,\Delta = 2!\,[f(x_m + h_1, y_m + h_2) - f(x_m, y_m)]
= h_1^2 f_{xx}(x_m + \vartheta_1 h_1, y_m + \vartheta_2 h_2) + 2h_1h_2 f_{xy}(x_m + \vartheta_1 h_1, y_m + \vartheta_2 h_2) + h_2^2 f_{yy}(x_m + \vartheta_1 h_1, y_m + \vartheta_2 h_2)$,
$0 < \vartheta_1, \vartheta_2 < 1$.

If the second derivatives $f_{xx}$, $f_{xy}$ and $f_{yy}$ are continuous functions, then they have
the same sign at the point $(x_m + h_1, y_m + h_2)$ as at $(x_m, y_m)$, provided that $h_1$ and $h_2$ are chosen
sufficiently small. In particular, if $f_{xx} \neq 0$, then the difference $\Delta$ can be expressed in the form:


$2!\,\Delta = h_1^2 f_{xx} + 2h_1h_2 f_{xy} + h_2^2 f_{yy} = (1/f_{xx})\,[(h_1 f_{xx} + h_2 f_{xy})^2 + h_2^2 (f_{xx} f_{yy} - f_{xy}^2)]$.


In the square bracket, besides the square there occurs only the expression $(f_{xx} f_{yy} - f_{xy}^2)$. If it is
positive, then the square bracket is positive; $\Delta$ is different from zero and has the same sign as $f_{xx}$.
Thus, for $f_{xx} < 0$, the ordinate difference $\Delta$ is always negative in a neighbourhood of the point
$(x_m, y_m)$ and the function has a local maximum; for $f_{xx} > 0$ however, it has a local minimum. From
$(f_{xx} f_{yy} - f_{xy}^2) > 0$ it also follows that $f_{yy}$ has the same sign as $f_{xx}$ in the neighbourhood considered.


$f(x, y)$ has a local extremum at $(x_m, y_m)$ if $f_x = 0$, $f_y = 0$ and $(f_{xx} f_{yy} - f_{xy}^2) > 0$ at this point;
for $f_{xx} < 0$ it is a maximum, and for $f_{xx} > 0$ a minimum.


It can be shown that no extremum can occur for $(f_{xx} f_{yy} - f_{xy}^2) < 0$; for example, $f_{xx}$ and $f_{yy}$
have different signs at a saddle point. However, when $(f_{xx} f_{yy} - f_{xy}^2) = 0$, no conclusion can be
reached as to whether an extremum occurs or not.
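
The three cases of this criterion can be summarized in a short routine. The following Python sketch is an illustration only; the function name and the four sample functions are assumptions. It expects the values of the second partial derivatives at a point where both first partial derivatives already vanish.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Second-derivative test at a point where f_x = f_y = 0."""
    d = fxx * fyy - fxy ** 2
    if d > 0:
        # d > 0 forces fxx != 0, so the sign of fxx decides
        return "maximum" if fxx < 0 else "minimum"
    if d < 0:
        return "no extremum"
    return "undecided"

# Illustrative functions (assumed), each with a critical point at (0, 0):
#   z = x^2 + y^2     -> fxx =  2, fxy = 0, fyy =  2
#   z = -(x^2 + y^2)  -> fxx = -2, fxy = 0, fyy = -2
#   z = x^2 - y^2     -> fxx =  2, fxy = 0, fyy = -2   (saddle point)
#   z = x^4 + y^4     -> all second derivatives vanish at (0, 0)
results = [
    classify_critical_point(2, 0, 2),
    classify_critical_point(-2, 0, -2),
    classify_critical_point(2, 0, -2),
    classify_critical_point(0, 0, 0),
]
print(results)
```

The last sample shows why the case of a vanishing expression is left open: the function x^4 + y^4 does have a minimum at the origin, but the criterion cannot detect it.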


When the first and second partial derivatives of a function $y = f(x)$ are denoted by
$p_i = \partial y/\partial \xi_i$ ($i = 1, 2, \ldots, k$) and $p_{ij} = \partial^2 y/(\partial \xi_i\,\partial \xi_j)$ ($i, j = 1, 2, \ldots, k$), then the function
$y = f(x)$ has a local extremum at the point $x_m$ if all minors of even order
of the determinant displayed here are positive and the signs of the minors of
odd order agree with the sign of $p_{11}$; for $p_{11} < 0$ the local extremum is a
maximum, and for $p_{11} > 0$ a minimum.

$\begin{vmatrix} p_{11} & p_{12} & \cdots & p_{1k} \\ p_{21} & p_{22} & \cdots & p_{2k} \\ \vdots & \vdots & & \vdots \\ p_{k1} & p_{k2} & \cdots & p_{kk} \end{vmatrix}$


Maxima and minima with side conditions. In many problems of applications extrema of functions 
of several variables are to be determined in which the variables are not independent of one another, 
but are connected by side conditions. Such problems can often be solved by reducing the number 
of variables by elimination with the help of the side conditions. Some extreme value problems 
were solved above by this procedure. This means, however, that preference, not always justifiable, 
is given to one variable over the others. A treatment giving equal weight to all variables, the Lagrange 
method of undetermined multipliers, will be derived intuitively in the following for functions of two 
variables. 

The two variables $x$ and $y$ of a function $z = f(x, y)$ are connected by means of a side
condition $\varphi(x, y) = 0$. The function $z = f(x, y)$ represents a surface in space, and the
equation giving the side condition $\varphi(x, y) = 0$ defines a curve $K'$
in the $x, y$-plane (Fig.). In calculating extrema
the only values $x, y$ of the function $z = f(x, y)$
that are of interest are those that satisfy the side
condition, that is, the points of that curve $K$ on
the surface $z = f(x, y)$ whose projection on the
$x, y$-plane is precisely the curve $K'$. Thus, the
problem of determining the local extrema of the
function $z = f(x, y)$, taking account of the side
condition $\varphi(x, y) = 0$, means finding the local
extrema of the space curve $K$. For this purpose
one considers the family of curves $c = f(x, y)$
with $c$ = const. One of these contour lines $H$ will
touch the space curve $K$ at a point $E$; $E$ is an
extremum. The projection $H'$ of this contour line
$H$ on the $x, y$-plane touches the curve $K'$ at the
point $E'$. The functions defined by $\varphi(x, y) = 0$
and $f(x, y) - c = 0$ must therefore have the same
derivative at the point $E'$. Implicit differentiation gives $f_x/f_y = \varphi_x/\varphi_y$. From this it follows on the
one hand, that $f_x$ and $\varphi_x$, and on the other hand, that $f_y$ and $\varphi_y$ are proportional. Introducing the
constant of proportionality $(-\lambda)$, the Lagrange multiplier, one obtains the two equations
$f_x = -\lambda\varphi_x$ and $f_y = -\lambda\varphi_y$, or $f_x + \lambda\varphi_x = 0$ and $f_y + \lambda\varphi_y = 0$. But the left-hand sides of these
equations represent the partial derivatives of the function $F(x, y) = f(x, y) + \lambda\varphi(x, y)$. All this
leads to the method of Lagrangian multipliers:


19.4-12 Local extremum with side conditions 




To determine the local extrema of a function $z = f(x, y)$, subject to the side condition $\varphi(x, y) = 0$,
an auxiliary function $F(x, y) = f(x, y) + \lambda\varphi(x, y)$ with the undetermined multiplier $\lambda$ is formed,
and the first partial derivatives of this function are found. From the system of equations
$F_x = f_x + \lambda\varphi_x = 0$, $F_y = f_y + \lambda\varphi_y = 0$, $\varphi(x, y) = 0$ the coordinates of possible extrema and the
multiplier $\lambda$ are calculated.


This rule simply gives an elegant way of calculating those points at which local extrema can occur. 
The investigation to determine whether an extremum really occurs, and of what nature, is in general 
complicated. In a concrete example it is often clear from the formulation of the problem whether 
a maximum or a minimum is to be expected with certainty. 


Example: Among all right-angled triangles with given hypotenuse $c$ it is required to find the
one with the greatest area. Denoting the sides containing the right angle by $x$ and $y$, $A = f(x, y)
= xy/2$ is to be a maximum. Since the triangle is right-angled, the side condition $x^2 + y^2 = c^2$
or $\varphi(x, y) = x^2 + y^2 - c^2 = 0$ holds. Thus, one forms the auxiliary function $F(x, y)
= xy/2 + \lambda(x^2 + y^2 - c^2)$. From the system of equations


$\partial F/\partial x = y/2 + 2\lambda x = 0 \;\rightarrow\; \lambda = -y/(4x)$,
$\partial F/\partial y = x/2 + 2\lambda y = 0 \;\rightarrow\; x^2 = y^2$,
$\varphi(x, y) = x^2 + y^2 - c^2 = 0 \;\rightarrow\; x^2 = c^2/2$


one obtains $x = y = c/\sqrt{2}$. Hence the isosceles right-angled triangle is a possible solution, and it
can be proved that for this the area is an absolute maximum.
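
The solution of this example can be confirmed numerically. In the following Python sketch the value of the hypotenuse and all names are illustrative assumptions. It first checks that the three equations of the system are satisfied, and then compares the result with a direct search along the side condition, parametrized as x = c cos t, y = c sin t.

```python
import math

def lagrange_residuals(x, y, lam, c):
    """Left-hand sides of F_x = 0, F_y = 0 and phi = 0
    for F = xy/2 + lam*(x^2 + y^2 - c^2)."""
    return (y / 2 + 2 * lam * x,
            x / 2 + 2 * lam * y,
            x * x + y * y - c * c)

c = 10.0                        # illustrative hypotenuse (assumed)
x = y = c / math.sqrt(2)
lam = -y / (4 * x)              # multiplier from the first equation
residuals = lagrange_residuals(x, y, lam, c)

# Direct search along the side condition: x = c cos t, y = c sin t, 0 < t < pi/2
areas = [(0.5 * c * math.cos(t) * c * math.sin(t), t)
         for t in (k * (math.pi / 2) / 1000 for k in range(1, 1000))]
best_area, best_t = max(areas)
print(residuals, best_area, c * c / 4, best_t)
```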


Similarly the method of undetermined multipliers can be extended to the determination of extrema 
of a function of n variables with side conditions. 


19.5. Applications to plane curves 


By means of the differential calculus the important points on the curve given by a function can 
be determined; in particular, it can be decided when singular points occur. The properties of the 
evolute and the involute arise out of the clarification of the concept of curvature. 


Discussion of the curve defined by an explicit function 


It is required to investigate the function y = f(x) with the domain of definition D(f). If D(f) is 
not given explicitly, then it is taken to consist of all real numbers x for which the analytical expression 
f(x) is defined. If the domain of definition extends on one or on both sides to infinity, then the
behaviour of the function for large values of $|x|$ can be investigated by means of limiting processes, that
is, for $x \to +\infty$ or for $x \to -\infty$. One therefore speaks also of the behaviour of the function at
infinity. For rational functions this was done in Chapter 5. The coordinates of the points of
intersection with the coordinate axes are obtained from the equation y = f(x) by putting x = 0 or y = 0,
respectively. For the determination from the equation f(x) = 0 of the zeros of the function, that is, 
the values of the abscissae of the intersections of its curve with the x-axis, approximation methods 
are applied if necessary. For rational functions Sturm’s theorem (see Chapter 5.2. — Zeros) can be 
used to find intervals in which a zero lies. 

The behaviour of the function in the neighbourhood of its points of discontinuity is examined 
with the help of limiting arguments and the type of each discontinuity, for example, pole, indeter- 
minate point, jump discontinuity or oscillation, is determined. 

Finally, the function is tested by the methods of the differential calculus for extrema and points 
of inflection. If the zeros of the first and second derivatives of the function are determined for this 
purpose, then for twice continuously differentiable functions one also knows the intervals in which 
the first derivative has a constant sign and the intervals in which the second derivative does not 
change sign. Hence one has found the intervals in which the function is monotonic decreasing, 
monotonic increasing, convex or concave. 


Examples of curves for discussion. In what follows the results are given of the discussion of the
curves of typical examples. In particular: $y_{0i}$ are the ordinates of the points of intersection with the
$y$-axis; $x_{0i}$ the zeros; $x_{mi}$ the abscissae of extrema; $x_{wi}$ the abscissae of points of inflection; $M_i$ a
maximum; $m_i$ a minimum; $W_i$ a point of inflection.


Example 1: The function $y = f(x) = (x/300)(x^2 - 45)(x^2 - 10)$ is defined for all $x$ (Fig.).

Its derivatives are
$y' = f'(x) = (1/60)(x^2 - 30)(x^2 - 3)$; $y'' = f''(x) = (x/30)(2x^2 - 33)$;
$y''' = f'''(x) = x^2/5 - 11/10$.

Behaviour at infinity: $f(x) \to \pm\infty$, as $x \to \pm\infty$.

Intersections with the axes:
$y_0 = 0$; $x_{01} = 0$; $x_{02} = -3\sqrt{5} \approx -6.71$; $x_{03} = +3\sqrt{5} \approx +6.71$;
$x_{04} = +\sqrt{10} \approx 3.16$; $x_{05} = -\sqrt{10} \approx -3.16$.

Local extrema:
$f'(x_m) = 0 \rightarrow x_{m1} = -\sqrt{30}$; $x_{m2} = -\sqrt{3}$; $x_{m3} = +\sqrt{3}$; $x_{m4} = +\sqrt{30}$;
$M_1 = (-5.48, 5.48)$; $M_2 = (1.73, 1.7)$; $m_1 = (-1.73, -1.7)$;
$m_2 = (5.48, -5.48)$.

Points of inflection:
$f''(x_w) = 0 \rightarrow x_{w1} = -4.06$; $x_{w2} = 0$; $x_{w3} = +4.06$;
$W_1 = (-4.06, 2.51)$; $W_2 = (0, 0)$; $W_3 = (4.06, -2.51)$.
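
The tabulated values of such a discussion are easily recomputed. The following Python sketch (helper names are illustrative assumptions) evaluates the function of this example at the zeros of its first and second derivatives and uses the sign of the second derivative to decide between maximum and minimum.

```python
import math

def f(x):
    return x / 300 * (x * x - 45) * (x * x - 10)

def f1(x):
    """First derivative."""
    return (x * x - 30) * (x * x - 3) / 60

def f2(x):
    """Second derivative."""
    return x / 30 * (2 * x * x - 33)

def kind(x):
    """Classify a zero of f' by the sign of f''."""
    return "maximum" if f2(x) < 0 else "minimum"

# Zeros of f': x^2 = 30 and x^2 = 3; zeros of f'': x = 0 and x^2 = 33/2
extrema = [(round(x, 2), round(f(x), 2), kind(x))
           for x in (-math.sqrt(30), -math.sqrt(3), math.sqrt(3), math.sqrt(30))]
inflections = [(round(x, 2), round(f(x), 2))
               for x in (-math.sqrt(16.5), 0.0, math.sqrt(16.5))]
print(extrema)
print(inflections)
```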


19.5-1 Graph of the function $y = (x/300)(x^2 - 45)(x^2 - 10)$

19.5-2 Graph of the function $y = x^3/(x^2 - 1)$


Example 2: The function $y = f(x) = x^3/(x^2 - 1)$ (Fig.) is defined for all $x$ with the exception
of the values $x = \pm 1$, for which the denominator is zero. Its derivatives are

$y' = f'(x) = x^2(x^2 - 3)/(x^2 - 1)^2$; $y'' = f''(x) = 2x(x^2 + 3)/(x^2 - 1)^3$; $y''' = f'''(x) = -6(x^4 + 6x^2 + 1)/(x^2 - 1)^4$.

Behaviour at infinity: $x^3/(x^2 - 1) = x + x/(x^2 - 1) \to \pm\infty$, as $x \to \pm\infty$.

Asymptote $y = x$ (see Chapter 5.).

Intersections with the axes: $y_0 = 0$; $x_0 = 0$.

Discontinuities: Poles for $x_1 = 1$ and $x_2 = -1$ with vertical asymptotes.

Local extrema: $f'(x_m) = 0 \rightarrow x_{m1} = -\sqrt{3}$; $x_{m2} = +\sqrt{3}$; $x_{m3} = 0 = x_w$; $M = (-1.73, -2.6)$;
$m = (1.73, 2.6)$.

Points of inflection: $f''(x_w) = 0 \rightarrow x_w = 0$, $W = (0, 0)$. Since $f'(x_w) = 0$, this is a horizontal
point of inflection.

Example 3: The function $y = f(x) = x\sqrt{9 - x^2}$ is defined only in the interval $-3 \le x \le +3$
(Fig.). Its derivatives are

$y' = f'(x) = (9 - 2x^2)/\sqrt{9 - x^2}$; $y'' = f''(x) = x(2x^2 - 27)/(9 - x^2)^{3/2}$; $y''' = f'''(x) = -243/(9 - x^2)^{5/2}$.

Intersections with the axes: $y_0 = 0$; $x_{01} = 0$; $x_{02} = 3$; $x_{03} = -3$.

Local extrema: $f'(x_m) = 0 \rightarrow x_{m1} = -(3/2)\sqrt{2}$; $x_{m2} = +(3/2)\sqrt{2}$; $m = (-2.12, -4.5)$;
$M = (2.12, 4.5)$.

Points of inflection: $f''(x_w) = 0 \rightarrow x_{w1} = 0$; $x_{w2} = +(3/2)\sqrt{6}$; $x_{w3} = -(3/2)\sqrt{6}$; $x_{w2}$ and
$x_{w3}$ are outside the domain of definition. $W_1 = (0, 0)$.

The mirror image of this curve in the $x$-axis is the curve defined by the function $y = -x\sqrt{9 - x^2}$.
Both functions are defined by the algebraic equation $x^4 - 9x^2 + y^2 = 0$. $P(0, 0)$, as a double
point, is a singular point.

Example 4: The function $y = f(x) = \sin 2x + 2\cos x$ is defined for all $x$; because of its
periodicity the investigation will be restricted to the interval $0 \le x < 2\pi$ (Fig.). Its derivatives are

$y' = f'(x) = 2\cos 2x - 2\sin x$;
$y'' = f''(x) = -4\sin 2x - 2\cos x$;
$y''' = f'''(x) = -8\cos 2x + 2\sin x$.


19.5-3 Discussion of the function defined by the equation $x^4 - 9x^2 + y^2 = 0$
19.5-4 Graph of the function $y = \sin 2x + 2\cos x$


Intersections with the axes: $y_0 = 2$; $x_{01} = \pi/2 \approx 1.57$; $x_{02} = 3\pi/2 \approx 4.71$.

Local extrema: $f'(x_m) = 0 \rightarrow x_{m1} = \pi/6$; $x_{m2} = 5\pi/6$; $x_{m3} = 3\pi/2 = x_{w3}$. $M = (0.52, 2.6)$;
$m = (2.62, -2.6)$.

Points of inflection: $f''(x_w) = 0 \rightarrow x_{w1} = \pi/2$; $x_{w2} \approx 3.39$; $x_{w3} = 3\pi/2$; $x_{w4} \approx 6.03$;
$W_1 = (1.57, 0)$; $W_2 = (3.39, -1.45)$; $W_3 = (4.71, 0)$, a horizontal point of inflection since
$f'(x_{w3}) = 0$; $W_4 = (6.03, 1.45)$.


Singular points 
A survey of singular points can be made by investigating the possible tangent directions at these
points. If the equation of the curve is given in implicit form by $f(x, y) = 0$, and if $\varphi$ is the angle
between a tangent and the $+x$-axis, then

$\tan\varphi = y' = -f_x/f_y$.

A tangent parallel to the $x$-axis is then characterized by $f_x = 0$, and one perpendicular to the $x$-axis
($\varphi = \pi/2$) by $f_y = 0$. If both partial derivatives have the value zero, however, then a singular point
is to be expected.


Singular points of an algebraic curve. In a neighbourhood $x_s - h_1 < x < x_s + h_1$,
$y_s - h_2 < y < y_s + h_2$ of a singular point $(x_s, y_s)$, which is supposed to lie on the curve, so that
$f(x_s, y_s) = 0$, $f_x(x_s, y_s) = 0$ and $f_y(x_s, y_s) = 0$, the Taylor expansion with the remainder term $R_3$
has the form:

$f(x_s + h_1, y_s + h_2) = (1/2)(h_1^2 f_{xx} + 2h_1h_2 f_{xy} + h_2^2 f_{yy}) + R_3$.

If $h_1 = \Delta x$ and $h_2 = \Delta y$ tend to zero, then $(R_3/h_1^2)$ is likewise a null sequence, because $R_3$ contains
only the third and higher powers of $h_1$ and $h_2$.


The value of $y' = \tan\varphi$ can then be determined from the quadratic equation $f_{xx} + 2y'f_{xy} + y'^2 f_{yy} = 0$
if the three partial derivatives do not all vanish. If $f_{yy} \neq 0$, then the number of solutions of this
quadratic equation depends on the value of $\Delta = f_{xy}^2 - f_{xx}f_{yy}$. For $\Delta > 0$ two distinct tangents
exist; the curve has a double point in which two branches of the curve intersect. For $\Delta = 0$ two
coincident tangents exist; two branches of the curve have a common tangent and touch one another
either at a tacnode or at a cusp. At an ordinary cusp the two branches of the curve lie on opposite
sides of the tangent, and at a ramphoid cusp they lie on the same side. For $\Delta < 0$ there
are no real tangents and the curve has an isolated point there (Fig.). If $f_{yy} = 0$, then one
tangent has the slope $y' = -(1/2)f_{xx}/f_{xy}$, whilst a second tangent is parallel to the
$y$-axis, as similar considerations show for $dx/dy$.
Occasionally several singularities occur in combination. A triple point is a point at
which there are three tangents.


This point too is a double point.


19.5-5 Singular points of algebraic curves; 
a) double point, b) tacnode, c) triple point, 
d) ordinary cusp, e) ramphoid cusp, 

f) isolated point 


Example of a double point, folium of Descartes: From the equation $f(x, y) = x^3 + y^3 - 3axy = 0$
the partial derivatives $f_x = 3x^2 - 3ay$, $f_y = 3y^2 - 3ax$ and $f_{xx} = 6x$, $f_{xy} = -3a$, $f_{yy} = 6y$ are
obtained. The equations $f_x = 0$ and $f_y = 0$ have two solutions $x_1 = 0$, $y_1 = 0$ and $x_2 = a$,
$y_2 = a$. Only $(x_1, y_1)$ satisfies the equation $f = 0$.

For $x_1 = 0$, $y_1 = 0$: $f_{xx} = 0$, $f_{xy} = -3a$, $f_{yy} = 0$, and thus $\Delta = 9a^2 > 0$. The singular point
$(x_1, y_1)$ is a double point. Because $f_{yy} = 0$ the tangent directions are given by $y' = \infty$ and
$y' = -(1/2)f_{xx}/f_{xy} = 0$. The tangents coincide with the coordinate axes. The derivative has the
value $dy/dx = -f_x/f_y = -(x^2 - ay)/(y^2 - ax)$.

For $f_x = 0$ but $f_y \neq 0$ there is a tangent $t_x$ parallel to the $x$-axis; for $f_y = 0$ but $f_x \neq 0$ there
is a tangent $t_y$ parallel to the $y$-axis. The calculation gives for $f_y = 0$, $f_x \neq 0$: $y^2 = ax$, hence
$y^6/a^3 + y^3 - 3y^3 = 0$ or $y^6 = 2y^3a^3$; with $y \neq 0$, $y_3 = a\sqrt[3]{2} \approx 1.26a$ and $x_3 = a\sqrt[3]{4} \approx 1.59a$.
Similarly for $f_x = 0$, $f_y \neq 0$: $x_4 = a\sqrt[3]{2} \approx 1.26a$ and $y_4 = a\sqrt[3]{4} \approx 1.59a$.

Substituting $y = mx$ in the equation $f(x, y) = 0$ one obtains $x^3(1 + m^3) - 3amx^2 = 0$ or
$1 + m^3 - 3am/x = 0$. As $x \to \pm\infty$ this gives $1 + m^3 = 0$ or $m = -1$. The folium of Descartes
(Fig.) therefore has an asymptote with slope $m = -1$. If its equation is $y = -x + c$,
the value of $c$ is obtained by substituting for $y$ in the equation $f(x, y) = 0$:
$x^3 + (c - x)^3 - 3ax(c - x) = 0$ or $3x^2(a + c) - 3x(c^2 + ac) + c^3 = 0$. Hence, letting
$x \to \pm\infty$, $a + c = 0$ or $c = -a$. The equation of the asymptote is $y = -x - a$.

19.5-6 Folium of Descartes


Curvature, evolute and involute 


Curvature. In the discussion of the point of inflection the concept of curvature was used to
characterize the different course of the curve of the function $y = f(x)$ before and after a point of inflection.
It was established there that the same property of the curve is also characterized by the variation
of the direction angle $\tau$ of two successive tangents to the curve. Because it is clear that the larger
the increment of arc length $\Delta s$ required for a given variation $\Delta\tau$ of the direction angle, the smaller
the curvature $\kappa$, the latter is defined to be the rate of change of this angle as a function of the arc
length $s$:

$\kappa = d\tau/ds$

(Fig.). But the angle $\tau$ is defined by $\tau = \arctan y'$; by the chain rule one obtains

$\kappa = d\tau/ds = (d\tau/dx)/(ds/dx) = [y''/(1 + y'^2)] \cdot [1/\sqrt{1 + y'^2}] = y''/(1 + y'^2)^{3/2}$.


19.5-7 Curvature of a plane curve with small left-curvature
19.5-8 Curvature of a plane curve with large right-curvature


If $x$ and $y$ are regarded as functions of a parameter and differentiation with respect to this parameter
is denoted by a dot, then $\tau = \arctan(\dot y/\dot x)$, $\kappa = (\dot x\ddot y - \dot y\ddot x)/(\dot x^2 + \dot y^2)^{3/2}$. The curvature can also be
expressed in polar coordinates $x = r\cos\vartheta$, $y = r\sin\vartheta$. One obtains

$\dot x = \dot r\cos\vartheta - r\sin\vartheta\,\dot\vartheta$; $\dot y = \dot r\sin\vartheta + r\cos\vartheta\,\dot\vartheta$

and

$\ddot x = \ddot r\cos\vartheta - 2\dot r\sin\vartheta\,\dot\vartheta - r\cos\vartheta\,\dot\vartheta^2 - r\sin\vartheta\,\ddot\vartheta$,
$\ddot y = \ddot r\sin\vartheta + 2\dot r\cos\vartheta\,\dot\vartheta - r\sin\vartheta\,\dot\vartheta^2 + r\cos\vartheta\,\ddot\vartheta$

and hence

$\kappa = (2\dot r^2\dot\vartheta + r\dot r\ddot\vartheta - r\ddot r\dot\vartheta + r^2\dot\vartheta^3)/(\dot r^2 + r^2\dot\vartheta^2)^{3/2} = (2r'^2 - rr'' + r^2)/(r'^2 + r^2)^{3/2}$,

where $r' = dr/d\vartheta$ and $r'' = d^2r/d\vartheta^2$.

From the first of these formulae it follows that the curvature has the same sign as $y''$. Thus, if
the derivative increases as the curve is described in the sense of increasing abscissa, then the second
derivative $y''$ is positive and so $\kappa$ is also positive; regarded from a point $P$ with a very large ordinate,
the curve appears concave. A road that is curved in this way has a left-curve. If the curve appears
convex from $P$, or if a similarly curved road has a right-curve, then the derivative constantly decreases,
and $y''$ and $\kappa$ are negative.


Circle of curvature. The circle is a curve with constant curvature; from the parametric
representation $x = \varrho\cos t$, $y = \varrho\sin t$ one obtains $\kappa = 1/\varrho$. This result corresponds to the intuitive property
that the smaller the radius of curvature $\varrho$, the greater the curvature of a circle (Fig.). To every point
$(x, y)$ of the curve of the function $y = f(x)$ a circle of curvature $y = g(x)$ is assigned in accordance
with the following rules: both curves pass through the same point, $f(x) = g(x)$, have the same tangent
there, $f'(x) = g'(x)$, and the same curvature, so that $f''(x) = g''(x)$. The tangent $t$ at the point $(x, y)$ is
orientated. If it makes an angle $\alpha$ with the $+x$-axis and an angle $\beta = \pi/2 - \alpha$ with the $+y$-axis, then
its direction cosines are $\cos\alpha = \dot x/\sqrt{\dot x^2 + \dot y^2}$ and $\cos\beta = \sin\alpha = \dot y/\sqrt{\dot x^2 + \dot y^2}$. The positive
direction of the normal $n$ is the one resulting from a rotation of $+\pi/2$ from the tangent. Because
$\bar\alpha = \alpha + \pi/2$, its direction cosines are $\cos\bar\alpha = -\dot y/\sqrt{\dot x^2 + \dot y^2}$ and $\cos\bar\beta = \dot x/\sqrt{\dot x^2 + \dot y^2}$. This
orientation is so chosen that for positive curvature the positive normal points inwards, towards the
centre of curvature, and for negative $\kappa$ it points outwards, away from the centre of curvature. The
radius of curvature $\varrho = 1/\kappa$ is then marked off on the normal according to its sign. The coordinates
$\xi$, $\eta$ of the centre of curvature are

$\xi = x + \varrho\cos\bar\alpha = x - \varrho\dot y/\sqrt{\dot x^2 + \dot y^2}$,
$\eta = y + \varrho\cos\bar\beta = y + \varrho\dot x/\sqrt{\dot x^2 + \dot y^2}$

(Fig.). The equations for $\xi$ and $\eta$ are obtained by substituting for $\varrho = 1/\kappa$, that is,
$\xi = x - \dot y(\dot x^2 + \dot y^2)/(\dot x\ddot y - \dot y\ddot x)$, $\eta = y + \dot x(\dot x^2 + \dot y^2)/(\dot x\ddot y - \dot y\ddot x)$.


19.5-9 Circle of curvature $K$, radius of curvature $\varrho$ and centre of curvature $M$; $\varrho_1 > 0$, $\varrho_2 < 0$


19.5-10 Direction cosines of the tangent $t$ and the normal $n$

Catenary. A completely flexible heavy thread, suspended from two points, assumes in equilibrium
the form of the catenary. It is the evolute of the tractrix and is the graph of the function
$y = (a/2)(e^{x/a} + e^{-x/a}) = a\cosh(x/a)$ with the derivatives $y' = \sinh(x/a)$ and $y'' = (1/a)\cosh(x/a)$.

Because $1 + \sinh^2(x/a) = \cosh^2(x/a)$, substitution gives the radius of curvature $\varrho = a\cosh^2(x/a)
= y^2/a$ and the coordinates of the centre of curvature $\xi = x - a\sinh(x/a)\cosh(x/a) = x - yy'$ and
$\eta = 2a\cosh(x/a) = 2y$ (Fig.).

Construction of the tangent $t$ and the centre of curvature. The circle on the ordinate $|PQ|$ as diameter
cuts the circle with centre $Q$ and radius $a$ in the point $R$ which lies on the tangent at $P$, because
the angle $\tau$ between the line through $P$ and $R$ and the $x$-axis is such that $\tan\tau = |RP|/|RQ|
= \sqrt{y^2 - a^2}/a = \sinh(x/a) = y'$. The perpendicular to $t$ at $P$ is the normal $n$, which cuts the
$x$-axis in $S$. From $\cos\tau = a/y = y/|PS|$ it follows that $|PS| = y^2/a = |\varrho|$. According to the sign
of the curvature, $|\varrho|$ is marked out on the positive or the negative side of the normal from $P$. Because
$\frac{d}{dx}(a^2\sinh(x/a)) = a\cosh(x/a)$, $A = a^2\sinh(x/a)$ gives the area between the $x$-axis and the
catenary. The length of arc $l$ measured from the lowest point $(0, a)$ is $l = a\sinh(x/a)$.

19.5-11 Centre of curvature of the catenary
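
The radius and the centre of curvature of the catenary can be recomputed from the general formulae for a curve y = f(x), taking x itself as the parameter. In the following Python sketch the numerical values of a and x and the function names are illustrative assumptions. Computed in this way, the abscissa of the centre comes out as x - a sinh(x/a) cosh(x/a), and its ordinate as 2y.

```python
import math

def curvature(y1, y2):
    """kappa = y'' / (1 + y'^2)^(3/2) for a curve y = f(x)."""
    return y2 / (1 + y1 * y1) ** 1.5

def centre_of_curvature(x, y, y1, y2):
    """Centre (xi, eta): rho = 1/kappa marked off along the positive normal,
    whose direction cosines are (-y', 1)/sqrt(1 + y'^2)."""
    rho = 1 / curvature(y1, y2)
    norm = math.sqrt(1 + y1 * y1)
    return x - rho * y1 / norm, y + rho / norm

a, x = 2.0, 1.3                      # illustrative values (assumed)
y = a * math.cosh(x / a)
y1 = math.sinh(x / a)
y2 = math.cosh(x / a) / a
rho = 1 / curvature(y1, y2)
xi, eta = centre_of_curvature(x, y, y1, y2)
print(rho, y * y / a)
print(xi, eta, 2 * y)
```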


Evolute. If the function $y = f(x)$ has continuous derivatives of the first and second order, then
the coordinates $\xi$, $\eta$ of the centre of curvature are continuous functions. The curve defined by them
is called the evolute.


The evolute of a plane curve is the locus of its centres of curvature. 




The evolute can also be constructed as the envelope of the normals; thus, the normals to the
original curve are tangents to its evolute (Fig.). Because the centres of curvature of the original
curve lie on the evolute, the formulae for the coordinates of the centre of curvature give at the same
time a parametric representation for the evolute. One need only regard $\xi$ and $\eta$ as running coordinates.


19.5-12 Evolute
19.5-13 Family of involutes of a plane curve

Involute. The involute is a developed curve. One imagines a curve with an inextensible thread
laid along it (Fig.). The thread is attached to a point $A$ on the curve. If one then considers a point
$B_1$ of the thread and unwinds the tautly held thread from the curve, then the point $B_1$ describes a
new curve, an involute of the original curve. Since each point $B$ describes such an involute, a whole
family of involutes belongs to one given curve. Because the thread is always held taut during the
unwinding, the unwound portion of it is always a tangent to the original curve. The point $B$ describes
about the instantaneous point of contact of the tangent an infinitesimal circular arc as element of arc of the
involute; but this means that the unwound portion of the thread is always normal to the involute.
Thus, the tangents of the original curve cut the involute at a right angle. From this the following
theorems arise:


The involutes of a plane curve are the orthogonal trajectories (curves cutting at right angles) 
of the tangents to the original curve. 

Every curve is the evolute of each of its involutes. 

Every curve is an involute of its evolute. 


Example: In machine design the involute of the circle finds application as the profile curve of
the teeth of involute gears (Fig.). From the diagram the coordinates of a point $P$ of the involute
can be read off as $\xi = x + s\sin t$ and $\eta = y - s\cos t$. From the parametric representation of
the circle $x = r\cos t$; $y = r\sin t$ together with the formula $s = rt$ for the length of the unwound
circular arc, the parametric representation $\xi = r(\cos t + t\sin t)$; $\eta = r(\sin t - t\cos t)$ for the
involute of the circle is obtained.
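
Two properties of the involute stated above can be verified on this parametric representation: the unwound portion of the thread has the length s = rt, and it is normal to the involute. The following Python sketch does this for one parameter value; the numerical values and the function names are illustrative assumptions.

```python
import math

def circle_point(r, t):
    """Point of contact on the circle."""
    return r * math.cos(t), r * math.sin(t)

def involute_point(r, t):
    """Involute of the circle: xi = r(cos t + t sin t), eta = r(sin t - t cos t)."""
    return (r * (math.cos(t) + t * math.sin(t)),
            r * (math.sin(t) - t * math.cos(t)))

def involute_tangent(r, t):
    """Derivative of (xi, eta) with respect to t."""
    return r * t * math.cos(t), r * t * math.sin(t)

r, t = 2.0, 1.1                       # illustrative values (assumed)
x, y = circle_point(r, t)
xi, eta = involute_point(r, t)
thread = (xi - x, eta - y)            # unwound portion of the thread
thread_length = math.hypot(*thread)
tx, ty = involute_tangent(r, t)
dot = thread[0] * tx + thread[1] * ty  # zero if the thread is normal to the involute
print(thread_length, r * t, dot)
```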




19.5-14 Involute of the circle 


Special curves 


In the discussion of double points and the centre of curvature in the previous section, the most
important properties of the folium of Descartes and of the catenary were stated incidentally.
Properties of certain other curves are given in the following.




Cassinian ovals. Cassinian ovals are defined as the loci of all points $P$ for which the product of
the distances $r_1 = |F_1P|$ and $r_2 = |F_2P|$ from two fixed points $F_1$ and $F_2$ has a constant value $a^2$.
If the two points $F_1$ and $F_2$ lie on the $x$-axis of a Cartesian coordinate system at distances $+e$ and
$-e$ from the origin, then $r_1^2 = (x - e)^2 + y^2$; $r_2^2 = (x + e)^2 + y^2$; $r_1^2 r_2^2 = a^4$ or
$(x^2 + y^2)^2 - 2e^2(x^2 - y^2) = a^4 - e^4$, $r^4 - 2e^2r^2\cos 2\vartheta = a^4 - e^4$,
$r^2 = e^2\cos 2\vartheta \pm \sqrt{e^4\cos^2 2\vartheta + a^4 - e^4}$.


19.5-15 Cassinian ovals 
for e = 6, a = 10, 7, 6 
and 4; e = a = 6 is the 
lemniscate 


Cassinian ovals of different form are obtained according to the ratio of the two constants $a$ and
$e$ (Fig.). In the following survey they are characterized by the intersections $S_1, S_2, S_3, S_4$ with the
$x$-axis, $N_1, N_2$ with the $y$-axis, by the extrema $E_1, E_2, E_3, E_4$ and by the points of inflection
$W_1, W_2, W_3, W_4$.


1. $a > e\sqrt{2}$, the curve resembles an ellipse. $S_1, S_2 = (\pm\sqrt{a^2 + e^2}, 0)$; $N_1, N_2
= (0, \pm\sqrt{a^2 - e^2})$. For $a = e\sqrt{2}$ also the curve has a form like that of the ellipse, $S_1, S_2
= (\pm e\sqrt{3}, 0)$; $N_1, N_2 = (0, \pm e)$, but at $N_1$ and $N_2$ the curvature is zero.
2. $e < a < e\sqrt{2}$, indented oval. $S_1, S_2 = (\pm\sqrt{a^2 + e^2}, 0)$; $N_1, N_2 = (0, \pm\sqrt{a^2 - e^2})$;
$E_1, E_2, E_3, E_4 = (\pm(1/2e)\sqrt{4e^4 - a^4}, \pm a^2/(2e))$; $W_1, W_2, W_3, W_4
= (\pm\sqrt{(v - u)/2}, \pm\sqrt{(u + v)/2})$, where $u = (a^4 - e^4)/(3e^2)$ and $v = \sqrt{(a^4 - e^4)/3}$.
3. $a < e$, two separated ovals. $S_1, S_2 = (\pm\sqrt{a^2 + e^2}, 0)$; $S_3, S_4 = (\pm\sqrt{e^2 - a^2}, 0)$;
$E_1, E_2, E_3, E_4 = (\pm(1/2e)\sqrt{4e^4 - a^4}, \pm a^2/(2e))$.


4. $a = e$, lemniscate. For the equation of the lemniscate one obtains $(x^2 + y^2)^2 - 2a^2(x^2 - y^2) = 0$
or $r^2 = 2a^2\cos 2\vartheta$, $r = a\sqrt{2\cos 2\vartheta}$. Thus, a parametric representation is given by
$x = a\cos\vartheta\,\sqrt{2\cos 2\vartheta}$, $y = a\sin\vartheta\,\sqrt{2\cos 2\vartheta}$.

From $dx/d\vartheta = -2a\sin 3\vartheta/\sqrt{2\cos 2\vartheta}$ and $dy/d\vartheta = 2a\cos 3\vartheta/\sqrt{2\cos 2\vartheta}$, it follows that
$dy/dx = -\cot 3\vartheta$. Thus, extreme values can occur for $3\vartheta = \pi/2, 3\pi/2, 5\pi/2$ or $\vartheta = \pi/6, \pi/2, 5\pi/6$.
The extreme values are $x_{1,2} = \pm(a/2)\sqrt{3}$, $y_{1,2} = \pm a/2$, $r_{1,2} = a$. The point $(0, 0)$ is a double
point. The values of the partial derivatives at $(0, 0)$ are $f_x = 4(x^3 + xy^2 - a^2x) = 0$;
$f_y = 4(x^2y + y^3 + a^2y) = 0$; $f_{xx} = 4(3x^2 + y^2 - a^2) = -4a^2$; $f_{xy} = 8xy = 0$; $f_{yy} = 4(x^2 + 3y^2 + a^2) = 4a^2$;
$f_{xx}f_{yy} - f_{xy}^2 = -16a^4$.

From $-a^2 + y'^2a^2 = 0$ it follows that $y' = \pm 1$, so that $y = \pm x$ are the tangents at the point
$(0, 0)$. The intersections with the $x$-axis are $S_1, S_2 = (\pm a\sqrt{2}, 0)$. The area of one loop is given by

$$A = \frac{1}{2}\int_{-\pi/4}^{+\pi/4} r^2(\vartheta)\,d\vartheta = a^2\int_{-\pi/4}^{+\pi/4}\cos 2\vartheta\,d\vartheta = \frac{a^2}{2}\Big[\sin 2\vartheta\Big]_{-\pi/4}^{+\pi/4} = a^2.$$
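The loop area of the lemniscate can be cross-checked numerically. This is a sketch under stated assumptions: the midpoint rule and the names `midpoint_rule` and `lemniscate_loop_area` are illustrative choices, not taken from the text.

```python
import math

def midpoint_rule(f, lo, hi, n=20_000):
    """Approximate the integral of f over [lo, hi] with n midpoint rectangles."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def lemniscate_loop_area(a, n=20_000):
    """Sector formula (1/2) * integral of r^2 with r^2 = 2 a^2 cos(2 theta)."""
    r_squared = lambda theta: 2.0 * a * a * math.cos(2.0 * theta)
    return 0.5 * midpoint_rule(r_squared, -math.pi / 4, math.pi / 4, n)

if __name__ == "__main__":
    a = 3.0
    print(lemniscate_loop_area(a), "compared with a^2 =", a * a)
```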




19.5-16 Derivation of the equation of a cycloid 


Cycloids. A cycloid arises mechanically as the locus of a point $P$ rigidly connected to
a circle of radius $r$ at a distance $a$ from its centre $M$, if this circle rolls on a straight line
without slipping. If one lets the circle roll on the $x$-axis of a Cartesian coordinate system, in
which the abscissae are measured from the position in which $P$ is at its lowest point, and
denotes the angle of rotation by $\varphi$, then the unrolled arc $|OB| = r\varphi$ is longer than the
$x$-coordinate of $P$ by an amount $a\sin\varphi$ and $r$ is longer than its ordinate by $a\cos\varphi$ (Fig.):
$x = r\varphi - a\sin\varphi$; $y = r - a\cos\varphi$. According to the ratio $a/r$ one distinguishes the contracted
($a < r$), the extended ($a > r$) and the common cycloid ($a = r$) (Fig.).

19.5-17 Contracted, extended and common cycloids

The contracted cycloid has minima for $\varphi = 0, 2\pi, 4\pi, \dots$ with $y_m = r - a$; the extended cycloid
has for the corresponding values of $x$ two points with the same abscissae. One is the minimum
with $y_m = r - a$. The $y$-value of the other is given by the trigonometric equation $r\varphi = a\sin\varphi$.
The common cycloid has cusps at these places. Its element of arc is $ds = 2r\sin(\varphi/2)\,d\varphi$, and thus the
length $s$ of the complete cycloidal arch is $s = \int_{\varphi=0}^{2\pi} ds = 8r$. The area under this complete arch is

$$A = \int y\,dx = r^2\int_0^{2\pi}(1 - 2\cos\varphi + \cos^2\varphi)\,d\varphi = 2\pi r^2 + 0 + \pi r^2 = 3\pi r^2,$$

and is therefore equal to three times the area of the rolling circle.
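Both results for the common cycloid ($a = r$) can be confirmed by numerical integration of the parametric representation. The sketch below is an illustration only; the midpoint rule and all function names are assumptions introduced here.

```python
import math

def midpoint_rule(f, lo, hi, n=20_000):
    """Approximate the integral of f over [lo, hi] with n midpoint rectangles."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def cycloid_arch_length(r, n=20_000):
    """Length of one arch of x = r(phi - sin phi), y = r(1 - cos phi)."""
    def speed(phi):
        dx = r * (1.0 - math.cos(phi))   # dx/dphi
        dy = r * math.sin(phi)           # dy/dphi
        return math.hypot(dx, dy)
    return midpoint_rule(speed, 0.0, 2.0 * math.pi, n)

def cycloid_arch_area(r, n=20_000):
    """Area under one arch, computed as the integral of y * dx/dphi."""
    def integrand(phi):
        y = r * (1.0 - math.cos(phi))
        dx = r * (1.0 - math.cos(phi))
        return y * dx
    return midpoint_rule(integrand, 0.0, 2.0 * math.pi, n)

if __name__ == "__main__":
    r = 2.0
    print(cycloid_arch_length(r), "compared with 8r =", 8 * r)
    print(cycloid_arch_area(r), "compared with 3*pi*r^2 =", 3 * math.pi * r * r)
```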


Epicycloids. An epicycloid arises mechanically as the locus of a point $P$, rigidly fixed to a circle $k$
of radius $r$, if this circle $k$ rolls on the outside of a fixed circle $K$ of radius $R$ (Fig.). According to
the distance $a$ of the point $P$ from the centre $M$ of the circle $k$, one distinguishes the contracted
($a < r$), the extended ($a > r$), and the common epicycloid ($a = r$). If the radius
$|OM| = R + r$ turns through an angle $\varphi$, then the circle $k$ turns through an angle $\psi$, where
$\varphi R = \psi r$. The perpendicular $MB$ from $M$ to the $x$-axis cuts off from $\psi$ the angle $(\pi/2 - \varphi)$,
and the remaining angle $\vartheta$ is $\vartheta = \psi + \varphi - \pi/2 = [(R + r)/r]\cdot\varphi - \pi/2$. The coordinates
of the point $P$ are given by

$x = (R + r)\cos\varphi - a\cos[\varphi(R + r)/r]$;
$y = (R + r)\sin\varphi - a\sin[\varphi(R + r)/r]$.


19.5-18 Derivation of the equation of an 
epicycloid 




Corresponding to the curves of the cycloids with respect to a straight line, those of the epicycloids
with respect to the circumference of the fixed circle $K$ have cusps, loops or minima without double
points. If the circumference $2\pi R$ of the circle $K$ is an integral multiple of the circumference $2\pi r$ of $k$,
then the curve has $R/r$ arches. If $R/r$ is a rational number $p/q$, then because $\varphi R = \psi r$, the positions
of $P$ repeat themselves after circling $q$ times around $K$. The length $l$ of one arch of the common
epicycloid (Fig.) is $l = 8r(R + r)/R$; the area $A$ between the circumference of the circle $K$ and one
arch is $A = \pi(r^2/R)(3R + 2r)$.


19.5-19 The common epicycloid; R/r = 3


19.5-20 Cardioid 


Cardioid. For $r = R$ one obtains the cardioid (Fig.) with the parametric representation
$x = R(2\cos\varphi - \cos 2\varphi)$; $y = R(2\sin\varphi - \sin 2\varphi)$. By eliminating $\varphi$ one obtains the algebraic equation
$(x^2 + y^2 - R^2)^2 = 4R^2[(x - R)^2 + y^2]$. The length $l$ of the curve is $l = 16R$, as follows from
$l = 8r(R + r)/R$ with $r = R$, and its area is $A = 6\pi R^2$, that is, six times the area of the fixed circle $K$.

In the $\xi, \eta$-coordinate system the equation of the cardioid becomes particularly simple: $\xi = x - R$,
$\eta = y$, so that $\xi = 2R\cos\varphi\,(1 - \cos\varphi)$, $\eta = 2R\sin\varphi\,(1 - \cos\varphi)$. Taking polar coordinates $r, \vartheta$,
where $r = 2R(1 - \cos\varphi)$ and $\cos\vartheta = \xi/r = \cos\varphi$ and $\sin\vartheta = \eta/r = \sin\varphi$, the polar equation
of the cardioid is $r = 2R(1 - \cos\vartheta)$.


Hypocycloids. In contrast to the epicycloid, the hypocycloid arises mechanically when a circle $k$
rolls without slipping on the inside of a fixed circle $K$. One can think of the moving circle $k$ as being
rotated about the tangent. The segments $r$ and $a$, together with the angle of rotation $\psi$ change their
signs. The parametric representation then takes the form:

$x = (R - r)\cos\varphi + a\cos[\varphi(R - r)/r]$; $y = (R - r)\sin\varphi - a\sin[\varphi(R - r)/r]$.


In a contracted hypocycloid a <r, in an extended one (Fig.) a > r, and in the common hypocycloid 
a =r. The corresponding curves have rounded 
cusps (minima related to the fixed circle), loops 
or cusps. 

The form of the hypocycloid depends on the ratio $R/r$. If it is an integer, then the curve closes
up after a single rotation of the moving circle about the fixed one. If it is not integral, but rational,
$R/r = m/n$ ($m$ and $n$ having no common factor), then the curve closes up only after $n$ circuits. For
irrational values of the ratio $R/r$ the curve is not closed. If $R/r$ is an integer, then the common
hypocycloid has the length $l = 8(R - r)$, and the area $A$ between one complete arch and the fixed
circle $K$ is $A = \pi(r^2/R)(3R - 2r)$.

19.5-21 Extended hypocycloid

19.5-22 Astroid

The astroid or star curve is a common hypocycloid with $4r = R$. Its parametric representation
is therefore $x = 4r\cos^3\varphi = R\cos^3\varphi$, $y = 4r\sin^3\varphi = R\sin^3\varphi$, because $\sin 3\varphi = 3\sin\varphi - 4\sin^3\varphi$
and $\cos 3\varphi = 4\cos^3\varphi - 3\cos\varphi$. In Cartesian coordinates one obtains the equation $x^{2/3} + y^{2/3} = R^{2/3}$
(Fig.).
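The reduction of the common hypocycloid with $4r = R$ to the astroid can be verified numerically. The following sketch evaluates the general parametric representation of the hypocycloid and compares it with $R\cos^3\varphi$, $R\sin^3\varphi$; the function names are illustrative assumptions.

```python
import math

def hypocycloid_point(R, r, a, phi):
    """Point of the hypocycloid for fixed radius R, rolling radius r, distance a."""
    k = (R - r) / r
    x = (R - r) * math.cos(phi) + a * math.cos(k * phi)
    y = (R - r) * math.sin(phi) - a * math.sin(k * phi)
    return x, y

def astroid_residual(R, phi):
    """Value of x^(2/3) + y^(2/3) - R^(2/3) on the common hypocycloid with 4r = R."""
    x, y = hypocycloid_point(R, R / 4.0, R / 4.0, phi)
    # abs() is used because the real cube root of a negative number is meant here
    return abs(x) ** (2.0 / 3.0) + abs(y) ** (2.0 / 3.0) - R ** (2.0 / 3.0)

if __name__ == "__main__":
    R = 4.0
    for phi in (0.3, 1.1, 2.5, 4.0):
        x, y = hypocycloid_point(R, R / 4.0, R / 4.0, phi)
        print(phi, x - R * math.cos(phi) ** 3, y - R * math.sin(phi) ** 3,
              astroid_residual(R, phi))
```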

Tractrix. A heavy point $P$ at the end of an inextensible thread of length $a$ describes a tractrix
if the end-point $K$ of the thread moves along the $x$-axis (Fig.). Thus, the thread is stretched in the
direction of a tangent to the curve, so that $dy/dx = -y/\sqrt{a^2 - y^2}$. Integration gives the equation
$x = a\ln\left|[a + \sqrt{a^2 - y^2}]/y\right| - \sqrt{a^2 - y^2} = a\cosh^{-1}(a/y) - \sqrt{a^2 - y^2}$. The point $A$ is a cusp.

The length of arc $l$ measured from $A$ is $l = a\ln(a/y)$.

19.5-23 Tractrix, a = 5


19.5-24 Cissoid 


Cissoid. Let a circle touch two parallel lines at a distance $a$ apart. On a secant through one
point of contact $O$ the circle and the other parallel line cut off the segment $|QR|$.
The point $P$ on the secant is then constructed so that $|OP| = |QR|$, and the cissoid is the
geometric locus of all points $P$ constructed on each secant through $O$ in this way (Fig.).
The secant makes the angle $\varphi$ with the $x$-axis in the Cartesian coordinate system introduced
into the figure; $m = \tan\varphi$ serves as a parameter. The point $R$ has the coordinates
$x_1 = a$ and $y_1 = am$; from $(x - r)^2 + m^2x^2 = r^2$, where $r = a/2$ is the radius of the circle,
the coordinates of $Q$ are obtained as $x_2 = a/(1 + m^2)$, $y_2 = am/(1 + m^2)$. Thus, those of $P$
are given by $x = x_1 - x_2 = am^2/(1 + m^2)$, $y = y_1 - y_2 = am^3/(1 + m^2)$.
With the help of $m^2 = x/(a - x)$ and $1 + m^2 = a/(a - x)$ the parameter can be eliminated,
giving $y^2 = x^3/(a - x)$. In polar coordinates this gives $r^2 = x^2 + y^2 = a^2m^4/(1 + m^2)$ or
$r = a\sin^2\varphi/\cos\varphi$. The point $O$ is a cusp and the parallel $x = a$ is the asymptote of the
cissoid. The area $A$ between the curve and the asymptote is $A = 3\pi a^2/4$, and hence is three
times the area of the given circle with the radius $a/2$.

Strophoid. In a Cartesian coordinate system the point $A(-a, 0)$ is the vertex of a pencil of lines.
If a line of the pencil cuts the $y$-axis at the point $B(0, b)$, then the points $P$ and $P'$ on it for which
$|BP| = |BP'| = |OB|$ are points of the strophoid (Fig.). From the equation of the line $y = (b/a)x + b$ and
the distance condition $(y - b)^2 + x^2 = b^2$, $b$ is eliminated; from $b = ya/(a + x)$ one obtains
$y^2/x^2 = (a + x)/(a - x)$. It follows that $|AP'| = (a - x)\sqrt{1 + b^2/a^2}$, $|AP| = (a + x)\sqrt{1 + b^2/a^2}$,
and because $OB \perp AO$, the secant-tangent theorem gives $|AP|\cdot|AP'| = a^2$. The points of the strophoid
go into one another under the transformation by reciprocal radii. If $\varphi$ is the angle between the
$+x$-axis and the line $OP$, $m = \tan\varphi = y/x$ serves as a parameter. The equation of the strophoid
gives $m^2 = (a + x)/(a - x)$ or $x = a(m^2 - 1)/(m^2 + 1)$, $y = am(m^2 - 1)/(m^2 + 1)$ and the
polar equation follows: $r^2 = x^2 + y^2 = a^2(m^2 - 1)^2/(m^2 + 1)$, or $r = -a\cos 2\varphi/\cos\varphi$. The
point $(0, 0)$ is a double point with the tangents $y = \pm x$. The line $x = a$ is an asymptote. Half the
area of the loop is $A' = a^2 - \pi a^2/4$, and half the area between the curve and the asymptote is
$A = a^2 + \pi a^2/4$.

19.5-25 Strophoid: portions of the same colour are of equal area


Conchoids of the line. A line passes through the origin $O$ of a Cartesian coordinate system and
cuts the line $x = a$ parallel to the $y$-axis in the point $Q$. The two points $P$ and $P'$ on $OQ$, on opposite
sides of $Q$ and at a constant distance $c$ from it, are points of a conchoid, which consists
of all pairs of points $P$ and $P'$ determined in this way.

19.5-26 Conchoids: a) a = 2, c = 2; b) a = 2, c = 4; c) a = 2, c = 1.5

The form of the conchoid depends on the ratio of the segments $a$ and $c$ (Fig.). For
$c = a$, $O$ is a cusp; for $c > a$ it is a double point. The parallel $x = a$ is always an
asymptote of both branches. If $\varphi$ is the angle between the line $OP$ and the $x$-axis,
then its intersection $Q$ with the parallel is at a distance $r_Q = a/\cos\varphi$ from $O$. Hence
$r = a/\cos\varphi \pm c$ is the polar equation of the curve. From this it follows that
$x = r\cos\varphi = a \pm c\cos\varphi$ and $y = r\sin\varphi = a\tan\varphi \pm c\sin\varphi$. Eliminating the
trigonometric functions, one obtains an algebraic equation of degree 4: from
$(x - a)^2 = c^2\cos^2\varphi$ and $x^2 + y^2 = a^2/\cos^2\varphi \pm 2ac/\cos\varphi + c^2$ it follows that
$(x - a)^2(x^2 + y^2) = c^2(a \pm c\cos\varphi)^2 = c^2x^2$.


19.5-27 Archimedean spiral, a = 2 




Spirals. The term spirals denotes curves whose radius vector $r$ is a single-valued function of the
vectorial angle $\varphi$, $r = f(\varphi)$, where $\varphi$ goes from 0 or from $-\infty$ to $+\infty$ and $r(\varphi)$ can be different
from $r(\varphi + 2\pi)$. Examples are $r = a\varphi$ or $r = a\,e^{k\varphi}$.

Archimedean spiral. In polar coordinates this spiral (Fig.) has the equation $r = a\varphi$. Points $P_1, P_2, \dots$
on the same radius vector are at a constant distance $2\pi a$ apart, because $r_2 = a(\varphi + 2\pi) = a\varphi + 2\pi a
= r_1 + 2\pi a$. The element of arc $ds$ has the value $ds = a\sqrt{1 + \varphi^2}\,d\varphi$; consequently the arc length
is given by

$$s = a\int_0^{\varphi_1}\sqrt{1 + \varphi^2}\,d\varphi = (a/2)\left(\varphi_1\sqrt{1 + \varphi_1^2} + \sinh^{-1}\varphi_1\right).$$

For large values of $\varphi_1$ the approximation $s \approx (a/2)\varphi_1^2$ holds. The area of a sector between two radius
vectors $r_1 = a\varphi_1$ and $r_2 = a\varphi_2$ is $A = (a^2/6)(\varphi_2^3 - \varphi_1^3)$.
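The closed form for the arc length of the Archimedean spiral and its large-angle approximation can be compared with a direct numerical integration. This is a sketch only; the midpoint rule and the function names are assumptions introduced for illustration.

```python
import math

def spiral_length_closed(a, phi1):
    """Closed form (a/2) * (phi1 * sqrt(1 + phi1^2) + arsinh(phi1))."""
    return 0.5 * a * (phi1 * math.sqrt(1.0 + phi1 ** 2) + math.asinh(phi1))

def spiral_length_numeric(a, phi1, n=20_000):
    """Midpoint-rule value of a * integral of sqrt(1 + phi^2) from 0 to phi1."""
    h = phi1 / n
    return a * h * sum(math.sqrt(1.0 + ((i + 0.5) * h) ** 2) for i in range(n))

if __name__ == "__main__":
    a = 2.0
    phi1 = 2.0 * math.pi          # one full turn
    print(spiral_length_closed(a, phi1), spiral_length_numeric(a, phi1))
    big = 1000.0                  # large angle: compare with (a/2) * phi1^2
    print(spiral_length_closed(a, big) / (0.5 * a * big ** 2))
```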

Logarithmic spiral. In polar coordinates this spiral has the equation $r = a\,e^{k\varphi}$,
$k > 0$. For negative values of $\varphi$ the curve winds with decreasing radius vector about
the pole $O$, ever closer to it: this is an asymptotic point. It was shown in the treatment
of differentiation of functions in polar coordinates that every line through
the pole $O$ cuts the logarithmic spiral at the same angle $\tau_0 = \operatorname{arccot} k$ (Fig.), and
that the tangents at these points of intersection are parallel to one another. In
addition it was shown that

$dr/d\varphi = r' = r/\tan\tau = rk$ or $d\varphi = dr/(rk)$.

Using a relation derived in Chapter 20., the arc length $s$ can be calculated:

$ds = \sqrt{r^2 + (dr/d\varphi)^2}\,d\varphi$
$= \sqrt{r^2 + r^2k^2}\,d\varphi$
$= r\sqrt{1 + k^2}\,d\varphi$
$= (1/k)\sqrt{1 + k^2}\,dr$,
or $s = (1/k)\sqrt{1 + k^2}\,(r_2 - r_1)$.
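The formula $s = (1/k)\sqrt{1 + k^2}\,(r_2 - r_1)$ can be tested independently of the derivation by measuring a fine polygon inscribed in the spiral. The sketch below assumes Python 3.8 or later (for `math.dist`); the function names and sample values are illustrative.

```python
import math

def log_spiral_point(a, k, phi):
    """Cartesian point of r = a * exp(k * phi)."""
    r = a * math.exp(k * phi)
    return (r * math.cos(phi), r * math.sin(phi))

def polygon_length(a, k, phi1, phi2, n=50_000):
    """Length of an inscribed polygon with n chords between phi1 and phi2."""
    total = 0.0
    prev = log_spiral_point(a, k, phi1)
    for i in range(1, n + 1):
        cur = log_spiral_point(a, k, phi1 + (phi2 - phi1) * i / n)
        total += math.dist(prev, cur)
        prev = cur
    return total

def closed_form_length(a, k, phi1, phi2):
    """(1/k) * sqrt(1 + k^2) * (r2 - r1)."""
    r1 = a * math.exp(k * phi1)
    r2 = a * math.exp(k * phi2)
    return math.sqrt(1.0 + k * k) / k * (r2 - r1)

if __name__ == "__main__":
    print(polygon_length(1.0, 0.2, 0.0, 2.0 * math.pi),
          closed_form_length(1.0, 0.2, 0.0, 2.0 * math.pi))
```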


19.5-28 Logarithmic spiral 


20. Integral calculus 


20.1. The definite integral
      The definite integral as a limit
      Properties of the definite integral
      Quadrature

20.2. The indefinite integral
      Standard integrals
      Integration by parts
      Integration by substitution
      Classes of elementarily integrable functions
      Integrals that cannot be expressed in terms of elementary functions

20.3. Integration of functions of several variables
      Two-dimensional integrals
      Cubature
      Arc length and surface area
      Line and surface integrals
      Applications in mechanics

20.4. Vector analysis
      Fields
      Gradient and potential
      Divergence and theorem of Gauss
      Curl and Stokes' theorem
      The operator nabla, rules of calculation


It is the task of the differential calculus to study the properties of the derivative of a function, 
which gives a limit representation, for example, for the slope of the tangent at a point on the graph 
of a function. This value of the derivative at a point depends only on the values of the function 
in an arbitrarily small neighbourhood, that is, on local properties of the function. The problem of 




defining rigorously and of calculating the area of a region bounded by a closed curve, or the volume 
of a region bounded by a closed surface, led to the discussion of a limiting process of an entirely 
different kind, the study of which is the object of the so-called integral calculus (integer, lat. whole). 
The limiting process arises when the given plane or solid region is approximated more and more 
closely by regions whose area or volume can be calculated by elementary methods. While the 
tangent problem refers to a local property, the area or volume problem requires a knowledge of 
the bounding curve or surface as a whole. 




20.1. The definite integral 


ARCHIMEDES (287-212 B.C.) had already succeeded in proving that the
area of a segment ASB of a parabola is (4/3) times the area of the
triangle ABC (Fig.). Since Greek mathematicians calculated areas by
using geometrical methods, which aimed at converting the given region
into a square of equal area, one spoke of the problem of quadrature,
and in the 17th century of the method of exhaustion, to indicate that
the region whose area or volume is required can be exhausted by a
sequence of regions of known area or volume. By means of suitable
decompositions of regions into smaller pieces KEPLER (1571-1630)
obtained formulae for the volume of barrels, and CAVALIERI
(1598-1647) developed a comparison principle to decide when two
bodies lying between parallel planes have the same volume (see Chapter 8.).
Further investigations were made by DESCARTES (1596-1656),
FERMAT (1601-1665), and PASCAL (1623-1662) in France, GULDIN
(1577-1643) in Switzerland, and WALLIS (1616-1703) in England. Building
on the foundations of this preliminary work, LEIBNIZ (1646-1716) and NEWTON (1643-1727),
independently and almost at the same time, created a satisfactory calculus for the computation
of areas and volumes. Furthermore, they discovered that despite the different limiting processes
involved there is a close connection between the tangent problem and the quadrature problem. If
the definite integral is regarded as a function of the upper limit of integration, then its derivative is
equal to the integrand. The standard notation for integrals, which is due to LEIBNIZ, corresponds
to this important fact, which is known as the fundamental theorem of the calculus.

20.1-1 Quadrature of a segment of a parabola

Side by side with the problem of calculating areas then stands the second main problem, namely 
that of finding a function whose derivative in a certain interval is equal to another given function. 
The integral calculus develops a general method of treating these apparently very different problems 
and of investigating properties of integrals. Together the differential and integral calculus form the 
foundation for the entire branch of higher analysis and are indispensable for modern science and 
technology. 

The quadrature problem suggests an intuitive method of defining the definite integral as the area
$F_a^b$ of the region bounded by the curve of a function $y = f(x)$ between the ordinates $f(a)$ and $f(b)$,
the piece of the abscissa axis from $x = a$ to $x = b$, and the two straight lines $x = a$ and $x = b$
(Fig.). Here it is assumed that the curve is the graph of a continuous function and that all its ordinates
are positive (that is, that the curve lies above the $x$-axis).


20.1-2 Area below the curve y = f(x) 
between x = a and x = b 


20.1-3 Upper sum and lower sum


The area $F_a^b$ can be approximated from below and above by sequences of suitable figures, for
example, by step polygons, which are included in, or include, the given region (Fig.). These areas can
be calculated elementarily as sums of areas of rectangles. It is intuitively clear that the approximation
to $F_a^b$ will improve as the size of the steps gets smaller, so that the sequences formed by the areas of
the inscribed and circumscribed step polygons should approach a common limit. The area $F_a^b$ is
defined to be this limit, and the limit (assuming that it exists) is denoted by $\int_a^b f(x)\,dx$ (read: integral
from $a$ to $b$ of $f(x)$) and called the definite integral.


The definite integral as a limit 


Limit of upper and lower sums. One divides the interval from $a$ to $b$ into $n$ subintervals $\Delta x_i$,
$i = 1, \dots, n$, for example, into $n$ equal intervals $\Delta x$, so that $n\,\Delta x = b - a$; for simplicity both the
interval and its length are denoted by $\Delta x$. The ordinates at any two adjacent points of division
bound a strip of width $\Delta x$. If one chooses as heights $m_i$, the smallest, and $M_i$, the largest ordinate
in the $i$th interval and compares the area of the strip with the areas $m_i\,\Delta x$, $M_i\,\Delta x$ of the corresponding
rectangles, then the sum $\sum m_i\,\Delta x$ is smaller, and the sum $\sum M_i\,\Delta x$ is larger, than the area required.
These are called lower sums and upper sums, respectively (Fig.). If one refines the subdivision by splitting up
the subintervals, then the new smallest ordinates will be at least as large as the original $m_i$, and the
new largest ordinates not larger than the original $M_i$. Thus, the lower sum increases and remains
less than $F_a^b$, the upper sum decreases and remains greater than $F_a^b$. The lower sums form a
monotonically increasing bounded sequence, the upper sums a monotonically decreasing bounded sequence,
and therefore both have a limit. If these limits are identical, that is, if the sequence of their
differences $\sum (M_i - m_i)\,\Delta x$ tends to zero as $\Delta x \to 0$ and $n \to \infty$, then (and only then) does the integral
$\int_a^b f(x)\,dx$ exist. This always happens for continuous functions.


20.1-4 The integral $I = \int_0^{2\pi}\sin\varphi\,d\varphi = 0$ as an area


The original assumption that the curve lies above the x-axis, or that the function has only positive 
values, is not necessary. For parts of the curve below the x-axis the oriented area takes negative 
values because in the corresponding parts of the sums the ordinates are negative, while the intervals
$\Delta x_i$ are still positive. The diagram shows by the example of the integral of the sine function from 0
to $2\pi$ that the area can have the value zero. Running along the $x$-axis with increasing $x$, positive
areas lie to the left and negative areas to the right — in accordance with the definition of oriented 
area in plane geometry. 


Analytical definition of the definite integral. The geometrical significance of the definite integral
as an area can be dispensed with. Let $f(x)$ be a bounded function in the interval $a \le x \le b$; again
one divides this interval by division points $a = x_0 < x_1 < x_2 < \dots < x_{n-1} < x_n = b$ into $n$
subintervals of lengths $\Delta x_i = x_i - x_{i-1}$ ($i = 1, 2, \dots, n$). Since a bounded function need not have a
greatest or least value in an interval, one forms upper sums by using $G_i$, the least upper bound
(supremum) of the function in the subinterval, and lower sums by using $g_i$, the greatest lower bound
(infimum). If the difference $\sum (G_i - g_i)\,\Delta x_i$ tends to zero as the length of the largest interval $\Delta x_i$
tends to zero when $n \to \infty$, then the upper and lower sums both tend to the same limit. Also every
sequence $F_n = \sum f(\xi_i)\,\Delta x_i$, where $\xi_i$ is an arbitrary point in the subinterval $\Delta x_i$, then tends to this
limit.


If the limit $\lim_{n\to\infty;\ \Delta x_i\to 0} F_n = I$ of the sums $F_n = \sum_{i=1}^{n} f(\xi_i)\,\Delta x_i$ exists and is independent of the
choice of division points $x_i$ and intermediate points $\xi_i$, then the function $f(x)$ is said to be integrable
over the interval $[a, b]$ and the limit $I$ is called the definite integral $\int_a^b f(x)\,dx$ of $f(x)$ from $x = a$
to $x = b$.

This concept of integral goes back to Bernhard RIEMANN (1826-1866); the Lebesgue integral is a
modern extension of this concept. The function $f(x)$ is called the integrand, and $x$ is called the
variable of integration; $a$ and $b$ are the lower and upper limits of integration. The value $I$ of the integral
of a given integrand $f$ does not depend on the name given to the variable of integration.


If the integrand is continuous in the closed 
interval [a, b], then the integral certainly exists. 


Every continuous function is integrable. 




All these arguments remain correct if the function has
finitely many discontinuities in the interval $[a, b]$, but is
bounded (Fig.).


Every function that is bounded and has only finitely 
many discontinuities in [a, b] is integrable. 


More complicated functions, such as the function which 
for 0 < x < 1 has the value 1 for rational x and 0 for 
irrational x, are treated in measure theory (see Chapter 35.). 


20.1-5 Area under the parabola $y = x^2$

Example: To calculate the area under the parabola $y = x^2$ between $x = 0$ and $x = a$, one
divides the interval $[0, a]$ into $n$ equal parts of length $h = a/n$ (Fig.). The $n$ left end-points of
subintervals are $x_0 = 0$, $x_1 = h$, ..., $x_{n-1} = (n - 1)h$, the $n$ right end-points are $x_1 = h$, $x_2 = 2h$, ...,
$x_n = nh = a$. Since the function $y = x^2$ is monotonically increasing in $[0, a]$, the lower sum $\underline{F}_n$
and upper sum $\overline{F}_n$ are:

$$\underline{F}_n = 0 + 1^2h^2\cdot h + 2^2h^2\cdot h + \dots + (n-1)^2h^2\cdot h = h^3[1^2 + 2^2 + \dots + (n-1)^2] = h^3\,\frac{(n-1)\,n\,(2n-1)}{6} = \frac{a^3}{6}\left(1 - \frac{1}{n}\right)\left(2 - \frac{1}{n}\right),$$

$$\overline{F}_n = 1^2h^2\cdot h + 2^2h^2\cdot h + \dots + n^2h^2\cdot h = h^3(1^2 + 2^2 + \dots + n^2) = h^3\,\frac{n\,(n+1)\,(2n+1)}{6} = \frac{a^3}{6}\left(1 + \frac{1}{n}\right)\left(2 + \frac{1}{n}\right).$$

Both sequences have the same limit $F = \frac{1}{3}a^3$ as $n \to \infty$, so that the area under the parabola between
$x = 0$ and $x = a$ is $\frac{1}{3}a^3$. The adjacent table, for $a = 6$ and $F = 72$, shows how the upper and
lower sums vary with $n$.
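The behaviour of the two sums for $a = 6$ can be tabulated with a few lines of code. This is a sketch only: the values of $n$ chosen below are illustrative assumptions and are not necessarily those of the printed table, and the function names are introduced here.

```python
def lower_upper_sums(a, n):
    """Lower and upper sums of y = x^2 on [0, a] for n equal subintervals."""
    h = a / n
    lower = sum((i * h) ** 2 * h for i in range(n))          # left end-points
    upper = sum((i * h) ** 2 * h for i in range(1, n + 1))   # right end-points
    return lower, upper

def closed_forms(a, n):
    """The closed expressions (a^3/6)(1 -+ 1/n)(2 -+ 1/n) from the example."""
    lower = a ** 3 / 6.0 * (1.0 - 1.0 / n) * (2.0 - 1.0 / n)
    upper = a ** 3 / 6.0 * (1.0 + 1.0 / n) * (2.0 + 1.0 / n)
    return lower, upper

if __name__ == "__main__":
    print(f"{'n':>6} {'lower sum':>12} {'upper sum':>12}")
    for n in (1, 2, 4, 10, 100, 1000):
        lo, up = lower_upper_sums(6.0, n)
        print(f"{n:>6} {lo:>12.4f} {up:>12.4f}")
```

For every $n$ the lower sum stays below 72 and the upper sum above it, and the gap between them equals $a^3/n$.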


Properties of the definite integral 


If one interchanges the limits of integration, one alters the direction of integration; the factors
$\Delta x_i$ change sign as oriented segments, and so do all the sums $F_n = \sum f(\xi_i)\,\Delta x_i$ and the integral,
because the ordinates $f(\xi_i)$ do not change their sign. To preserve the formula below, for $a = b$ one
defines $\int_a^a f(x)\,dx = 0$.


Interchange of limits of integration changes the sign of the integral: $\int_a^b f(x)\,dx = -\int_b^a f(x)\,dx$.


Sums of integrals. If $f(x)$ is integrable over the interval from $x = a$ to $x = b$, and if $c$ lies in this
interval, then $c$ can be made a division point of each subdivision, and so any sum $\sum f(\xi_i)\,\Delta x_i$ can
be split into two sums, one for the interval from $a$ to $c$ and the other from $c$ to $b$. Since the limit
of a sum is equal to the sum of the limits, the definite integral can be written as the sum of two
others. By the same theorem about limits:


The integral of the sum of two integrable functions f(x) and g(x) is equal to the sum of the integrals 
of the separate functions. 


The splitting of the interval of integration can be used to separate out positive and negative 
contributions to signed areas by using the zeros of the integrand as splitting points. For instance, 


$$I = \int_0^{2\pi}\sin\varphi\,d\varphi = \int_0^{\pi}\sin\varphi\,d\varphi + \int_{\pi}^{2\pi}\sin\varphi\,d\varphi = 2 - 2 = 0.$$




For integrands with finite jump discontinuities it is also necessary
to split the range of integration at the point of discontinuity (Fig.).
But one must beware of integrating across points
where the integrand becomes infinite; for instance, the
integral $\int_{-1}^{1} dx/x^2$ does not make sense because the interval of
integration contains $x = 0$, where the integrand is infinite. Such
cases will be treated in greater detail in the section on improper
integrals.

20.1-6 Integral of a discontinuous function: $\int_a^c f(x)\,dx + \int_c^b f(x)\,dx = \int_a^b f(x)\,dx$

Integration of a function with a constant factor. If the integrand
is a product $cf(x)$, where $c$ is constant, then $\sum cf(\xi_i)\,\Delta x_i
= c\sum f(\xi_i)\,\Delta x_i$, and so $c$ appears as a factor outside the integral.

Example: $\int_0^3 4x^2\,dx = 4\int_0^3 x^2\,dx = 4\cdot 27/3 = 36$.


Mean-value theorem of the integral calculus. Every continuous function $f(x)$ attains, in a closed
interval $[a, b]$, its maximum $M$ and minimum $m$. It follows from the definition of the definite integral
that

$$m(b - a) \le \int_a^b f(x)\,dx \le M(b - a);$$

the area under the curve lies between the areas of rectangles
with base $b - a$ and heights $m$ or $M$. Therefore there must be
a $\mu$ between $m$ and $M$ such that the rectangle of base $b - a$
and height $\mu$, hence of area $\mu(b - a)$, has equal area with the
region under the curve (Fig.). By the continuity of the function
$f(x)$ one can always find a place $\xi$ in $[a, b]$ with $\mu = f(\xi)$;
hence it follows that $\int_a^b f(x)\,dx = (b - a)\,f(\xi)$. As in the mean
value theorem of the differential calculus, $\xi$ can also be
written as $a + \vartheta(b - a)$ for a suitable $\vartheta$ between 0 and 1.

20.1-7 Mean value theorem of the integral calculus


If the function $f(x)$ is continuous in the closed interval $[a, b]$, then the definite integral of this
function from $x = a$ to $x = b$ can be expressed as the product of the length of the interval and the
value of the function at some intermediate point of the interval.


The mean value theorem is used, for example, in estimating definite integrals of functions that
cannot be integrated in elementary terms or whose integral is difficult to find. If $f(x)$, $g(x)$, $h(x)$ are
integrable functions on the interval $a \le x \le b$ which satisfy the inequalities $f(x) \le g(x) \le h(x)$,
then

$$\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx \le \int_a^b h(x)\,dx.$$


Example: In the interval $0 \le x \le 1/2$ the function $e^{-x^2}$, which cannot be integrated in elementary
terms, satisfies the inequalities $1 - x^2 \le e^{-x^2} \le 1/(1 + x^2)$. It follows that

$$0.458 = \big[x - x^3/3\big]_0^{1/2} = \int_0^{1/2}(1 - x^2)\,dx < \int_0^{1/2} e^{-x^2}\,dx < \int_0^{1/2}\frac{dx}{1 + x^2} = \big[\arctan x\big]_0^{1/2} = 0.464.$$
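The two bounds can be compared with the value of the integral itself, which is expressible through the error function as $\int_0^{x} e^{-t^2}\,dt = (\sqrt{\pi}/2)\,\mathrm{erf}(x)$. The sketch below uses `math.erf` from the Python standard library; the function names are illustrative assumptions.

```python
import math

def lower_bound():
    """Integral of 1 - x^2 over [0, 1/2]."""
    return 0.5 - 0.5 ** 3 / 3.0

def upper_bound():
    """Integral of 1 / (1 + x^2) over [0, 1/2]."""
    return math.atan(0.5)

def gauss_integral():
    """Integral of exp(-x^2) over [0, 1/2], via the error function."""
    return 0.5 * math.sqrt(math.pi) * math.erf(0.5)

if __name__ == "__main__":
    print(f"{lower_bound():.3f} < {gauss_integral():.4f} < {upper_bound():.3f}")
```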
For completeness' sake, the following generalized mean value theorem is quoted without proof:
If $f(x)$ and $g(x)$ are continuous in the closed interval $[a, b]$ and if $g(x)$ does not change sign there,
then for a suitable $\xi$ in $[a, b]$,

$$\int_a^b f(x)\,g(x)\,dx = f(\xi)\int_a^b g(x)\,dx.$$


Integration as inverse to differentiation. If one of the limits of integration of a definite integral,
say the upper limit, is regarded as a variable $x$, then to each value of $x$ there corresponds the value
$\Phi(x)$ of the integral. The integral is a function of this limit, $\Phi(x) = \int_a^x f(\xi)\,d\xi$. Now on the one hand,
$\Phi(x + \Delta x) - \Phi(x) = \int_x^{x+\Delta x} f(\xi)\,d\xi$; on the other hand, by the mean value theorem of the integral
calculus, $\int_x^{x+\Delta x} f(\xi)\,d\xi = \Delta x\cdot f(x + \vartheta\,\Delta x)$; hence

$$\frac{\Phi(x + \Delta x) - \Phi(x)}{\Delta x} = \frac{1}{\Delta x}\int_x^{x+\Delta x} f(\xi)\,d\xi = f(x + \vartheta\,\Delta x).$$

This relation represents a fundamental connection between the differential and integral calculus.


If the function $f(x)$ is continuous, then the function $\Phi(x) = \int_a^x f(\xi)\,d\xi$ is differentiable, and its
derivative is equal to the value of the integrand at the upper limit of integration:

$$\Phi'(x) = \frac{d}{dx}\int_a^x f(\xi)\,d\xi = f(x).$$
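The difference quotient of $\Phi$ can be watched approaching the integrand numerically. In this sketch the quotient is computed as $(1/\Delta x)\int_x^{x+\Delta x} f(\xi)\,d\xi$, with the integral approximated by the midpoint rule; the integrand $\cos$ and the point $x = 1$ are arbitrary illustrative choices, and the function names are assumptions.

```python
import math

def midpoint_rule(f, lo, hi, n=200):
    """Approximate the integral of f over [lo, hi] with n midpoint rectangles."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def difference_quotient(f, x, dx):
    """(Phi(x + dx) - Phi(x)) / dx, written as (1/dx) * integral of f over [x, x + dx]."""
    return midpoint_rule(f, x, x + dx) / dx

if __name__ == "__main__":
    for dx in (0.1, 0.01, 0.001):
        print(dx, difference_quotient(math.cos, 1.0, dx), math.cos(1.0))
```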


Every function $\Phi(x)$ whose derivative is equal to the integrand $f(x)$ is called a primitive of $f(x)$.
For two such primitives $\Phi(x)$ and $\Psi(x)$ of the same integrand $f(x)$ the derivative of $\Psi(x) - \Phi(x)$
is identically zero so that, by the mean value theorem of the differential calculus, $\Psi(x) - \Phi(x)$ is
constant.


To each continuous function $f(x)$ there belongs a family of primitives, any two of which differ by
a constant. If $\Phi(x)$ and $\Psi(x)$ are primitives of the same function $f(x)$, then $\Psi(x) = \Phi(x) + \text{const}$.


The graphs of the functions $\Phi(x)$ therefore
form a family of parallel curves (Fig.). The class
of all primitives is called the indefinite integral of
$f(x)$ and is denoted by $\int f(x)\,dx$.


Examples: 1. The indefinite integral of the function
$f(x) = 3x^2$ is $\Phi(x) = x^3 + \text{const}$, because
$x^3$ has the derivative $3x^2$.

2. The derivative of $\sin x$ is $\cos x$, so that the
integral of $\cos x$ is $\sin x$. Written as an indefinite
integral this is $\int \cos x\,dx = \sin x + C$.

3. The function $f(x) = 2x^2$ is the derivative of
its indefinite integral $\Phi(x)$. If it is also required
that $\Phi(1) = 1$, then the constant $C$ in the
indefinite integral $\int 2x^2\,dx = (2/3)x^3 + C$ can be
determined from $1 = \Phi(1) = 2/3 + C$. Thus,
$C = 1/3$ and $\Phi(x) = (2/3)x^3 + 1/3$.

4. For a function $\Phi(x)$, its derivative $\Phi'(x) = 3x^4$
and the value $\Phi(5) = 0$ are known. From
the indefinite integral $\int 3x^4\,dx = (3/5)x^5 + C$
and the condition $\Phi(5) = 0$ one derives $(3/5)\cdot 5^5 + C = 0$,
so that $C = -1875$ and therefore
$\Phi(x) = (3/5)x^5 - 1875$.


The definite integral as ordinate difference. Knowledge of the indefinite integral makes it
easier to calculate the definite integral. Let $\Phi(x)$ be any primitive of $f(x)$, and note that
$\Psi(x) = \int_a^x f(\xi)\,d\xi$ is also a primitive, so that $\Psi(x) = \Phi(x) + C$. But $\Psi(a) = \int_a^a f(\xi)\,d\xi = 0$, so that
$C = -\Phi(a)$, and therefore

$$\int_a^b f(x)\,dx = \Psi(b) = \Phi(b) - \Phi(a) = \big[\Phi(x)\big]_a^b.$$

Examples: 1. $\int_1^2 x^2\,dx = \big[x^3/3\big]_1^2 = 8/3 - 1/3 = 7/3$.

2. $\int_{-\pi/2}^{\pi/2}\cos x\,dx = \big[\sin x\big]_{-\pi/2}^{\pi/2} = \sin(\pi/2) - \sin(-\pi/2) = 1 - (-1) = 2$.


20.1-9 The definite integral as ordinate difference 


Improper integral. The integral defined as the limit of sums $\sum f(\xi_i)\,\Delta x_i$ is called a proper integral,
in contrast to an improper integral, in which either the integrand becomes infinite at some point $p$
of the interval of integration or at least one of the limits of integration is infinite. The improper
integral is defined as a limit of proper integrals. The interval of integration is first restricted; in the
first case one integrates only to $p + \varepsilon$ or $p - \varepsilon$, in the second case only to $\omega$, and then considers
the behaviour of the corresponding integrals $I(\varepsilon)$ or $I(\omega)$, as $\varepsilon \to 0$ or $\omega \to \pm\infty$. If $I(\varepsilon)$ or $I(\omega)$ tends
to a finite limit, one speaks of a convergent improper integral, and if not, the improper integral is
said to diverge.


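Integrand with an infinity. In the following integral the integrand has a pole at $x = 0$ (see
Chapter 5.2., Power functions with negative exponents). Excluding this place, one obtains in the
interval from $\varepsilon > 0$ to 1 a proper integral (see Standard integrals) $I(\varepsilon)$ and investigates the limit
of $I(\varepsilon)$ as $\varepsilon \to 0$:

$$\int_0^1 \frac{dx}{x^{\alpha}} = \lim_{\varepsilon\to 0}\int_{\varepsilon}^1 \frac{dx}{x^{\alpha}} = \lim_{\varepsilon\to 0}\left[\frac{x^{1-\alpha}}{1-\alpha}\right]_{\varepsilon}^{1} = \lim_{\varepsilon\to 0}\frac{1 - \varepsilon^{1-\alpha}}{1-\alpha}.$$

20.1-10 Behaviour of the functions $f(x) = x^{-2/3}$ and $f(x) = x^{-2}$ near $x = 0$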


The existence of the limit depends on the positive exponent « + 1. If « > 1, then «!-* has a 
negative exponent and certainly diverges as «> 0. On the other hand, if « < 1, then e1-* +0 
as «0; the improper integral then converges to 1/(1 — «). The figure displays the two cases 
o = 2/3 << landa = 2 > 1. More generally, the following comparison principle (majorant criterion) 
holds: Suppose that the function f(x) is integrable in the interval a < x < p — « for every ¢ > 0, 
but is unbounded in the interval p — ¢ < x <p. If there exists a number « < 1 such that the func- 


tion (p — x)* f(x) is bounded for a < x < p, then the improper integral f | f(x)| dx converges and 
Pp a 
therefore so does f f(x) dx. If there exist an « < 1 and a bound K such that |f(x)| < 


a@ 
in the interval [a, p), then one has the estimate 
P p-é P-é 


[p — x|* 


J (ode = tim J eo] ax — tim, f Te ax Se tn 


a 


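Example: By Standard integrals one has

$$\int_0^1 \frac{dx}{\sqrt{1 - x^2}} = \lim_{\varepsilon \to 0} \int_0^{1-\varepsilon} \frac{dx}{\sqrt{1 - x^2}} = \lim_{\varepsilon \to 0} \arcsin(1 - \varepsilon) = \pi/2.$$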




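Infinite interval of integration. The contrast to an improper integral in which the integrand becomes infinite can again be clearly illustrated by the example $f(x) = 1/x^\alpha$.

$$\int_1^\infty \frac{dx}{x^\alpha} = \lim_{\omega \to \infty} \int_1^\omega \frac{dx}{x^\alpha} = \lim_{\omega \to \infty} \left[\frac{x^{1-\alpha}}{1-\alpha}\right]_1^\omega = \lim_{\omega \to \infty} \frac{\omega^{1-\alpha} - 1}{1 - \alpha}.$$

If $\alpha \neq 1$ is positive, then the behaviour of $I(\omega)$ depends on that of $\omega^{1-\alpha}$; if the exponent $1 - \alpha$ is positive, that is, $\alpha < 1$, then $I(\omega) \to \infty$ as $\omega \to \infty$ and the integral diverges; if the exponent is negative, that is, $\alpha > 1$, then $\omega^{1-\alpha} \to 0$ as $\omega \to \infty$ and the integral converges to the value $1/(\alpha - 1)$. Again there is a comparison principle (majorant criterion):

If $f(x)$ is integrable on any finite sub-interval of $x \ge a$, and if there exists an $\alpha > 1$ such that the function $x^\alpha f(x)$ is bounded for all $x \ge a$, then the improper integral $\int_a^\infty f(x)\,dx$ is convergent.

If $K$ is a bound for the function $x^\alpha |f(x)|$, and $\alpha > 1$, then for $a > 0$ one has the estimate

$$\int_a^\infty |f(x)|\,dx = \lim_{\omega \to \infty} \int_a^\omega |f(x)|\,dx \le \lim_{\omega \to \infty} \int_a^\omega \frac{K}{x^\alpha}\,dx = \frac{K\,a^{1-\alpha}}{\alpha - 1}.$$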


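Examples: 1. In the following improper integral the integrand vanishes to the second order as $x \to \infty$ (Fig.):

$$\int_0^\infty \frac{dx}{1 + x^2} = \lim_{\omega \to \infty} \int_0^\omega \frac{dx}{1 + x^2} = \lim_{\omega \to \infty} (\arctan \omega - \arctan 0) = \pi/2.$$

20.1-11 Area under the graph of the function $f(x) = 1/(1 + x^2)$

2. $\int_0^\infty e^{-x}\,dx = \lim_{\omega \to \infty} \int_0^\omega e^{-x}\,dx = \lim_{\omega \to \infty} (-e^{-\omega} + 1) = 1$.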
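The limiting behaviour of $I(\varepsilon)$ and $I(\omega)$ for the power function $1/x^\alpha$ can be tabulated directly from the closed forms. This is a minimal sketch; the function names `I_eps` and `I_omega` are chosen here for illustration only.

```python
def I_eps(alpha, eps):
    """Proper integral of x**(-alpha) over [eps, 1], for alpha != 1."""
    return (1 - eps ** (1 - alpha)) / (1 - alpha)

def I_omega(alpha, omega):
    """Proper integral of x**(-alpha) over [1, omega], for alpha != 1."""
    return (omega ** (1 - alpha) - 1) / (1 - alpha)

# alpha = 2/3 < 1 and alpha = 2 > 1 are the two cases shown in the figure.
for k in (2, 4, 6, 8):
    eps, omega = 10.0 ** (-k), 10.0 ** k
    print(k,
          I_eps(2 / 3, eps),      # tends to 1/(1 - 2/3) = 3
          I_eps(2.0, eps),        # grows without bound
          I_omega(2 / 3, omega),  # grows without bound
          I_omega(2.0, omega))    # tends to 1/(2 - 1) = 1
```

The same exponent thus behaves in opposite ways at the pole and at infinity: $\alpha < 1$ converges near $x = 0$ and diverges at infinity, while $\alpha > 1$ does the reverse.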


The gamma function. The problem of finding a function whose values for positive integral arguments are the factorials $1! = 1$, $2! = 1 \cdot 2 = 2$, $3! = 1 \cdot 2 \cdot 3 = 6$, ..., $n! = 1 \cdot 2 \cdots n$ was solved by EULER (1707-1783) by means of an improper integral. LEGENDRE (1752-1833) called this Euler's gamma function $\Gamma(x) = \int_0^\infty e^{-t} t^{x-1}\,dt$; GAUSS (1777-1855) gave a definition of $\Gamma(x)$ as an infinite product

$$\Gamma(x) = x^{-1} \prod_{n=1}^{\infty} \left[(1 + 1/n)^x\,(1 + x/n)^{-1}\right].$$

Its factors are obtained by substituting $n = 1, 2, 3, \ldots$ in the square brackets. The function is never zero and is continuous except at $x = 0, -1, -2, \ldots$ where it has simple poles (see Chapter 5.). From the functional equation $\Gamma(x + 1) = x\,\Gamma(x)$ and the value $\Gamma(1) = 1$ it follows that $\Gamma(n + 1) = n\,\Gamma(n) = n!$ for integral arguments $n = 1, 2, 3, \ldots$
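Both descriptions of the gamma function can be tried out with the Python standard library. In this sketch `gamma_product` is a hypothetical helper that truncates the infinite product after a finite number of factors; `math.gamma` and `math.factorial` are the library's own functions.

```python
import math

def gamma_product(x, terms=100_000):
    """Partial product x**-1 * prod_{n=1..terms} (1 + 1/n)**x / (1 + x/n)."""
    value = 1.0 / x
    for n in range(1, terms + 1):
        value *= (1.0 + 1.0 / n) ** x / (1.0 + x / n)
    return value

# Gamma(n + 1) = n! for positive integers n.
for n in range(1, 6):
    print(n, math.gamma(n + 1), math.factorial(n))

# The truncated product approaches the library value, here at x = 1/2,
# where Gamma(1/2) = sqrt(pi).
print(gamma_product(0.5), math.gamma(0.5), math.sqrt(math.pi))
```

The product converges slowly (the error shrinks roughly like the reciprocal of the number of factors), so it is of theoretical rather than computational interest.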


Quadrature 
By quadrature one understands the calculation of areas of plane regions with curved boundaries. 


Area under a curve. The area $F$ above the x-axis and below the curve with the equation $y = f(x)$, between $x = a$ and $x = b$, is found from the integral $F = \int_a^b f(x)\,dx$ if all the values $f(x)$ are positive in the interval $a \le x \le b$. If $f(x)$ changes sign in $[a, b]$, one may split this interval into pieces in which $f(x)$ takes positive and negative values, and the integral into a sum of positive and negative contributions corresponding to oriented areas. If one ignores orientation, the total area is the sum of the absolute values of these contributions.


Example 1: Quadrature of Neil's parabola $y = ax^{3/2}$ (Fig.). $F = a \int_0^g x^{3/2}\,dx = 2ag^{5/2}/5$. If $h = a g^{3/2}$, then $F = 2gh/5$, so that the area $F$ is by $gh/10$ less than the area of the right-angled triangle with base $g$ and height $h$.


20.1-12 Area under the positive branch 
of Neil’s parabola 


20.1-13 Area under the 
exponential curve y = e* 


Example 2: Quadrature of the exponential curve $y = e^x$ from $x = a$ to $x = b$. If $a$ is finite, $F = \int_a^b e^x\,dx = e^b - e^a$. As $a \to -\infty$, $F$ tends to the finite limit $F = e^b$; the improper integral $\int_{-\infty}^b e^x\,dx$ converges and (Fig.) the region extending to infinity has finite area.

Example 3: Quadrature of the rectangular hyperbola $y = k^2/x$ from $x = a$ to $x = b$; $k^2 \int_a^b dx/x = k^2(\ln b - \ln a)$; $F = k^2 \ln(b/a)$ (Fig.). Here one has a 'rational' curve for which the calculation of area involves a transcendental function.

Example 4: Quadrature of the sine curve $y = \sin x$ from $x = 0$ to $x = \pi$; $\int_0^\pi \sin x\,dx = \cos 0 - \cos \pi$; $F = 2$. Here one has a 'transcendental' curve with rational area.


20.1-14 Area under the hyperbola $y = k^2/x$

20.1-15 Area under the curve of the function $y = f(x) = x^3/3 - 5x^2/6 - 3x/2 + 3$


Example 5: To calculate the area of the region bounded by the curve $y = f(x) = x^3/3 - 5x^2/6 - 3x/2 + 3$, the x-axis, and the lines $x = -3$ and $x = 4$ (Fig.). In the interval $-3 \le x \le 4$ the function has zeros at $x_1 = -2$, $x_2 = 3/2$, $x_3 = 3$. If one ignores orientation, the area is the sum of the absolute values of the integrals over the subintervals between zeros. Since

$$\int (x^3/3 - 5x^2/6 - 3x/2 + 3)\,dx = F(x) = x^4/12 - 5x^3/18 - 3x^2/4 + 3x,$$

this leads to the following evaluation of the area:

$$F = \left|\int_{-3}^{-2} f(x)\,dx\right| + \left|\int_{-2}^{3/2} f(x)\,dx\right| + \left|\int_{3/2}^{3} f(x)\,dx\right| + \left|\int_{3}^{4} f(x)\,dx\right|$$
$$= \left|[F(x)]_{-3}^{-2}\right| + \left|[F(x)]_{-2}^{3/2}\right| + \left|[F(x)]_{3/2}^{3}\right| + \left|[F(x)]_{3}^{4}\right|$$
$$= |-5.444 + 1.5| + |2.297 + 5.444| + |1.5 - 2.297| + |3.556 - 1.5|$$
$$= |-3.944| + |7.741| + |-0.797| + |2.056| = 14.538.$$
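The bookkeeping in Example 5 is easy to get wrong by hand, so a short check may be useful. The sketch below only re-evaluates the primitive given in the example at the end points and zeros; the variable names are illustrative.

```python
def f(x):
    return x**3 / 3 - 5 * x**2 / 6 - 3 * x / 2 + 3

def F(x):
    """The primitive of f used in the example."""
    return x**4 / 12 - 5 * x**3 / 18 - 3 * x**2 / 4 + 3 * x

# Interval end points and the zeros of f between them, in increasing order.
points = [-3.0, -2.0, 1.5, 3.0, 4.0]

# Oriented contribution of each subinterval: F(b) - F(a).
pieces = [F(b) - F(a) for a, b in zip(points, points[1:])]

oriented_area = sum(pieces)                 # integral of f over [-3, 4]
total_area = sum(abs(p) for p in pieces)    # area with orientation ignored

print([round(p, 3) for p in pieces])
print(round(oriented_area, 3), round(total_area, 3))
```

The oriented sum equals $F(4) - F(-3)$, which is much smaller than the total area because the negative pieces cancel part of the positive ones.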


The integral calculus may be used to derive the area formulae known in plane geometry. The for- 
mulae for the trapezium, the circle, and the ellipse will be developed as examples below. 


Example 1: Area of the trapezium: To calculate the area below the line $y = mx + a$, and above the x-axis, between the limits $x = 0$ and $x = h$: $\int_0^h (mx + a)\,dx = mh^2/2 + ah = h(mh + 2a)/2$; setting $mh + a = b$ (Fig.), one obtains the familiar trapezium formula $A = (a + b)\,h/2$.


20.1-16 The area formula for a trapezium 20.1-17 The area formulae for circle and ellipse 


Example 2: Area of the circle. A quarter of the circular region $A$ (Fig.) lies below the arc $y = \sqrt{r^2 - x^2}$ between the limits $x = 0$ and $x = r$. Substituting $x = r \sin\varphi$, $dx = r \cos\varphi\,d\varphi$ one obtains $A/4 = \int_0^r \sqrt{r^2 - x^2}\,dx = \int_0^{\pi/2} r^2 \cos^2\varphi\,d\varphi = (r^2/2)\,[\sin\varphi \cos\varphi + \varphi]_0^{\pi/2} = (r^2/2) \cdot (\pi/2) = \pi r^2/4$, or $A = \pi r^2$.

Example 3: Area of the ellipse. A quarter of the elliptic region $F$ (Fig. 20.1-17) lies below the arc $y = (b/a)\sqrt{a^2 - x^2}$ between $x = 0$ and $x = a$. The parametric representation $x = a \sin\varphi$, $y = b \cos\varphi$, gives $dx = a \cos\varphi\,d\varphi$; hence the quarter area is $F/4 = \int_0^{\pi/2} ab \cos^2\varphi\,d\varphi = \pi ab/4$, that is, $F = \pi ab$.


Area between two curves. If a region is enclosed by two intersecting curves, its area can be calculated as the absolute difference between the areas under each of the curves. The limits of integration are the abscissae $x_1$, $x_2$ of two consecutive intersections of the curves (Fig.); orientation is to be ignored. The areas under the curves $y = g(x)$ and $y = h(x)$ are given by the integrals $\int_{x_1}^{x_2} g(x)\,dx$, $\int_{x_1}^{x_2} h(x)\,dx$; the area of the region enclosed by both curves is the difference of these integrals, that is, $F = \left|\int_{x_1}^{x_2} g(x)\,dx - \int_{x_1}^{x_2} h(x)\,dx\right|$. Since both integrals have the same limits of integration, they can be combined into a single integral. If parts of the curves between $x_1$ and $x_2$ lie below the x-axis, one may shift both curves in the direction of the y-axis until the region between them lies entirely above the x-axis. This changes both functions by the same additive constant, which cancels out after subtraction.




Example 1: The curves determined by $g(x) = 3\sqrt{x}$ and $h(x) = x^2 - 4x + 6$ (Fig.) intersect at the points (1, 3) and (4, 6). Now $g(x) - h(x) = 3\sqrt{x} - x^2 + 4x - 6$, so that $F = \int_1^4 (3\sqrt{x} - x^2 + 4x - 6)\,dx = [2x\sqrt{x} - x^3/3 + 2x^2 - 6x]_1^4 = 5$.

20.1-19 Area between the curves of the functions $y^2 = 9x$ and $y = x^2 - 4x + 6$

20.1-20 Cross cut through a steel girder


Example 2: Fig. 20.1-20 represents a cross-cut through a steel girder. The upper boundary is an arc of a circle, the lower that of a parabola. This makes it possible to calculate the mass of the girder when the thickness $d$ of the steel sheet and the density $\varrho$ of the material are known. If $A$ is the area of the cross-cut, then its mass is $M = A \cdot d \cdot \varrho$. The cross-cut area $A$ can be calculated by means of integrals. For a suitable choice of the coordinate system one has the general equations $x^2 + (y - b)^2 = r^2$ for the circle, and $y = ax^2 + c$ for the parabola. The given data lead to the specific equations $x^2 + (y + 8)^2 = 100$ for the circle and $y = -x^2/36 + 1$ for the parabola, which intersect at $x = \pm 6$.

Substituting $g(x) - h(x) = \sqrt{100 - x^2} - 8 + x^2/36 - 1$ one obtains for the required area

$$A = 2 \int_0^6 \left(\sqrt{100 - x^2} + x^2/36 - 9\right) dx$$
$$= 2\left[\tfrac12\left(100 \arcsin(x/10) + x\sqrt{100 - x^2}\right) + x^3/108 - 9x\right]_0^6$$
$$= 2\left[\tfrac12 (100 \cdot 0.6435 + 48) - 52\right] = 8.35.$$
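The girder area can be cross-checked by numerical quadrature. The sketch below uses a composite Simpson rule, which is a choice made here and not part of the text; the two boundary functions are taken from the equations of the example.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    s += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return s * h / 3

def circle(x):
    """Upper boundary, from x^2 + (y + 8)^2 = 100."""
    return math.sqrt(100 - x * x) - 8

def parabola(x):
    """Lower boundary y = -x^2/36 + 1."""
    return -x * x / 36 + 1

# The figure is symmetric about the y-axis, so integrate over [0, 6] and double.
area = 2 * simpson(lambda x: circle(x) - parabola(x), 0.0, 6.0)
closed_form = 2 * (0.5 * (100 * math.asin(0.6) + 48) - 52)
print(area, closed_form)
```

With an assumed sheet thickness `d` and density `rho` in consistent units, the mass would then follow as `area * d * rho`.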


Graphical integration. Just as one can determine graphically the derivative of a given curve, so one can construct, conversely, from the given graph of a derivative an appropriate integral curve. One begins by selecting by means of an initial condition one curve from the family of integral curves, say the curve passing through a point $P_0$ with the coordinates $x_0 = 1$, $y_0 = 0$. Every ordinate of the derivative $f(x)$ represents the slope of the integral curve to be constructed at the point in question. Therefore, if one drops a perpendicular from $f(x_0) = f(1)$ to the y-axis and joins the foot $B_0$ of the perpendicular to the point $A = (-1, 0)$, then the line through $A$ and $B_0$, on account of $\tan \alpha_0 = |B_0 O|/1 = f(1)$, gives the direction of the integral curve at the point $P_0$.

A parallel to this direction through $P_0$ is tangent to the integral curve and represents it approximately in a small neighbourhood of $P_0$. Since no further points of the integral curve are known, one chooses a point $x_1$, halves the interval from $x_0$ to $x_1$ by a parallel to the y-axis, and shifts the tangential direction for the point $P_1$ to be constructed so that it intersects the tangent at $P_0$ on the mid-line of the interval. Continuing in this manner one obtains a polygonal arc, which represents the integral curve approximately (Fig.). The drawing of an integral curve for a given derivative can also be done mechanically by means of special tools.

20.1-21 Graphical integration: $F(x) = \int f(x)\,dx$ with $F(1) = 0$
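The construction can be imitated numerically. In the sketch below, which reflects one reading of the construction, each step follows the tangent with slope $f(x_i)$ up to the mid-line and then the direction with slope $f(x_{i+1})$; this amounts to the trapezoidal rule. The derivative $f(x) = 1/x$ is an assumed example, chosen because the integral curve through $P_0 = (1, 0)$ is then $\ln x$.

```python
import math

def polygonal_integral_curve(f, x0, y0, x_end, steps):
    """Vertices P_0, P_1, ... of the polygonal arc approximating the integral curve."""
    h = (x_end - x0) / steps
    xs, ys = [x0], [y0]
    for i in range(steps):
        x_left, x_right = x0 + i * h, x0 + (i + 1) * h
        # Follow the tangent at P_i up to the mid-line of the interval ...
        y_mid = ys[-1] + f(x_left) * h / 2
        # ... then the shifted tangential direction belonging to P_{i+1}.
        ys.append(y_mid + f(x_right) * h / 2)
        xs.append(x_right)
    return xs, ys

xs, ys = polygonal_integral_curve(lambda x: 1 / x, 1.0, 0.0, 3.0, 20)
print(ys[-1], math.log(3.0))
```

Halving the step width reduces the deviation from the true integral curve by roughly a factor of four.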


20.2. The indefinite integral 


The expression of the definite integral as an ordinate difference of a primitive makes it desirable 
to obtain primitives for as many functions as possible. To obtain these primitives one makes use 
of the fact that integration is the inverse of differentiation. 

Standard integrals 


Standard integrals, apart from the constant of integration, result immediately from standard formulae of the differential calculus. If it is known that the derivative of $\Phi(x)$ is a given function $f(x)$, then conversely $\Phi(x) + C = \int f(x)\,dx$ is the integral of $f(x)$. The formulae so obtained are, however, valid only where the integrand $f(x)$ as well as the integral $\Phi(x)$ are defined. For instance, the integral $\int dx/x = \ln(cx) = \ln|x| + C$ is at first only defined for positive $x$, but can then be extended to negative values of $x$ provided that the constant $c$ is given the same sign as $x$. One can then write $\ln(cx) = \ln(|c|\,|x|) = \ln|x| + \ln|c| = \ln|x| + C$. But it should be stressed again that $x = 0$ must never belong to the interval of integration.


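Table of standard integrals

$\int dx = x + C$
$\int x^n\,dx = \dfrac{x^{n+1}}{n+1} + C$; $n \neq -1$
$\int e^x\,dx = e^x + C$
$\int \dfrac{dx}{x} = \ln|x| + C$; $x \neq 0$
$\int a^x\,dx = \dfrac{a^x}{\ln a} + C = a^x \log_a e + C$; $0 < a \neq 1$
$\int \cos x\,dx = \sin x + C$
$\int \sec^2 x\,dx = \int \dfrac{dx}{\cos^2 x} = \tan x + C$; $x \neq (2k+1)\pi/2$, $k \in Z$
$\int \sin x\,dx = -\cos x + C$
$\int \operatorname{cosec}^2 x\,dx = \int \dfrac{dx}{\sin^2 x} = -\cot x + C$; $x \neq k\pi$, $k \in Z$
$\int \cosh x\,dx = \sinh x + C$
$\int \dfrac{dx}{\cosh^2 x} = \tanh x + C$
$\int \sinh x\,dx = \cosh x + C$
$\int \dfrac{dx}{\sinh^2 x} = -\coth x + C$; $x \neq 0$
$\int \dfrac{dx}{\sqrt{1 - x^2}} = \operatorname{Arcsin} x + C = -\operatorname{Arccos} x + C'$; $|x| < 1$
$\int \dfrac{dx}{1 + x^2} = \arctan x + C = -\operatorname{arccot} x + C'$
$\int \dfrac{dx}{\sqrt{1 + x^2}} = \sinh^{-1} x + C = \ln\left|x + \sqrt{1 + x^2}\right| + C'$
$\int \dfrac{dx}{\sqrt{x^2 - 1}} = \cosh^{-1} x + C = \ln\left(x + \sqrt{x^2 - 1}\right) + C'$ if $x > 1$
$\int \dfrac{dx}{\sqrt{x^2 - 1}} = -\cosh^{-1}(-x) + C = -\ln\left(-x + \sqrt{x^2 - 1}\right) + C'$ if $x < -1$, with $0 < \cosh^{-1} x < \infty$ (principal value)
$\int \dfrac{dx}{1 - x^2} = \tanh^{-1} x + C = \ln\sqrt{\dfrac{1 + x}{1 - x}} + C'$ for $|x| < 1$
$\int \dfrac{dx}{1 - x^2} = \coth^{-1} x + C = \ln\sqrt{\dfrac{x + 1}{x - 1}} + C'$ for $|x| > 1$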


The rule $\int x^n\,dx = \dfrac{x^{n+1}}{n+1} + C$ holds for any real exponent $n \neq -1$ and $x > 0$; for negative integers it is enough to take $x \neq 0$, and for positive integers the rule holds for all $x$.


Examples: i] txt. 2. [S= xd =* 4+ c=- +0 
- a ge ge See Rae AS } 
3. {J = [x dx = par = aay to Gap ti +. 




4. $\int \sqrt[3]{x}\,dx = \int x^{1/3}\,dx = \dfrac{x^{4/3}}{4/3} + C = \tfrac34 \sqrt[3]{x^4} + C = \tfrac34\,x\sqrt[3]{x} + C$.
Tie dpes | Dim de ie et 
5. [ Yxdx= | x —tiaeT +6 Gamer mg tom aa te. 


6. fran fs a =3 ¥x+C. 


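7. $\int (5x^3 + 4x^2 - 3x + 2)\,dx = 5\int x^3\,dx + 4\int x^2\,dx - 3\int x\,dx + 2\int dx = 5x^4/4 + 4x^3/3 - 3x^2/2 + 2x + C$.

8. $\int [ax^2 + 1/x - b/x^2 + 1/(1 + x^2)]\,dx = ax^3/3 + \ln|x| + b/x + \arctan x + C$.

9. $\int \dfrac{x^3 + 2x^2 - x + 3}{x}\,dx = \int (x^2 + 2x - 1 + 3/x)\,dx = x^3/3 + x^2 - x + 3\ln|x| + C$.

10. $\int_{-\pi/4}^{+\pi/4} (\cos x - \sin x + 1/\cos^2 x)\,dx = [\sin x + \cos x + \tan x]_{-\pi/4}^{+\pi/4} = \left(\tfrac12\sqrt2 + \tfrac12\sqrt2 + 1\right) - \left(-\tfrac12\sqrt2 + \tfrac12\sqrt2 - 1\right) = 2 + \sqrt2$.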


Integration by parts 


Sometimes a difficult integral can be found more easily by ‘partial integration’. It is assumed 
that the integrand can be written as the product of two functions of which one is easy to integrate. 
The rule for integration by parts is a consequence of the rule for differentiating a product: 

$$\frac{d(uv)}{dx} = v\,\frac{du}{dx} + u\,\frac{dv}{dx}.$$

Integration on both sides leads to $\int \dfrac{d(uv)}{dx}\,dx = uv = \int v\,\dfrac{du}{dx}\,dx + \int u\,\dfrac{dv}{dx}\,dx$, and so to the rule for integration by parts. 'By parts' indicates that one writes the integrand as a product $uv'$ of two parts $u$ and $v'$, where the integral of the part $v'$ is known and the new integral with the integrand $u'v$ is easier to find.

Integration by parts: $\int u\,dv = uv - \int v\,du$, or $\int uv'\,dx = uv - \int u'v\,dx$.


Example: To calculate the integral $\int x e^x\,dx$, one puts $u = x$ and $v' = e^x$, so that $u' = 1$ and $v = e^x$. The rule for the integration by parts yields

$$\int x e^x\,dx = x e^x - \int e^x\,dx = x e^x - e^x + C = (x - 1)\,e^x + C.$$


Recurrence formulae. The integral $\int x^n e^x\,dx$ cannot be found from a single integration by parts, but in this case and similar cases one can often make several successive integrations by parts which lead step-by-step to simplifications of the integral, until one reaches one of the standard integrals. Then one has found a recurrence formula.

Since $e^x$ is equal to its derivative, the factorization $u = x^n$, $v' = e^x$ for the integrand $x^n e^x$ seems promising; this gives $v = e^x$, $u' = nx^{n-1}$, $u'v = nx^{n-1} e^x$ and hence the recursion formula

$$\int x^n e^x\,dx = x^n e^x - n \int x^{n-1} e^x\,dx.$$

Example: $\int x^2 e^x\,dx = x^2 e^x - 2\int x e^x\,dx = x^2 e^x - 2\left[x e^x - \int e^x\,dx\right] = x^2 e^x - 2[x e^x - e^x + C] = x^2 e^x - 2x e^x + 2e^x - 2C = (x^2 - 2x + 2)\,e^x + C_1$, where $C_1 = -2C$.
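The recursion for $\int x^n e^x\,dx$ lends itself to a small program. The sketch below, with illustrative function names, builds the polynomial $P_n(x)$ in $\int x^n e^x\,dx = P_n(x)\,e^x + C$ by applying the formula $n$ times, starting from $\int e^x\,dx = e^x$.

```python
import math

def poly_times_exp_primitive(n):
    """Coefficients c[0..n] with  integral x^n e^x dx = (sum c[k] x^k) e^x + C."""
    coeffs = [1.0]                                   # I_0 = 1 * e^x
    for m in range(1, n + 1):
        # I_m = x^m e^x - m * I_(m-1): scale the old polynomial, append x^m.
        coeffs = [-m * c for c in coeffs] + [1.0]
    return coeffs

def primitive(n, x):
    """Value of the primitive P_n(x) e^x at x (constant of integration 0)."""
    poly = sum(c * x**k for k, c in enumerate(poly_times_exp_primitive(n)))
    return poly * math.exp(x)

print(poly_times_exp_primitive(1))   # coefficients of x - 1
print(poly_times_exp_primitive(2))   # coefficients of x^2 - 2x + 2
```

Coefficients are listed from the constant term upwards, so the case $n = 2$ reproduces the polynomial $x^2 - 2x + 2$ of the example.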


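For the integrals $\int x^n \sin x\,dx$ and $\int x^n \cos x\,dx$ the appropriate factorizations of the integrand are easily guessed, because the integrals of $\sin x$ and $\cos x$ are known: with $u = x^n$ one obtains $\int x^n \sin x\,dx = -x^n \cos x + n \int x^{n-1} \cos x\,dx$ and $\int x^n \cos x\,dx = x^n \sin x - n \int x^{n-1} \sin x\,dx$.

Examples: 1. $\int x^2 \sin x\,dx = -x^2 \cos x - \int 2x\,(-\cos x)\,dx = -x^2 \cos x + 2\left[x \sin x - \int \sin x\,dx\right] = -x^2 \cos x + 2[x \sin x + \cos x + C] = -x^2 \cos x + 2x \sin x + 2\cos x + 2C$.

2. $\int x^3 \cos x\,dx = x^3 \sin x - 3\int x^2 \sin x\,dx = x^3 \sin x - 3[-x^2 \cos x + 2x \sin x + 2\cos x + 2C] = x^3 \sin x + 3x^2 \cos x - 6x \sin x - 6\cos x - 6C$.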


For the integral $\int (\ln x)^n\,dx$ one factorizes the integrand as $1 \cdot (\ln x)^n$, that is, one puts $v' = 1$, $u = (\ln x)^n$, thus, $v = x$, $u' = n(\ln x)^{n-1} \cdot (1/x)$ and $u'v = n(\ln x)^{n-1}$. If the exponent $n$ is a natural number, then $n - 1$ partial integrations lead to the integral $\int \ln x\,dx$, whose integrand can again be written as a product $1 \cdot \ln x$, which leads to $\int 1 \cdot \ln x\,dx = x \ln x - \int x \cdot (1/x)\,dx = x \ln x - x + C$. For $n = -1$ one obtains the logarithmic integral (see Chapter 21.).

For the integral $\int x^n \ln x\,dx$ one puts $v' = x^n$ and $u = \ln x$, so that the integrand $v \cdot u' = \dfrac{x^n}{n + 1}$ no longer contains $\ln x$ and can be integrated at once; a recurrence formula is not needed here.


The integrand, for example $\sin^n x$, is written as a product $\sin x \cdot \sin^{n-1} x$; thus, one may put $u = \sin^{n-1} x$, $v' = \sin x$ to get $v = -\cos x$, $u' = (n - 1) \sin^{n-2} x \cos x$, hence $u'v = -(n - 1) \sin^{n-2} x \cos^2 x = -(n - 1) \sin^{n-2} x\,(1 - \sin^2 x) = -(n - 1) \sin^{n-2} x + (n - 1) \sin^n x$. Therefore

$$\int \sin^n x\,dx = -\sin^{n-1} x \cos x + (n - 1) \int \sin^{n-2} x\,dx - (n - 1) \int \sin^n x\,dx,$$
$$(1 + n - 1) \int \sin^n x\,dx = -\sin^{n-1} x \cos x + (n - 1) \int \sin^{n-2} x\,dx.$$

Division of each side by $n$ leads to the first recursion formula, $\int \sin^n x\,dx = -\frac1n \sin^{n-1} x \cos x + \frac{n-1}{n} \int \sin^{n-2} x\,dx$; the second, for $\cos^n x$, is obtained similarly. If $n$ is a negative integer, one uses the recurrence formula to express the integral on the right in terms of the integral on the left.


Examples: 1. $\int \sin^2 x\,dx = -(\cos x \sin x)/2 + (1/2) \int dx = -(\cos x \sin x)/2 + x/2 + C$.
2. $\int_0^{2\pi} \cos^2 x\,dx = (1/2)\,[\sin x \cos x + x]_0^{2\pi} = \pi$.
3. $\int \cos^3 x\,dx = (1/3) \sin x \cos^2 x + (2/3) \int \cos x\,dx = (1/3) \sin x \cos^2 x + (2/3) \sin x + C$.
4. $\int_0^{2\pi} \sin^3 x\,dx = -(1/3)\,[\cos x\,(\sin^2 x + 2)]_0^{2\pi} = 0$.
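Wallis' product formula. The previous recurrence formulae lead to a representation of $\pi/2$ which was found by John WALLIS (1616-1703). Since $0 < \sin x < 1$ in the interval $0 < x < \pi/2$, the inequality $\sin^{2k+1} x < \sin^{2k} x < \sin^{2k-1} x$ holds for natural numbers $k \ge 1$. On the other hand, the recurrence formulae lead to:

$$\int_0^{\pi/2} \sin^{2k} x\,dx = \frac{2k - 1}{2k} \int_0^{\pi/2} \sin^{2k-2} x\,dx = \frac{(2k - 1)(2k - 3) \cdots 1}{2k\,(2k - 2) \cdots 2} \cdot \frac{\pi}{2},$$
$$\int_0^{\pi/2} \sin^{2k+1} x\,dx = \frac{2k}{2k + 1} \int_0^{\pi/2} \sin^{2k-1} x\,dx = \frac{2k\,(2k - 2) \cdots 2}{(2k + 1)(2k - 1) \cdots 3},$$

and therefore to the inequalities:

$$\frac{2 \cdot 4 \cdots 2k}{3 \cdot 5 \cdots (2k + 1)} < \frac{1 \cdot 3 \cdots (2k - 1)}{2 \cdot 4 \cdots 2k} \cdot \frac{\pi}{2} < \frac{2 \cdot 4 \cdots (2k - 2)}{3 \cdot 5 \cdots (2k - 1)},$$

or

$$1 < (2k + 1) \left[\frac{1 \cdot 3 \cdots (2k - 1)}{2 \cdot 4 \cdots 2k}\right]^2 \frac{\pi}{2} < \frac{2k + 1}{2k} = 1 + \frac{1}{2k}.$$

Since $\lim_{k \to \infty} (1 + 1/(2k)) = 1$, it follows that

$$\frac{\pi}{2} = \lim_{k \to \infty} \left[\frac{2 \cdot 4 \cdots 2k}{1 \cdot 3 \cdots (2k - 1)}\right]^2 \frac{1}{2k + 1}.$$

For $k = 10$, this gives the (rather poor) approximation $\pi/2 \approx 1.5339$ or $\pi \approx 3.0678$.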
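How slowly the product approaches $\pi/2$ can be seen by evaluating it for a few values of $k$. The function name `wallis` is illustrative; the product is accumulated factor by factor as $(2j)^2 / ((2j - 1)(2j + 1))$, which is the same expression regrouped.

```python
import math

def wallis(k):
    """Partial Wallis product [2*4*...*2k / (1*3*...*(2k-1))]^2 / (2k + 1)."""
    value = 1.0
    for j in range(1, k + 1):
        value *= (2 * j) ** 2 / ((2 * j - 1) * (2 * j + 1))
    return value

for k in (10, 100, 1000, 10000):
    print(k, wallis(k), 2 * wallis(k), math.pi / 2 - wallis(k))
```

The last column shows the error shrinking only in proportion to $1/k$, so each further decimal digit costs about ten times as many factors.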


Integration by substitution 


An integral may become easier or simpler to find if the variable $x$ is replaced by a new variable $z$ by the substitution $x = \varphi(z)$, or if some part $\varphi(x)$ of the integrand is introduced as a new variable $z$. In all cases the connection between the differentials $dx$ and $dz$ of the given variable $x$ and the new variable $z$ must be taken into account.

Integrand as function $f[\varphi(x)]$ of a linear function $\varphi(x) = mx + c$. One substitutes $\varphi(x) = mx + c = z$ and notes that $m\,dx = dz$ or $dx = dz/m$. The substitution succeeds if $f(z)$ can be integrated.

Examples: 1. If the integrand is $(ax + b)^5$, put $ax + b = z$, $dx = dz/a$, to obtain $\int (ax + b)^5\,dx = (1/a) \int z^5\,dz = \dfrac{1}{6a}\,z^6 + C = \dfrac{1}{6a}\,(ax + b)^6 + C$.
2. The integrand $\sqrt{3x - 4}$ becomes $\sqrt{z}$ if one puts $3x - 4 = z$; then $dx = dz/3$ and for the integral one obtains $\int \sqrt{3x - 4}\,dx = \tfrac13 \int z^{1/2}\,dz = \tfrac13 \cdot \tfrac23\,z^{3/2} + C = \tfrac29\,z\sqrt{z} + C = \tfrac29\,(3x - 4)\sqrt{3x - 4} + C$.
3. The substitution $\omega t + \pi/2 = z$, $dt = dz/\omega$ leads to $\int \sin(\omega t + \pi/2)\,dt = (1/\omega) \int \sin z\,dz = -(1/\omega) \cos z + C = -(1/\omega) \cos(\omega t + \pi/2) + C$.
4. In the integrand $e^{-3x}$ one substitutes $-3x = z$, $dx = -dz/3$, to obtain $\int e^{-3x}\,dx = -\tfrac13 \int e^z\,dz = -\tfrac13 e^z + C = -\tfrac13 e^{-3x} + C$.
Integrand of the form $\varphi'(x)/\varphi(x)$. If the integrand is a quotient in which the numerator is the derivative of the denominator, one substitutes $\varphi(x) = z$. Then $\varphi'(x)\,dx = dz$ or $dx = dz/\varphi'(x)$, so that the integral becomes $\int dz/z = \ln|z| + C$.

For instance, the denominator $\varphi(x) = x^n + a$ has the derivative $\varphi'(x) = nx^{n-1}$, so that if the numerator is $x^{n-1} = \frac1n \cdot nx^{n-1}$, one may take the constant factor outside the integral, to give the integrand the form $\varphi'(x)/\varphi(x)$.


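Examples: 1. $\int \dfrac{2x - 4}{x^2 - 4x + 7}\,dx = \ln[c\,(x^2 - 4x + 7)] = \ln|x^2 - 4x + 7| + C$.

2. $\int \dfrac{x^3}{x^4 - 5}\,dx = \tfrac14 \int \dfrac{4x^3}{x^4 - 5}\,dx = \tfrac14 \ln[c\,(x^4 - 5)] = \tfrac14 \ln|x^4 - 5| + C$.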


ee one — ss ee" 2 
3. (==> dx =3 [— | Te OF = SF arctan» /2in(Qli+ x*)+C. 


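4. The integrand $1/\sqrt{x^2 + a^2}$ can be rearranged into the form $\varphi'(x)/\varphi(x)$ as follows, writing $D = \sqrt{x^2 + a^2}$:

$$\frac{1}{\sqrt{x^2 + a^2}} = \frac{1}{D} = \frac{x + D}{D\,(x + D)} = \frac{(x + D)/D}{x + D} = \frac{1 + x/D}{x + D} = \frac{1 + x/\sqrt{x^2 + a^2}}{x + \sqrt{x^2 + a^2}}.$$

Now the numerator is the derivative of the denominator, so that $\int \dfrac{dx}{\sqrt{x^2 + a^2}} = \ln\left|x + \sqrt{x^2 + a^2}\right| + C$.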


Integrals of the functions $\tan x$, $\cot x$, $\tanh x$ and $\coth x$. Each of these integrands can be written as a quotient in which the numerator is the derivative of the denominator, and can therefore be integrated by the method described. For example, $\int \tan x\,dx = -\int \dfrac{-\sin x}{\cos x}\,dx = -\ln|\cos x| + C$.

Integrals of the functions $\arctan x$, $\operatorname{arccot} x$, $\tanh^{-1} x$ and $\coth^{-1} x$. For these integrands, partial integration leads first to an integral of the form $\int \dfrac{\varphi'(x)}{\varphi(x)}\,dx$; for example, if $\arctan x$ is written as a product $1 \cdot \arctan x$, then $u = \arctan x$, $v' = 1$, so that $v = x$, $u' = \dfrac{1}{1 + x^2}$, $vu' = \tfrac12 \cdot \dfrac{2x}{1 + x^2}$ and $\int \arctan x\,dx = x \arctan x - \tfrac12 \int \dfrac{2x}{1 + x^2}\,dx = x \arctan x - \tfrac12 \ln(1 + x^2) + C$. The other three functions can be integrated similarly.


Integrand of the form $f[\varphi(x)] \cdot \varphi'(x)$, with $\varphi'(x) \neq 0$. If the integrand can be written as the product of a function $\varphi(x)$ and its derivative $\varphi'(x)$, then the substitution $\varphi(x) = z$, $\varphi'(x)\,dx = dz$ also leads to an easily integrated function of $z$ and to the result $\int z\,dz = z^2/2 + C$. More generally, if the integrand has the form $f[\varphi(x)] \cdot \varphi'(x)$, the substitution $\varphi(x) = z$, $dx = dz/\varphi'(x)$, leads to an integrand $f(z)$, so that if $f(z)$ has a known integral (for example, $f(z) = z^n$), the integral of $f[\varphi(x)] \cdot \varphi'(x)$ can be found.


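Examples: 1. $\int \sin x \cos x\,dx = \int \sin x\,d(\sin x) = \int z\,dz = z^2/2 + C = \tfrac12 \sin^2 x + C$.
2. $\int \dfrac{\ln x}{x}\,dx = \int \ln x\,d(\ln x) = \int z\,dz = [\ln x]^2/2 + C$.
3. $\int \dfrac{\arctan^5 x}{1 + x^2}\,dx = \int \arctan^5 x\,d(\arctan x) = \int z^5\,dz = z^6/6 + C = \tfrac16 \arctan^6 x + C$.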


4. $\int (1 - x^4)^2 x^3\,dx = -\tfrac14 \int (1 - x^4)^2\,(-4x^3\,dx) = -\tfrac14 \int (1 - x^4)^2\,d(1 - x^4) = -\tfrac14 \int z^2\,dz = -z^3/12 + C = -(1 - x^4)^3/12 + C$.


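5. $\int \dfrac{x\,dx}{(x^2 + 1)\sqrt{x^2 + 1}} = \int \dfrac{x\,dx}{\sqrt{(x^2 + 1)^3}} = \tfrac12 \int \dfrac{d(x^2 + 1)}{(x^2 + 1)^{3/2}} = \tfrac12 \int z^{-3/2}\,dz = -\tfrac12 \cdot 2z^{-1/2} + C = -1/\sqrt{x^2 + 1} + C$.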


Integrals of the functions $\arcsin x$, $\arccos x$, $\sinh^{-1} x$ and $\cosh^{-1} x$. For these integrals, integration by parts leads to integrals of the form $\int \varphi'(x)/\sqrt{\varphi(x)}\,dx$, which have the value $2\sqrt{\varphi(x)}$, apart from the constant of integration. For instance, writing $\arcsin x$ as a product $1 \cdot \arcsin x$, that is, taking $u = \arcsin x$, $v' = 1$, $v = x$, $u' = \dfrac{1}{\sqrt{1 - x^2}}$, $vu' = \dfrac{x}{\sqrt{1 - x^2}} = -\tfrac12 \cdot \dfrac{-2x}{\sqrt{1 - x^2}}$, integration by parts gives

$$\int \arcsin x\,dx = x \arcsin x + \tfrac12 \int \frac{-2x\,dx}{\sqrt{1 - x^2}} = x \arcsin x + \tfrac12 \int \frac{dz}{\sqrt{z}} = x \arcsin x + \sqrt{z} + C = x \arcsin x + \sqrt{1 - x^2} + C.$$

The other three functions can be treated similarly.


The substitution of a new variable by $x = \varphi(z)$. This substitution converts the integrand $f(x)$ into the composite function $f[\varphi(z)]$; the differentials are related by $dx = \varphi'(z)\,dz$, and the integral takes the form $\int f(x)\,dx = \int f[\varphi(z)]\,\varphi'(z)\,dz$. For a definite integral one must, however, also convert the limits of the x-interval of integration into the corresponding limits of z-integration by means of the inverse function $z = \psi(x)$. The function $\varphi(z)$ must have an inverse function and must be differentiable, with $\varphi'(z) \neq 0$. The substitution $x = |a|\,z$, $dx = |a|\,dz$ or $x = (|a|/b)\,z$, $dx = (|a|/b)\,dz$ lead to the following integrals:


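Examples: 1. To remove the square root $\sqrt{x}$ in the following integral one puts $x = \varphi(z) = z^2$, $dx = 2z\,dz$. Integration by parts leads to evaluation of the resulting integral in $z$; using the inverse function $z = \psi(x) = \sqrt{x}$ to $x = \varphi(z)$, one obtains

$$\int e^{\sqrt{x}}\,dx = \int e^z \cdot 2z\,dz = 2 \int z e^z\,dz = 2e^z (z - 1) + C = 2e^{\sqrt{x}} (\sqrt{x} - 1) + C.$$

2. $\int_1^4 e^{\sqrt{x}}\,dx = 2\,[e^z (z - 1)]_1^2 = 2e^2$; the new limits of integration are obtained from $z = +\sqrt{x}$ for $x_1 = 1$ and $x_2 = 4$.

3. $\int_{\pi/2}^{\pi} \sin 2x\,dx = (1/2) \int_{\pi}^{2\pi} \sin z\,dz = -(1/2)\,[\cos z]_{\pi}^{2\pi} = -(1/2)(1 + 1) = -1$. Substitution: $2x = z$, $dx = dz/2$; $x_1 = \pi/2 \Rightarrow z_1 = \pi$, $x_2 = \pi \Rightarrow z_2 = 2\pi$.

4. $\int_0^{\sqrt3} \dfrac{5x}{4 - x^2}\,dx = -(5/2) \int_4^1 \dfrac{dz}{z} = -(5/2)\,[\ln z]_4^1 = -(5/2)(\ln 1 - \ln 4) = (5/2) \ln 4 = 5 \ln 2$. Substitution: $4 - x^2 = z$, $-2x\,dx = dz$; $x_1 = 0 \Rightarrow z_1 = 4$, $x_2 = \sqrt3 \Rightarrow z_2 = 1$.

5. $\int_{\sqrt{e}}^{e} \dfrac{dx}{x \sqrt{\ln x - \ln^2 x}} = \int_{1/2}^{1} \dfrac{dz}{\sqrt{z(1 - z)}} = \int_{\pi/4}^{\pi/2} \dfrac{2 \sin u \cos u}{\sin u \cos u}\,du = 2\,[u]_{\pi/4}^{\pi/2} = 2(\pi/2 - \pi/4) = \pi/2$. Substitution: $z = \ln x$, $dx = x\,dz$; $x_1 = \sqrt{e} \Rightarrow z_1 = 1/2$, $x_2 = e \Rightarrow z_2 = 1$. Then $z = \sin^2 u$, $dz = 2 \sin u \cos u\,du$; $z_1 = 1/2 \Rightarrow u_1 = \pi/4$, $z_2 = 1 \Rightarrow u_2 = \pi/2$.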

aj/2 
“a 2 sin u COS u 
SiN U COS 
n/d 


Classes of elementarily integrable functions 


Integrable functions $f(x)$, such as $x^n$, $\sin x$, $e^x$, whose indefinite integrals can be expressed in closed form in terms of elementary functions are called elementarily integrable (or integrable in elementary terms). The following section describes the most important types of elementarily integrable functions together with methods of finding their indefinite integrals.


Rational functions $R(x)$, partial fractions. Every rational function is elementarily integrable: Since every power of $x$ has an elementary integral, so does any polynomial. Rational functions that are not polynomials can be written as a sum of partial fractions (see Chapter 5.) and can then be integrated because for all natural numbers $k > 1$ each of the following fractions can be integrated:

$$\frac{A}{x - x_1}, \quad \frac{A}{(x - x_1)^k}, \quad \frac{Ax + B}{x^2 + px + q}, \quad \frac{Ax + B}{(x^2 + px + q)^k}, \quad \text{with } p^2 < 4q \text{ and } A \neq 0.$$

The integrals of the first two expressions are standard integrals; the numerators of the last two can always be written as a sum $Ax + B = (2x + p)\,A/2 + (B - Ap/2)$. The first term leads to an integral of the form $\int \dfrac{\varphi'(x)}{[\varphi(x)]^k}\,dx$, which is elementary; the second term is a constant which can be taken outside the integral. It remains only to show that the integral $\int \dfrac{dx}{(x^2 + px + q)^k}$ can be found elementarily for $k = 1, 2, \ldots$


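For $k = 1$, the denominator $x^2 + px + q$ can be written as $(x + p/2)^2 + (q - p^2/4)$ by completing the square. The substitution $x + p/2 = \sqrt{q - p^2/4}\;u$, $dx = \sqrt{q - p^2/4}\;du$ gives

$$\int \frac{dx}{x^2 + px + q} = \frac{1}{\sqrt{q - p^2/4}} \int \frac{du}{u^2 + 1} = \frac{1}{\sqrt{q - p^2/4}} \arctan u + C = \frac{1}{\sqrt{q - p^2/4}} \arctan \frac{x + p/2}{\sqrt{q - p^2/4}} + C.$$

For $k > 1$ one can get a recurrence formula of the type

$$\int \frac{dx}{(x^2 + px + q)^k} = \frac{c_1 x + c_2}{(x^2 + px + q)^{k-1}} + c_3 \int \frac{dx}{(x^2 + px + q)^{k-1}}$$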


by finding the undetermined coefficients $c_1$, $c_2$, $c_3$ as follows: one differentiates both sides, then clears fractions by multiplying through by $(x^2 + px + q)^k$ and so obtains an identity

$$1 = -(k - 1)(c_1 x + c_2)(2x + p) + (c_1 + c_3)(x^2 + px + q);$$

now one equates coefficients:

Coefficients of $x^2$: $-2c_1 (k - 1) + c_1 + c_3 = 0$.

Coefficients of $x$: $-2c_2 (k - 1) - c_1 p\,(k - 1) + (c_1 + c_3)\,p = 0$.

Constant terms: $-c_2 p\,(k - 1) + (c_1 + c_3)\,q = 1$.

This gives

$$c_1 = \frac{2}{(k - 1)(4q - p^2)}, \quad c_2 = \frac{p}{(k - 1)(4q - p^2)}, \quad c_3 = \frac{2\,(2k - 3)}{(k - 1)(4q - p^2)}.$$

These results are often also displayed for the denominator $(ax^2 + bx + c)^k$.
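The recurrence can be turned into a short recursive routine and compared with numerical quadrature. This is a sketch only: the function names are illustrative, $p^2 < 4q$ is assumed throughout, and Simpson's rule serves merely as an independent check.

```python
import math

def recurrence_coefficients(p, q, k):
    """Return (c1, c2, c3) of the recurrence for k > 1, assuming p*p < 4*q."""
    d = (k - 1) * (4 * q - p * p)
    return 2 / d, p / d, 2 * (2 * k - 3) / d

def primitive(p, q, k, x):
    """A primitive of 1 / (x^2 + p x + q)^k, built by the recurrence."""
    if k == 1:
        root = math.sqrt(q - p * p / 4)
        return math.atan((x + p / 2) / root) / root
    c1, c2, c3 = recurrence_coefficients(p, q, k)
    quadratic = x * x + p * x + q
    return (c1 * x + c2) / quadratic ** (k - 1) + c3 * primitive(p, q, k - 1, x)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    s += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return s * h / 3

p, q, k = 1.0, 1.0, 3
by_recurrence = primitive(p, q, k, 2.0) - primitive(p, q, k, 0.0)
by_quadrature = simpson(lambda x: 1 / (x * x + p * x + q) ** k, 0.0, 2.0)
print(by_recurrence, by_quadrature)
```

For $p = 0$, $q = 1$, $k = 2$ the coefficients are $(1/2, 0, 1/2)$, which gives the familiar result $\int dx/(1 + x^2)^2 = x/(2(1 + x^2)) + \tfrac12 \arctan x + C$.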


Examples: (the decomposition into partial fractions is assumed to have been carried out) 


1. $\int \left[\dfrac{2}{x + 1} - \dfrac{3}{x - 2} + \dfrac{5}{x - 5}\right] dx = 2 \ln|x + 1| - 3 \ln|x - 2| + 5 \ln|x - 5| + C = \ln\left|\dfrac{(x + 1)^2 (x - 5)^5}{(x - 2)^3}\right| + C$.


2. | eee @ a l[s = hue berg ar +4| aaa DS Re 


Staining te ~ ined) 4-Cm oS! "we a\(= 3) +¢ 
3 (>See 3 aa cere he 
edi ack Kuda oes —-¢ 
= In |\(x — 3) (x? — 2x +5)|-+ arctan = bie: 
0 | PRES bye =f pe fst pees J gotta 
= —2in fx +1] + In fx? + x41] 5-3 arctan = — =A — 


ieee Ss ee te oe Se 2e+1 | 2x5 


Integrals of functions $R[x, \sqrt[n]{ax + b}]$ or $R[x, \sqrt[n]{(ax + b)/(cx + d)}]$. The integrands are rational functions of $x$ and of $\sqrt[n]{ax + b}$ or $\sqrt[n]{(ax + b)/(cx + d)}$, that is, they can be obtained from $x$ and $\sqrt[n]{ax + b}$ or $\sqrt[n]{(ax + b)/(cx + d)}$ by finitely many additions, subtractions, multiplications and divisions. The substitutions $z = \sqrt[n]{ax + b}$ for which $x = (z^n - b)/a$, or $z = \sqrt[n]{(ax + b)/(cx + d)}$ for which $x = (b - dz^n)/(cz^n - a)$, lead to rational functions of $z$ which, as has just been shown, can be integrated elementarily. For special forms $R$ the integration can be simplified by using particular devices.


Integrals of functions $R[x, \sqrt{ax^2 + bx + c}]$. Various substitutions, depending on the nature of the coefficients $a$, $b$, and $c$, lead to a rational function in a new variable $z$.

1. If $a$ and $ax^2 + bx + c$ are positive, and $b^2 \neq 4ac$, one makes the substitution $\sqrt{ax^2 + bx + c} = x\sqrt{a} + z$.

2. If $c$ is non-negative, one puts $\sqrt{ax^2 + bx + c} = xz + \sqrt{c}$.

3. If $a$ is negative, and if the equation $ax^2 + bx + c = 0$ has two distinct real roots $x_1$ and $x_2$, one makes the substitution $\sqrt{ax^2 + bx + c} = z\,(x - x_1)$.


There exist tables of integrals for special functions R. 


Another method of integration consists in completing the square under the square root,
$$ax^2 + bx + c = a\left[\left(x + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a^2}\right],$$
so as to bring it into one of the forms displayed in the adjacent table, and then using the displayed substitutions to obtain rational expressions involving trigonometric or hyperbolic functions.

Examples: 1. The substitution $x = \sin z$, $dx = \cos z\,dz$, $z = \arcsin x$ yields $\int \sqrt{1 - x^2}\,dx = \int \sqrt{1 - \sin^2 z}\,\cos z\,dz = \int \cos^2 z\,dz = (1/2)(z + \sin z \cos z) + C = (1/2)\left(\arcsin x + x\sqrt{1 - x^2}\right) + C$.
2. From $x = r \sin z$, $dx = r \cos z\,dz$ one obtains similarly $\int \sqrt{r^2 - x^2}\,dx = (1/2)\left(r^2 \arcsin(x/r) + x\sqrt{r^2 - x^2}\right) + C$.


Integrals of functions $R(\sin x, \cos x, \tan x, \cot x)$. Transformation into a rational function of $z$ can be achieved by the substitution $z = \tan(x/2)$:

$$\sin x = 2 \sin(x/2) \cos(x/2) = \frac{2 \tan(x/2) \cos^2(x/2)}{\sin^2(x/2) + \cos^2(x/2)} = \frac{2z}{1 + z^2},$$
$$\cos x = \frac{\cos^2(x/2) - \sin^2(x/2)}{\cos^2(x/2) + \sin^2(x/2)} = \frac{1 - z^2}{1 + z^2}, \quad \tan x = \frac{2 \tan(x/2)}{1 - \tan^2(x/2)} = \frac{2z}{1 - z^2}, \quad \cot x = \frac{1 - z^2}{2z},$$
$$\frac{dz}{dx} = \frac{1}{2 \cos^2(x/2)} = \frac{1 + z^2}{2}, \quad dx = \frac{2\,dz}{1 + z^2}.$$


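Examples: 1. $\int \dfrac{dx}{\sin x} = \int \dfrac{1 + z^2}{2z} \cdot \dfrac{2\,dz}{1 + z^2} = \int \dfrac{dz}{z} = \ln(cz) = \ln(c \cdot \tan(x/2))$.

2. $\int \dfrac{1 - \sin x}{\sin x\,(1 - \cos x)}\,dx = \int \dfrac{[1 - 2z/(1 + z^2)] \cdot [2/(1 + z^2)]}{[2z/(1 + z^2)] \cdot [1 - (1 - z^2)/(1 + z^2)]}\,dz = (1/2) \int \dfrac{1 + z^2 - 2z}{z^3}\,dz = (1/2) \int (1/z - 2/z^2 + 1/z^3)\,dz = (1/2)\left(\ln|z| + 2/z - 1/(2z^2)\right) + C = (1/2)\left[\ln|\tan(x/2)| + 2 \cot(x/2) - (1/2) \cot^2(x/2)\right] + C$.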
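The half-angle formulae and the first example can be verified numerically. The sketch below checks the identities at an arbitrarily chosen point and compares $\int dx/\sin x$ over $[\pi/3, \pi/2]$, an interval picked here for illustration, with the difference of $\ln \tan(x/2)$ at the end points.

```python
import math

def via_half_angle(x):
    """Return sin x, cos x and dx/dz expressed through z = tan(x/2)."""
    z = math.tan(x / 2)
    return 2 * z / (1 + z * z), (1 - z * z) / (1 + z * z), 2 / (1 + z * z)

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, n, 2))
    s += 2 * sum(f(a + i * h) for i in range(2, n, 2))
    return s * h / 3

x = 1.1
sin_x, cos_x, dx_dz = via_half_angle(x)
print(sin_x - math.sin(x), cos_x - math.cos(x))

a, b = math.pi / 3, math.pi / 2
numeric = simpson(lambda t: 1 / math.sin(t), a, b)
closed = math.log(math.tan(b / 2)) - math.log(math.tan(a / 2))
print(numeric, closed)
```

Both printed differences in the first line should vanish up to rounding, and the two values in the second line should agree with $\tfrac12 \ln 3$.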


Integrals of functions $R(\sinh x, \cosh x, \tanh x, \coth x)$. According to the definitions of the hyperbolic functions, these integrals may be converted into integrals of rational functions by means of the substitution $e^x = t$, for example, $\sinh x = [t - 1/t]/2$. By analogy with the trigonometric case the substitution $z = \tanh(x/2)$ is also successful:

$$\sinh x = 2 \sinh(x/2) \cosh(x/2) = \frac{2 \tanh(x/2) \cosh^2(x/2)}{\cosh^2(x/2) - \sinh^2(x/2)} = \frac{2z}{1 - z^2},$$
$$\cosh x = \frac{\sinh^2(x/2) + \cosh^2(x/2)}{\cosh^2(x/2) - \sinh^2(x/2)} = \frac{1 + z^2}{1 - z^2}, \quad \tanh x = \frac{2 \tanh(x/2)}{1 + \tanh^2(x/2)} = \frac{2z}{1 + z^2}, \quad \coth x = \frac{1 + z^2}{2z},$$
$$\frac{dz}{dx} = \frac{1}{2 \cosh^2(x/2)} = \frac{\cosh^2(x/2) - \sinh^2(x/2)}{2 \cosh^2(x/2)} = \frac{1 - z^2}{2}, \quad dx = \frac{2\,dz}{1 - z^2}.$$


Binomial integrals $\int x^m (a + bx^n)^p\,dx$. Here the coefficients $a$ and $b$ are real numbers and the exponents $m$, $n$ and $p$ are rational numbers. A theorem of P. L. CHEBYSHEV (1821-1894) states that these integrals can be expressed as elementary functions when at least one of the numbers $p$, $(m + 1)/n$, or $(m + 1)/n + p$ is an integer. If $p$ is an integer, the integrand is a sum of powers with rational exponents which can be integrated. If $(m + 1)/n$ is an integer and $p = s/r$, one puts $z = \sqrt[r]{a + bx^n}$; if $(m + 1)/n + p$ is an integer, one puts $z = \sqrt[r]{(a + bx^n)/x^n}$.


Integrals that cannot be expressed in terms of elementary functions 


The calculation of the length of an elliptical arc, of the period of oscillation of a circular pendulum, 
and of other problems lead to elliptic integrals. These are integrals whose integrand contains the 
square root of a cubic or quartic polynomial with no repeated root. 


Joseph LIOUVILLE (1809-1882) proved that they belong to the class of those integrals that cannot be expressed in closed form in terms of elementary functions. There are other integrals of this type with comparatively simple integrands, such as

$$\frac{1}{\sqrt{\cos\alpha - \cos x}}, \qquad \frac{\sin x}{x}.$$

This does not mean that these integrals do not exist: as indefinite integrals of continuous functions they are, as has been shown, differentiable functions of the upper limit of integration. On the contrary, integrals that cannot be expressed by elementary functions are accepted into mathematics as new, higher, non-elementary functions. They are often treated by first expanding the integrand as an infinite series, which is then integrated term by term (see Chapter 21.).

If the infinite series $\sum_{n=0}^{\infty} f_n(x)$ converges uniformly on the interval $a \le x \le b$ and if each term $f_n(x)$ is integrable, then the series obtained by integrating termwise over $[a, b]$ also converges, and

$$\int_a^b \sum_{n=0}^{\infty} f_n(x)\,dx = \sum_{n=0}^{\infty} \int_a^b f_n(x)\,dx.$$
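As an illustration of termwise integration, the integrand $\sin x / x$ has the power series $\sum_k (-1)^k x^{2k}/(2k+1)!$, so its integral from 0 to $x$ is $\sum_k (-1)^k x^{2k+1}/[(2k+1)(2k+1)!]$. The sketch below compares a partial sum of this series with a direct midpoint-rule estimate; the function names and the numbers of terms and subintervals are assumptions made for the example.

```python
import math


def sine_integral_series(x, terms=20):
    """Partial sum of the termwise-integrated series for the
    integral of sin(t)/t over [0, x]."""
    return sum(
        (-1) ** k * x ** (2 * k + 1) / ((2 * k + 1) * math.factorial(2 * k + 1))
        for k in range(terms)
    )


def sine_integral_midpoint(x, n=4000):
    """Midpoint-rule estimate of the same integral; the midpoints
    never coincide with t = 0, so no special case is needed."""
    h = x / n
    return h * sum(math.sin((k + 0.5) * h) / ((k + 0.5) * h) for k in range(n))


print(sine_integral_series(1.0))    # value from the series
print(sine_integral_midpoint(1.0))  # value from direct numerical integration
```

Both routines should agree to several decimal places, which is what the theorem on uniformly convergent series leads one to expect.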


20.3. Integration of functions of several variables 


Since the definite integral is particularly useful for calculating areas of plane regions, it is natural 
to look for a generalization to facilitate the calculations of volumes of spatial regions. If a bounded 
continuous function $z = f(x_1, x_2, \ldots, x_n)$ is defined on a measurable bounded region $G$ in $n$-dimensional space, one divides $G$ up into a finite number of measurable subsets and forms, just as in the definition of a simple definite integral, upper and lower sums involving the volumes of these subsets and the maxima and minima of $f(x_1, x_2, \ldots, x_n)$ in each of the subsets. If these sums approach
the same limit as the subdivision is refined, this limit is called the n-fold volume integral of f over G. 
The two-fold volume integral, called the double integral, will be discussed in greater detail; it can 
be used to calculate the volume of solid bodies that are bounded by curved surfaces. However, the 
range of integration of an n-fold integral can be restricted to a manifold of lower dimension. For 
example, one speaks of a line integral for n = 3 when this manifold is a 1-dimensional curve, or a 
surface integral when it is a 2-dimensional surface. 


Two-dimensional integrals 


Double integral. The definite integral was defined as a limit of sums in which each term is the product of two factors, the lengths $\Delta x_i$ of subintervals and the ordinates $f(\xi_i)$ at a point $\xi_i$ of the subinterval, the number $n$ of subintervals tending to infinity and the length of the longest subinterval tending to zero. The interval $[a, b]$ of integration on the x-axis is now replaced by a plane region $G$ on which a function $z = f(x, y)$ is defined, and $G$ is divided into $n$ subregions $\Delta G_i$, $i = 1, 2, \ldots, n$. To simplify the notation, one writes $\Delta G_i$ for the subregion and also for its area.

Suppose that the function is continuous and bounded in the region $G$. Then one can form lower sums with the infimum $m_i$ in $\Delta G_i$, and upper sums with the supremum $M_i$ in $\Delta G_i$. If the subdivision of $G$ is refined, then as $n \to \infty$ and $\Delta G_i \to 0$, the sequence of lower sums tends to the same limit as the sequence of upper sums, and any sequence of intermediate sums $\sum_{i=1}^{n} f(\xi_i, \eta_i)\,\Delta G_i$ tends to the same limit whatever intermediate point $(\xi_i, \eta_i)$ is chosen in $\Delta G_i$. The integral of the function $z = f(x, y)$ over the region $G$ is defined to be this common limit and is called a double integral, because there are two variables of integration.


The existence of such a double integral can also be guaranteed if the function z = f(x, y) is bounded 
and piecewise continuous in G. If the double integral exists, the function is said to be integrable 
over G. 

Geometrical interpretation of the double integral. The simple definite integral may be regarded 
as the area of a plane region below a curve. Similarly, the double integral of a continuous function 
of two variables may be interpreted as the volume below a surface z = f(x, y) (Fig.) provided that 
$z = f(x, y)$ takes only positive values in $G$. The $\Delta G_i$ are elements of area in the x, y-plane which, after the limiting process, are denoted by $dx\,dy$ in Cartesian coordinates and by $r\,dr\,d\varphi$ in polar coordinates. Every product $m_i\,\Delta G_i$ is the volume of a cylinder with base area $\Delta G_i$ and height $m_i$, and likewise for $M_i\,\Delta G_i$ and the similar cylinder of height $M_i$. For the volume $V$ below the surface $z = f(x, y)$ one therefore has

$$\sum_{i=1}^{n} m_i\,\Delta G_i \le V \le \sum_{i=1}^{n} M_i\,\Delta G_i.$$

Refinement of the subdivision leads to a monotonically increasing sequence of lower sums and a monotonically decreasing sequence of upper sums; both sequences have the same limit because the sequence of differences of corresponding upper and lower sums tends to zero.


Calculation of the double integral. A double integral may be calculated by making two successive integrations over each variable in turn. Suppose that the region $G$ of integration has a simple boundary which meets the rectangle $a_1 \le x \le a_2$, $b_1 \le y \le b_2$ at the points $A_1, A_2, B_1, B_2$ (Fig.). The points $A_1, A_2$ separate the boundary curve of $G$ into two arcs: $A_1B_1A_2$, which is the graph of a function $y = y_1(x)$, and $A_1B_2A_2$, the graph of $y = y_2(x)$. Similarly $B_1, B_2$ separate the boundary curve into $B_1A_1B_2$ and $B_1A_2B_2$, given by $x = x_1(y)$ and $x = x_2(y)$.

For fixed $x = \xi_i$, $y_1(\xi_i)$ and $y_2(\xi_i)$ are end-points of an interval $y_1(\xi_i) \le y \le y_2(\xi_i)$ over which the function $f(\xi_i, y)$ of the single variable $y$ must be integrated;

$$\varphi(\xi_i) = \int_{y_1(\xi_i)}^{y_2(\xi_i)} f(\xi_i, y)\,dy$$

is, for fixed $x = \xi_i$, a constant which depends on $x$, that is, a function $\varphi(x)$ of $x$ on the interval $a_1 \le x \le a_2$. With suitable assumptions about the boundary curve of $G$, $\varphi(x)$ is a continuous function of $x$ and is therefore integrable over the interval $[a_1, a_2]$. Similarly

$$\psi(\eta_i) = \int_{x_1(\eta_i)}^{x_2(\eta_i)} f(x, \eta_i)\,dx$$

is, for fixed $y = \eta_i$, a constant which depends on $y$ and, as a continuous function $\psi(y)$ on the interval $b_1 \le y \le b_2$, is integrable over this interval.

It can be shown that both repeated integrations lead to the same value, which is also equal to the value of the double integral; this agrees with the geometrical idea that the $\varphi(x)$ are areas of plane sections parallel to the y, z-plane, and the $\psi(y)$ areas of the plane sections parallel to the x, z-plane, of the same solid body lying below the surface $z = f(x, y)$ and above the plane region $G$.

20.3-1 Volume under the surface z = f(x, y) above G

20.3-2 Decomposition of the boundary of the region G of integration


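For a function $\Phi(r, \varphi)$ given in polar coordinates $x = r\cos\varphi$, $y = r\sin\varphi$, the element of area $dG$ takes the form $dG = r\,dr\,d\varphi$, as follows from calculating the Jacobian (see Transformation of multiple integrals):

$$\begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial y}{\partial r} \\[2mm] \dfrac{\partial x}{\partial \varphi} & \dfrac{\partial y}{\partial \varphi} \end{vmatrix} = \begin{vmatrix} \cos\varphi & \sin\varphi \\ -r\sin\varphi & +r\cos\varphi \end{vmatrix} = r.$$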


Example 1: In the double integral $\iint_G (x + y)\,dG$, let $G$ be the region between the lines $x = 0$, $y = 1$ and $x + y = 3$ (Fig.). For fixed $x$, the y-integration goes from the constant limit $y_1 = 1$ up to the variable limit $y_2 = 3 - x$:

$$\varphi(x) = \int_1^{3-x} (x + y)\,dy = \left[xy + y^2/2\right]_1^{3-x} = x(3 - x) + (3 - x)^2/2 - (x + 1/2) = 4 - x - x^2/2.$$

This function $\varphi(x)$ must now be integrated in the x-direction; the limits of integration are $x = a_1 = 0$ and $x = a_2 = 2$, and the value of the integral is

$$\int_0^2 (4 - x - x^2/2)\,dx = \left[4x - x^2/2 - x^3/6\right]_0^2 = 8 - 2 - 4/3 = 14/3; \qquad \int_0^2\!\int_1^{3-x} (x + y)\,dy\,dx = 14/3.$$
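The repeated integration used in this example can be imitated numerically. The following minimal sketch replaces both integrations by midpoint sums; the function name `iterated_integral` and the number of subintervals are assumptions made for illustration.

```python
def iterated_integral(f, a1, a2, y_lo, y_hi, n=400):
    """Midpoint approximation of the repeated integral of f(x, y):
    inner integration over y from y_lo(x) to y_hi(x),
    outer integration over x from a1 to a2."""
    hx = (a2 - a1) / n
    total = 0.0
    for i in range(n):
        x = a1 + (i + 0.5) * hx
        lo, hi = y_lo(x), y_hi(x)
        hy = (hi - lo) / n
        # inner sum approximates phi(x), the area of the plane section at x
        phi = hy * sum(f(x, lo + (j + 0.5) * hy) for j in range(n))
        total += phi * hx
    return total


# Region between x = 0, y = 1 and x + y = 3; integrand x + y
value = iterated_integral(
    lambda x, y: x + y, 0.0, 2.0, lambda x: 1.0, lambda x: 3.0 - x
)
print(value, 14 / 3)
```

The printed values should agree closely, since the inner midpoint sum is exact for an integrand that is linear in $y$.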


20.3-3 Region G between the lines x = 0, y = 1 and x + y = 3

20.3-4 Region of integration G between the curves (x - 1)² = 2y and y = 2


Example 2: In the double integral $\iint_G xy\,dG$, the region $G$ of integration is enclosed by the curves $(x - 1)^2 = 2y$ and $y = 2$, which meet at the points $P_1(-1, 2)$ and $P_2(3, 2)$. The boundary of $G$ is therefore given by the functions $y_1 = 2$, $y_2 = (x - 1)^2/2$ or $x = 1 \pm \sqrt{2y}$ (Fig.). The calculation is simpler if the x-integration is performed first:

$$\psi(y) = \int_{1-\sqrt{2y}}^{1+\sqrt{2y}} xy\,dx = \left[x^2y/2\right]_{1-\sqrt{2y}}^{1+\sqrt{2y}} = \tfrac{1}{2}\,y\left[(1 + \sqrt{2y})^2 - (1 - \sqrt{2y})^2\right] = 2\sqrt{2}\,\sqrt{y^3},$$

$$2\sqrt{2}\int_0^2 \sqrt{y^3}\,dy = 2\sqrt{2}\left[\tfrac{2}{5}\sqrt{y^5}\right]_0^2 = 32/5; \qquad \int_0^2\!\int_{1-\sqrt{2y}}^{1+\sqrt{2y}} xy\,dx\,dy = 32/5.$$

The calculation for the opposite order of integrations is

$$\varphi(x) = \int_{(x-1)^2/2}^{2} xy\,dy = \left[xy^2/2\right]_{(x-1)^2/2}^{2} = 2x - (x - 1)^4\,x/8 = -x^5/8 + x^4/2 - 3x^3/4 + x^2/2 + 15x/8,$$

$$\int_{-1}^{3} \varphi(x)\,dx = \left[-x^6/48 + x^5/10 - 3x^4/16 + x^3/6 + 15x^2/16\right]_{-1}^{3} = 32/5.$$

Example 3: A vertical cylinder is erected above the ellipse $x^2/a^2 + y^2/b^2 = 1$ in the x, y-plane and is cut off obliquely by the plane $z = f(x, y) = mx + ny + c$, where $c$ is so large that the plane $z = f(x, y)$ cuts the x, y-plane in a line outside the ellipse (Fig.). The volume of this truncated cylinder is given by the double integral $\iint_G (mx + ny + c)\,dG$, where the region $G$ is bounded by the ellipse $y = \pm(b/a)\sqrt{a^2 - x^2}$. Then

$$V = \int_{-a}^{+a}\left[\int_{-(b/a)\sqrt{a^2 - x^2}}^{+(b/a)\sqrt{a^2 - x^2}} (mx + ny + c)\,dy\right] dx = \int_{-a}^{+a}\left[mxy + ny^2/2 + cy\right]_{-(b/a)\sqrt{a^2 - x^2}}^{+(b/a)\sqrt{a^2 - x^2}}\,dx$$

$$= \int_{-a}^{+a} 2(b/a)(mx + c)\sqrt{a^2 - x^2}\,dx = 2(b/a)\left[m\int_{-a}^{+a} x\sqrt{a^2 - x^2}\,dx + c\int_{-a}^{+a}\sqrt{a^2 - x^2}\,dx\right].$$

The first integral with the factor $m$ outside is zero (one may, for instance, use the substitution $a^2 - x^2 = z$, $-2x\,dx = dz$), and the substitution $x = a\sin z$, $dx = a\cos z\,dz$ leads to the value $a^2c\pi/2$ for the second term. Hence the volume $V$ has the value $V = abc\pi$.

20.3-5 Obliquely truncated elliptic cylinder


Multiple integrals. Just as integration of functions of two variables leads to double integrals, so the integration of functions of three or more variables leads to triple or multiple integrals. If one considers a function of three variables, defined in a bounded three-dimensional region $R$, and subdivides $R$ into parts $\Delta R_i$, one can again form lower sums $\sum_{i=1}^{n} m_i\,\Delta R_i$, upper sums $\sum_{i=1}^{n} M_i\,\Delta R_i$, and intermediate sums $\sum_{i=1}^{n} f(\xi_i, \eta_i, \zeta_i)\,\Delta R_i$. Here $m_i$ is the infimum and $M_i$ the supremum of the function $f$ in the subregion $\Delta R_i$, and $f(\xi_i, \eta_i, \zeta_i)$ is a function value at some point in $\Delta R_i$. If the sequences of lower and upper sums tend to a common limiting value as $n \to \infty$ and $\Delta R_i \to 0$, then so do the intermediate sums, and the common limit is defined to be the triple integral of the function $f(x, y, z)$ over the region $R$.


Any function f that is bounded and continuous in R is integrable in this sense. Whereas the region 
of integration can still be thought of geometrically as a region in space, a geometrical interpretation 
of the integral is no longer possible; in the contexts of mechanics it could be interpreted as the total 
mass of the region if f(x, y, z) were the density at (x, y, z) in the region R. For integrals of functions 
of more than three variables, defined analogously, even the region of integration no longer has a 
direct geometrical meaning. Just like double integrals, so multiple integrals may be calculated, under 
suitable assumptions about the region of integration, by the appropriate number of successive 
integrations over each of the variables; the limits of integration depend on the nature of the boundary of $R$.

Transformation of multiple integrals. In many cases it is convenient not to use rectangular (Cartesian) coordinates to describe the region $R$, but other coordinate systems. The most usual ones, depending on the particular problem being studied, are cylindrical and spherical polar coordinates.


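The figure displays the volume element $\Delta R$ for cylindrical and spherical polar coordinates. To derive the volume element for an arbitrary coordinate system, the following theorem is quoted:

If the rectangular coordinates $x = x(u, v, w)$, $y = y(u, v, w)$, $z = z(u, v, w)$ are one-to-one continuously differentiable functions of the coordinates $u, v, w$, then the volume element $dR$ is multiplied by the absolute value of the functional determinant (Jacobian) $D(u, v, w)$ displayed below.

$$D(u, v, w) = \begin{vmatrix} \dfrac{\partial x}{\partial u} & \dfrac{\partial y}{\partial u} & \dfrac{\partial z}{\partial u} \\[2mm] \dfrac{\partial x}{\partial v} & \dfrac{\partial y}{\partial v} & \dfrac{\partial z}{\partial v} \\[2mm] \dfrac{\partial x}{\partial w} & \dfrac{\partial y}{\partial w} & \dfrac{\partial z}{\partial w} \end{vmatrix}$$

20.3-6 Element of volume a) in cylindrical, b) in spherical polar coordinates

That is, $dR = dx\,dy\,dz = |D|\,du\,dv\,dw$. For cylindrical polar coordinates $x = r\cos\varphi$, $y = r\sin\varphi$, $z = z$, and for spherical polar coordinates $x = r\sin\vartheta\cos\varphi$, $y = r\sin\vartheta\sin\varphi$, $z = r\cos\vartheta$, the corresponding determinants $D_c$ and $D_s$ turn out to be

$$D_c = \begin{vmatrix} \cos\varphi & \sin\varphi & 0 \\ -r\sin\varphi & +r\cos\varphi & 0 \\ 0 & 0 & 1 \end{vmatrix}, \qquad D_s = \begin{vmatrix} \sin\vartheta\cos\varphi & \sin\vartheta\sin\varphi & \cos\vartheta \\ r\cos\vartheta\cos\varphi & r\cos\vartheta\sin\varphi & -r\sin\vartheta \\ -r\sin\vartheta\sin\varphi & r\sin\vartheta\cos\varphi & 0 \end{vmatrix},$$

that is, $D_c = r$ and $D_s = r^2\sin\vartheta$, so that $dR$ becomes $r\,dr\,d\varphi\,dz$ and $r^2\sin\vartheta\,dr\,d\vartheta\,d\varphi$, respectively.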
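The values of the two determinants can be checked numerically. The sketch below estimates the functional determinant by central differences at a sample point; the helper names, the step size `h` and the sample points are assumptions chosen only for illustration.

```python
import math


def jacobian_det(transform, u, v, w, h=1e-6):
    """Central-difference estimate of the functional determinant of
    (x, y, z) = transform(u, v, w). Row k holds the derivatives of
    x, y, z with respect to the k-th parameter."""
    point = (u, v, w)
    rows = []
    for k in range(3):
        plus, minus = list(point), list(point)
        plus[k] += h
        minus[k] -= h
        fp, fm = transform(*plus), transform(*minus)
        rows.append([(p - m) / (2 * h) for p, m in zip(fp, fm)])
    (a, b, c), (d, e, f), (g, i, j) = rows
    return a * (e * j - f * i) - b * (d * j - f * g) + c * (d * i - e * g)


def cylindrical(r, phi, z):
    return (r * math.cos(phi), r * math.sin(phi), z)


def spherical(r, theta, phi):
    return (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )


print(jacobian_det(cylindrical, 2.0, 0.7, 1.0), 2.0)
print(jacobian_det(spherical, 2.0, 0.9, 0.4), 2.0 ** 2 * math.sin(0.9))
```

Each printed pair should agree to within the accuracy of the difference quotient.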


Cubature 


Multiple integrals have important applications in the calculation of the volume $V$ of a solid body $B$. It has already been observed that the double integral $\iint_G f(x, y)\,dx\,dy$ represents the volume of a cylinder with base $G$ and upper boundary $z = f(x, y)$. It follows that a cylindrical body with cross-section $G$ and upper and lower boundaries $z = f_1(x, y)$, $z = f_2(x, y)$ has the volume $V = \iint_G [f_1(x, y) - f_2(x, y)]\,dx\,dy$. In this way one can find the volume of any body that can be pieced together from a finite number of such cylindrical bodies; most bodies $B$ occurring in practical applications are of this form. The volume $V$ of $B$ can also be expressed as the triple integral $\iiint_B dV$, where the shape of the boundary surface of $B$ determines the limits of integration; in particular, if one inserts the limits for $z$ and performs the z-integration, one is left with a double integral of the type discussed earlier. A further method of calculating the volume of $B$, if $B$ has a piecewise smooth boundary surface, is furnished by the integral theorem of Gauss. This formula states that

$$V = \iiint_B dV = \tfrac{1}{3}\iint_{\partial B} \mathbf{r}\cdot\mathbf{n}\,dS,$$

where $\partial B$ is the boundary of $B$, $dS$ is an element of surface, $\mathbf{n}$ is the outward normal, and $\mathbf{r} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ is the vector field; this reduces the calculation of the volume to the evaluation of a surface integral.


Calculation of volume from areas of cross-sections. Suppose that the solid is referred to Cartesian coordinates $x, y, z$ and lies between the two planes $x = a$, $x = b$ perpendicular to the x-axis. Suppose also that the areas of cross-sections of the body by planes perpendicular to the x-axis are known, and are given by a continuous function $q(x)$. One may then think of the body as being made of slices of thickness $\Delta x_i$ (Fig.). In each slice there is a smallest cross-sectional area, $q_i$, and a largest, $Q_i$; the volume $V_i$ of the $i$th slice lies between that of a cylinder of height $\Delta x_i$ and base area $q_i$, and of a similar cylinder of base area $Q_i$. Just as for areas one obtains, as approximations to the volume $V$, lower sums $v(n)$ and upper sums $V(n)$

$$v(n) = \sum_{i=1}^{n} q_i\,\Delta x_i \le V \le \sum_{i=1}^{n} Q_i\,\Delta x_i = V(n),$$

which have the same limiting value as $n \to \infty$ and $\Delta x_i \to 0$. Hence the volume $V$ may be represented as a definite integral, $V = \int_a^b q(x)\,dx$.


Cavalieri's principle. If another body has cross-sectional areas $\bar q(x)$ which in $[a, b]$ for each $x$ are the same as $q(x)$, so that $\bar q(x) = q(x)$, then the volumes $\bar V$ and $V$ of the bodies are equal. This principle was formulated by CAVALIERI before the methods of the integral calculus had been developed.


Two bodies lying between two parallel 
planes have the same volume if their cross- 
sections by any plane parallel to these planes 
have equal areas. 


Volume of a solid of revolution. Solids with certain symmetry properties can often be regarded as having boundaries that are generated by the rotation of a curve; for instance, the surface of a sphere is obtained by rotating a semicircle about its diameter. Such a body is called a solid of revolution. If its surface is obtained by rotating the continuous curve $y = f(x)$ about the x-axis, or $x = \varphi(y)$ about the y-axis, then the cross-sections by planes perpendicular to the axis are circular regions with areas $q(x) = \pi[f(x)]^2$ or $q(y) = \pi[\varphi(y)]^2$. If the solid of revolution is bounded by planes $x = x_1$ and $x = x_2$, or $y = y_1$ and $y = y_2$, its volume is

$$V_x = \pi\int_{x_1}^{x_2} [f(x)]^2\,dx \qquad\text{or}\qquad V_y = \pi\int_{y_1}^{y_2} [\varphi(y)]^2\,dy.$$

If the solid is obtained by rotation of a continuous curve that consists of several arcs joined together, it is best to add up the volumes of the individual pieces. It may also be appropriate, just as in the calculation of the area between two curves, to integrate the difference of the squares of two suitable functions:

$$V_x = \pi\int_{x_1}^{x_2} \left\{[g(x)]^2 - [h(x)]^2\right\} dx.$$

20.3-7 Cubature of a solid


Example 1: The curve $y = f(x) = x^2/36$ is rotated between the limits $x_1 = 0$ and $x_2 = 12$ a) about the x-axis, b) about the y-axis (Fig.). The volumes of the corresponding solids are to be calculated:

a) $V_x = \pi\int_{x_1}^{x_2} [f(x)]^2\,dx = \pi\int_0^{12} (x^4/1296)\,dx = 192\pi/5 \approx 120.6$.

b) Using $x = \varphi(y) = 6\sqrt{y}$ and $y_1 = f(0) = 0$, $y_2 = f(12) = 4$, one obtains

$$V_y = \pi\int_{y_1}^{y_2} [\varphi(y)]^2\,dy = \pi\int_0^4 36y\,dy = 288\pi \approx 904.8.$$
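The picture of a solid built up from thin circular slices translates directly into a sum. The following minimal sketch adds the volumes of discs of thickness $\Delta t$ whose radii are taken at the midpoints of the subintervals; the function name and the number of slices are illustrative assumptions.

```python
import math


def volume_of_revolution(radius, lo, hi, n=2000):
    """Sum of the volumes pi * radius(t)^2 * dt of n thin discs,
    with the radius evaluated at the midpoint of each slice."""
    h = (hi - lo) / n
    return math.pi * h * sum(radius(lo + (k + 0.5) * h) ** 2 for k in range(n))


# a) rotation of y = x^2/36 about the x-axis, 0 <= x <= 12
v_x = volume_of_revolution(lambda x: x * x / 36.0, 0.0, 12.0)
# b) rotation of x = 6*sqrt(y) about the y-axis, 0 <= y <= 4
v_y = volume_of_revolution(lambda y: 6.0 * math.sqrt(y), 0.0, 4.0)

print(v_x, 192 * math.pi / 5)
print(v_y, 288 * math.pi)
```

The sums should reproduce the two volumes of the example to the accuracy of the subdivision.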


20.3-8 Rotation about x-axis and about y-axis 20.3-9 Barrel as paraboloid of rotation 


Example 2: The surface of a barrel is described by rotating the parabola $y = ax^2 + c$, between two limits, about the x-axis. The length of the barrel is 1 m, the diameter of each end is 60 cm, the largest diameter is 80 cm (Fig.).

The constants in the equation of the parabola and the limits of integration can be deduced from the given measurements, with lengths measured in decimetres: $y = -x^2/25 + 4$, $x_1 = -5$, $x_2 = 5$. This gives the volume

$$V_x = \pi\int_{-5}^{5} (x^4/25^2 - 8x^2/25 + 16)\,dx = 2\pi\int_0^5 (x^4/25^2 - 8x^2/25 + 16)\,dx \approx 425.2.$$

The barrel holds 425.2 l.

Example 3: The parabola $y^2 = 2px$ cuts the circle $y^2 = r^2 - (x - c)^2$ in points with abscissae $x_1$ and $x_2$ (Fig.). Rotation about the x-axis makes the blue region in the figure generate a parabolic spherical ring of height $h = x_2 - x_1$. Between $x_1$ and $x_2$ the circle lies outside the parabola, so that its volume is calculated from

$$V_x = \pi\int_{x_1}^{x_2} [r^2 - (x - c)^2]\,dx - \pi\int_{x_1}^{x_2} 2px\,dx = -\pi\int_{x_1}^{x_2} [2px - r^2 + (x - c)^2]\,dx.$$

Since the integrand vanishes at $x_1$ and $x_2$, and since $x^2$ has the coefficient 1, one can put $2px - r^2 + (x - c)^2 = (x - x_1)(x - x_2)$, so that $V_x = \pi\int_{x_1}^{x_2} (x - x_1)(x_2 - x)\,dx$. The substitution $x - x_1 = t$ then leads to $V_x = \pi\int_0^h t(h - t)\,dt = \pi h^3/6$; this result is the same as that for the cylindrical spherical ring.


20.3-10 Parabolic spherical ring 


20.3-11 Rotation about the y-axis 


Example 4: The diagram (Fig.) indicates a steel cylinder into which a hole has been bored by rotating the curve $y = e^{2x-1}$ about the y-axis. The volume of the hole is $V_y = \pi\int_{y_1}^{y_2} x^2\,dy$; the limits of integration are obtained from the measurements given in the diagram as $y_1 = 1$, $y_2 = 10$. By solving the equation for $x$ in terms of $y$, and squaring, one gets

$$x^2 = (1/4)(\ln^2 y + 2\ln y + 1).$$

Hence the required volume is

$$V_y = (\pi/4)\int_1^{10} (\ln^2 y + 2\ln y + 1)\,dy = (\pi/4)\left[y\ln^2 y + y\right]_1^{10} \approx 48.71.$$


Arc length and surface area


Rectification is the calculation of the length of an arc of a curve, and complanation that of the 
area of a curved surface. 


Arc length. Although it may seem intuitively obvious that there is such a thing as the length of a curved path, a mathematically precise definition is needed. One considers a curved arc whose equation is $y = f(x)$, $a \le x \le b$, where the function $f$ is continuously differentiable. One divides this arc into $n$ pieces at the points $P_0, P_1, \ldots, P_n$ and compares the curved arc with the polygon $P_0P_1 \ldots P_n$, which may be expected to approximate well to the curve when $n$ is large. If the division points $P_i$, $i = 0, 1, \ldots, n$, have the coordinates $(x_i, y_i)$, then the length $l_i$ of the chord $P_{i-1}P_i$ is given by

$$l_i = \sqrt{(\Delta x_i)^2 + (\Delta y_i)^2} = \Delta x_i\sqrt{1 + \left(\frac{\Delta y_i}{\Delta x_i}\right)^2},$$

and the length of the polygon is

$$s_n = \sum_{i=1}^{n} \sqrt{1 + \left(\frac{\Delta y_i}{\Delta x_i}\right)^2}\,\Delta x_i.$$

By the mean value theorem of the differential calculus there exists a place $\xi_i$ in the interval $x_{i-1} < x < x_i$ such that the derivative $f'(\xi_i)$ is equal to the difference quotient $\Delta y_i/\Delta x_i$, so that $l_i = \Delta x_i\sqrt{1 + [f'(\xi_i)]^2}$ (Fig.). Each subdivision of the interval from $x_0 = a$ to $x_n = b$ produces a polygon of length $s_n$. If one refines the subdivision so that the number $n$ of subintervals tends to infinity and the length $\Delta x_i$ of the longest subinterval tends to zero, then the sequence $s_n$ of lengths of inscribed polygons has a limiting value

$$s = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx$$

because of the continuity of $f'(x)$ and the definition of the definite integral as a limit of sums.

A continuous curve is called rectifiable if the lengths $s_n$ of inscribed polygons remain bounded above for all possible subdivisions of $[a, b]$, and then the supremum $s$ of such lengths is called the arc length of the curve; it can also be shown that $s$ is the limit of the $s_n$ as the subdivision is refined, that is, $s = \lim_{\Delta x_i \to 0} s_n$.

20.3-12 Length of a plane curve

The above derivation has proved that a curve is rectifiable if $y = f(x)$ is continuously differentiable in $[a, b]$. If the continuous curve is rectifiable, but $y = f(x)$ is not continuously differentiable in


Element of length. For a fixed lower limit and a variable upper limit, the length of a curve is a function of the upper limit, $s(x) = \int_a^x \sqrt{1 + y'^2}\,dx$. The differential $ds$ of this function is often called the element of length of the curve, so that the length is the integral of the element of length (Fig.).


20.3-13 Element of length 


Example 1: For the circumference of a circle of radius $r$, one obtains $y = \sqrt{r^2 - x^2}$; $y' = -x/\sqrt{r^2 - x^2}$; $1 + y'^2 = r^2/(r^2 - x^2)$. Therefore the circumference $C$ is given by

$$C = 4\int_0^r \frac{r\,dx}{\sqrt{r^2 - x^2}} = 4r\left[\arcsin\frac{x}{r}\right]_0^r = 2\pi r.$$


Example 2: For the cycloid in parametric form $x = a(t - \sin t)$, $y = a(1 - \cos t)$ one has $\dot x = a(1 - \cos t)$, $\dot y = a\sin t$, so that the element of length is

$$ds = \sqrt{\dot x^2 + \dot y^2}\,dt = \sqrt{a^2(1 - \cos t)^2 + a^2\sin^2 t}\,dt = a\sqrt{1 - 2\cos t + \cos^2 t + \sin^2 t}\,dt = a\sqrt{2}\,\sqrt{1 - \cos t}\,dt = a\sqrt{2}\,\sqrt{2\sin^2(t/2)}\,dt = 2a\sin(t/2)\,dt.$$

This gives the arc length

$$s = \int_0^{2\pi} \sqrt{\dot x^2 + \dot y^2}\,dt = 2a\int_0^{2\pi} \sin(t/2)\,dt = -4a\left[\cos(t/2)\right]_0^{2\pi} = 8a.$$

The length of a full arc of the pointed cycloid is therefore four times the diameter of the rolling circle generating the cycloid.
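The definition of arc length by inscribed polygons can be tried out on the cycloid. The sketch below sums chord lengths for increasingly fine subdivisions of the parameter interval; the function names and the choice $a = 1$ are assumptions made for the illustration.

```python
import math


def polygon_length(x, y, t0, t1, n):
    """Length of the polygon inscribed in the curve (x(t), y(t)),
    with vertices at n + 1 equally spaced parameter values."""
    ts = [t0 + (t1 - t0) * k / n for k in range(n + 1)]
    return sum(math.hypot(x(b) - x(a), y(b) - y(a)) for a, b in zip(ts, ts[1:]))


A = 1.0  # radius of the rolling circle (assumed value)


def cycloid_x(t):
    return A * (t - math.sin(t))


def cycloid_y(t):
    return A * (1.0 - math.cos(t))


lengths = [polygon_length(cycloid_x, cycloid_y, 0.0, 2.0 * math.pi, n)
           for n in (10, 100, 1000)]
print(lengths, 8 * A)
```

The polygon lengths increase as the subdivision is refined and stay below $8a$, in agreement with the supremum property of the arc length.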


Surface area. The area of a curved surface is given by a surface integral. In what follows, a formula for the area of a surface of revolution will be derived. Let $P_1$ and $P_2$ be points on a continuously differentiable curve $y = f(x)$, corresponding to $x_1 = a$ and $x_2 = b$; then rotation about the x-axis makes the arc of the curve from $P_1$ to $P_2$ describe a surface of revolution. If, as in the discussion of length, the arc is replaced by a polygon with $n - 1$ sides, then the corresponding surface of revolution is a sum of lateral surfaces of frustums of cones. The surface area of a typical such lateral surface is

$$\sigma_\nu = \pi[f(x_\nu) + f(x_{\nu+1})]\sqrt{(\Delta x_\nu)^2 + (\Delta y_\nu)^2} = \pi[f(x_\nu) + f(x_{\nu+1})]\,\Delta x_\nu\sqrt{1 + \left(\frac{\Delta y_\nu}{\Delta x_\nu}\right)^2}.$$

By the mean value theorem of the differential calculus there exists a $\xi_\nu$ in $(x_\nu, x_{\nu+1})$ for which $f'(\xi_\nu) = \Delta y_\nu/\Delta x_\nu$. The sum of surface areas is therefore

$$S_n = \pi\sum_{\nu} [f(x_\nu) + f(x_{\nu+1})]\sqrt{1 + [f'(\xi_\nu)]^2}\,\Delta x_\nu.$$

A refinement of the subdivision of the interval $a \le x \le b$ leads to improved approximations and, as for the length of a curve, to a limiting value for the sum $S_n$, which may be expressed as a definite integral:

$$S = 2\pi\int_a^b f(x)\sqrt{1 + [f'(x)]^2}\,dx.$$

The factor 2 arises because both $f(x_\nu)$ and $f(x_{\nu+1})$ occur in $S_n$. Use of the element of length also leads to the formula $S = 2\pi\int_a^b y\,ds$.


Example: The formulae for the surface areas of a sphere, spherical cap, and belt of a sphere, because of $y = \sqrt{r^2 - x^2}$; $1 + y'^2 = r^2/(r^2 - x^2)$, are:

Sphere: $S = 2\pi\int_{-r}^{r} \sqrt{r^2 - x^2}\; r/\sqrt{r^2 - x^2}\,dx = 2\pi r\int_{-r}^{r} dx = 4\pi r^2$.

Spherical cap: $S_c = 2\pi r\int_{\xi}^{r} dx = 2\pi r(r - \xi) = 2\pi rh$ with $h = r - \xi$.

Spherical belt: $S_b = 2\pi r\int_{\xi_1}^{\xi_2} dx = 2\pi r(\xi_2 - \xi_1) = 2\pi rh$ with $h = \xi_2 - \xi_1$.
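The derivation by frustums of cones can be followed numerically for the sphere. The sketch below adds the lateral areas $\pi[f(x_\nu) + f(x_{\nu+1})]$ times the chord length over an equally spaced subdivision; the function names, the radius $r = 1$ and the cap boundary $\xi = 0.5$ are assumptions chosen for illustration.

```python
import math


def frustum_area_sum(f, a, b, n):
    """Sum of the lateral surface areas of the n frustums of cones
    obtained by rotating an inscribed polygon about the x-axis."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x0, x1 = a + k * h, a + (k + 1) * h
        y0, y1 = f(x0), f(x1)
        total += math.pi * (y0 + y1) * math.hypot(x1 - x0, y1 - y0)
    return total


def semicircle(x, r=1.0):
    # max(...) guards against tiny negative values caused by rounding at x = +-r
    return math.sqrt(max(0.0, r * r - x * x))


sphere = frustum_area_sum(semicircle, -1.0, 1.0, 2000)
cap = frustum_area_sum(semicircle, 0.5, 1.0, 2000)
print(sphere, 4 * math.pi)       # whole sphere, r = 1
print(cap, 2 * math.pi * 0.5)    # cap of height h = 0.5
```

The sums approach $4\pi r^2$ and $2\pi rh$ from below as the subdivision is refined.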


Line and surface integrals 


Line integrals. In order to put such physical notions as work, potential, etc. into a mathematical form, it is appropriate to generalize the original concept of integral by considering limits of sums whose summands depend in a certain way on a curve, the path of integration. This leads to the concept of a line integral.

Suppose that a smooth curve $C$ in three-dimensional space is given in parametric form by the functions $x = x(s)$, $y = y(s)$ and $z = z(s)$ with continuous first derivatives. The parameter may, for example, be the arc length $s$. Further, let $f(x, y, z)$ be a continuous function whose domain of definition includes the arc $AB$ of the curve corresponding to the parameter interval $\sigma_1 \le s \le \sigma_2$. Then to each $s$ in the interval $[\sigma_1, \sigma_2]$ corresponds a point $P[x(s), y(s), z(s)]$ of the curve at which the function takes the value $f[x(s), y(s), z(s)]$, so that one now has a function of the parameter $s$. If one subdivides the curve into $n$ arcs or, what amounts to the same thing, divides $[\sigma_1, \sigma_2]$ into $n$ subintervals $\Delta s_i$, and forms the sum $\sum_{i=1}^{n} f[x(s_i), y(s_i), z(s_i)]\,\Delta s_i$, where $s_i$ is an arbitrary parameter value from the subinterval $\Delta s_i$, one obtains a sequence of sums. If this sequence tends to a limit when the length of the largest subinterval approaches zero and the number $n$ of subintervals increases to infinity, and if this limit does not depend on the choice of subdivisions and intermediate points $s_i$, then this limit is called the line integral of the first kind of the function $f(x, y, z)$ along the curve $C$ from $A$ to $B$.

The calculation of the integral can be reduced to that of a definite integral. If $x = x(t)$, $y = y(t)$, $z = z(t)$ is any parametric representation of the curve $C$ (the arc $AB$ corresponding to the parameter interval $t_1 \le t \le t_2$) then because $\dfrac{ds}{dt} = \sqrt{\dot x(t)^2 + \dot y(t)^2 + \dot z(t)^2}$ one has:

$$\int_C f(x, y, z)\,ds = \int_{t_1}^{t_2} f[x(t), y(t), z(t)]\sqrt{\dot x(t)^2 + \dot y(t)^2 + \dot z(t)^2}\,dt.$$


If $P(x, y, z)$, $Q(x, y, z)$ and $R(x, y, z)$ are continuous functions, one can define similarly other types of line integral $\int_C P(x, y, z)\,dx$, $\int_C Q(x, y, z)\,dy$, $\int_C R(x, y, z)\,dz$; for example, the first of these is the limit of sums $\sum_{i=1}^{n} P[x(s_i), y(s_i), z(s_i)]\,\Delta x_i$, where $\Delta x_i$ is the projection on the x-axis of the $i$th arc of subdivision of the curve. If these three integrals are added together, one obtains the line integral of the second kind

$$\int_C [P(x, y, z)\,dx + Q(x, y, z)\,dy + R(x, y, z)\,dz].$$


The calculation of such an integral in two dimensions can sometimes be simplified by applying the 
following theorem. 


If $P(x, y)\,dx + Q(x, y)\,dy$ is the total differential $dF(x, y)$ of a function $F(x, y)$, and if $P(x, y)$ and $Q(x, y)$ are continuous in a connected region $G$, then the value of the line integral $\int_C [P(x, y)\,dx + Q(x, y)\,dy]$ depends only on the end-points $A$ and $B$ of the path of integration in $G$ and not on the particular path joining $A$ to $B$.

This follows from the fact that

$$\int_C [P(x, y)\,dx + Q(x, y)\,dy] = \int_C dF(x, y) = \int_{t_1}^{t_2} dF(x, y) = F[x(t_2), y(t_2)] - F[x(t_1), y(t_1)]$$

depends only on the limits of integration. An equivalent statement is that the line integral $\oint_C [P\,dx + Q\,dy]$ is zero for every closed curve $C$ in the region $G$.


The following theorem, which is easy to deduce from Gauss’s theorem (see Divergence and 
theorem of Gauss), provides a criterion for P dx + Q dy to be a total differential. 


If the region $G$ in question is simply-connected and if the functions $P(x, y)$ and $Q(x, y)$ are continuously differentiable in $G$, then the integrability condition $\dfrac{\partial P}{\partial y} = \dfrac{\partial Q}{\partial x}$ is necessary and sufficient for $P(x, y)\,dx + Q(x, y)\,dy$ to be a total differential.


In Chapter 22. it will be shown that if Pdx + Q dy is 
not a total differential, then it is always possible to 
find an integrating factor u(x, y) such that the product 
u(x, y)(P dx + Qdy) is a total differential. 


Example: To calculate $\int_C (x\,dx + y\,dy)$ along the parabola $y = x^2$ from $A(0, 0)$ to $B(2, 4)$, take $x$ as a parameter. Then $dy = 2x\,dx$, and

$$\int_C (x\,dx + y\,dy) = \int_0^2 (x\,dx + x^2\cdot 2x\,dx) = \int_0^2 (x + 2x^3)\,dx = \left[x^2/2 + x^4/2\right]_0^2 = 10.$$

Since the integrability condition $\dfrac{\partial P}{\partial y} = \dfrac{\partial Q}{\partial x}$ is satisfied, the integral is independent of the path of integration; integration, for example, along the curve $y = 4\sin(\pi x/4)$ from $A(0, 0)$ to $B(2, 4)$ yields the same result as before.
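Independence of the path can be observed numerically. The sketch below approximates a line integral of the second kind by sums over short chords of the path and evaluates it along the parabola and along the sine curve of the example. The function name and the number of subdivisions are assumptions; the second field $P = y$, $Q = 0$ is not taken from the text and is included only as a contrast, since it violates the integrability condition.

```python
import math


def line_integral(P, Q, x, y, t0, t1, n=2000):
    """Approximate the line integral of P dx + Q dy along (x(t), y(t)),
    evaluating P and Q at the midpoint of each chord of the path."""
    total = 0.0
    xa, ya = x(t0), y(t0)
    for k in range(1, n + 1):
        t = t0 + (t1 - t0) * k / n
        xb, yb = x(t), y(t)
        xm, ym = 0.5 * (xa + xb), 0.5 * (ya + yb)
        total += P(xm, ym) * (xb - xa) + Q(xm, ym) * (yb - ya)
        xa, ya = xb, yb
    return total


def parabola(t):
    return t * t


def sine_path(t):
    return 4.0 * math.sin(math.pi * t / 4.0)


def identity(t):
    return t


# Total differential x dx + y dy: same value on both paths from A(0,0) to B(2,4)
exact_parabola = line_integral(lambda x, y: x, lambda x, y: y, identity, parabola, 0.0, 2.0)
exact_sine = line_integral(lambda x, y: x, lambda x, y: y, identity, sine_path, 0.0, 2.0)

# Contrast (assumed example): y dx is not a total differential
other_parabola = line_integral(lambda x, y: y, lambda x, y: 0.0, identity, parabola, 0.0, 2.0)
other_sine = line_integral(lambda x, y: y, lambda x, y: 0.0, identity, sine_path, 0.0, 2.0)

print(exact_parabola, exact_sine)
print(other_parabola, other_sine)
```

For the total differential both paths give the value 10, while for the contrasting field the two values differ (they are close to 8/3 and $16/\pi$).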


Surface integrals. Just as the line integral generalizes the simple definite integral, so surface 
integrals are the analogous generalization of the double integral over plane regions. Suppose that 
S is a smooth surface, bounded by a piecewise smooth curve, in three-dimensional space with 
Cartesian coordinates (x, y,z); here ‘smooth’ means that the tangent plane at interior points 
of S depends continuously on its point of contact. Let S have the parametric representation (see 
Chapter 26.) $\mathbf{x} = \mathbf{x}(u, v)$, where the parameters $u$ and $v$ range over the region $U = \{u_1 \le u \le u_2,\ v_1 \le v \le v_2\}$, and let $f(x, y, z)$ be a continuous function defined on $S$. One divides $S$ into small portions $S_i$ formed from a network of smooth curves on $S$, chooses an arbitrary point $P_i(x_i, y_i, z_i)$ in $S_i$, and forms the sum $\sum_{i=1}^{n} f(x_i, y_i, z_i)\,\Delta S_i$, where $\Delta S_i$ is the surface area of $S_i$. If this sum approaches a limiting value as $n \to \infty$ and $\Delta S_i \to 0$, independent of the choice of the points $P_i$, this limit is called the surface integral of the function $f(x, y, z)$ over the surface $S$ and is denoted by $\iint_S f(x, y, z)\,dS$.

Calculation of a surface integral can be reduced to that of a double integral as follows: one inserts the parametric representation of the coordinates $x, y, z$ into $f(x, y, z)$; the surface element $dS$ (see Chapter 26.) has the form

$$dS = \sqrt{EG - F^2}\,du\,dv, \quad\text{where}\quad E = \mathbf{x}_u\cdot\mathbf{x}_u,\quad F = \mathbf{x}_u\cdot\mathbf{x}_v,\quad G = \mathbf{x}_v\cdot\mathbf{x}_v.$$

Therefore

$$\iint_S f(x, y, z)\,dS = \iint_U f[x(u, v), y(u, v), z(u, v)]\sqrt{EG - F^2}\,du\,dv.$$


For f = 1, the integral yields the surface area of S. 
Surface integrals of the second kind are defined analogously to line integrals of the second kind. 


Applications in mechanics 


Work. The concept 'work done by a force' is defined in terms of a line integral of the second kind. The force $\mathbf{F}$ is a special vector field (see Vector analysis); let $F_x, F_y, F_z$ be its components referred to a Cartesian $(x, y, z)$-coordinate system. If $P(x, y, z)$, the point at which the force $\mathbf{F}$ is applied, moves along a smooth path $C$, then

$$W = \int_C [F_x\,dx + F_y\,dy + F_z\,dz]$$

is called the work done by the force $\mathbf{F}$ along the path $C$ (Fig.). If the vector with components $dx, dy, dz$ is denoted by $d\mathbf{r}$, then the integral for work done can be written vectorially, using the scalar product $\mathbf{F}\cdot d\mathbf{r}$.

Work done is the line integral of the force.

20.3-14 Indicator diagram of a steam engine

For instance, if the force $\mathbf{F}$ is constant and the path $C$ is a segment starting at the origin and represented by a vector $\mathbf{r}$ of length $|\mathbf{r}|$, then the angle $\varphi$ between the directions of $\mathbf{F}$ and $\mathbf{r}$ is also constant and the work integral leads to the formula $W = |\mathbf{F}|\cdot|\mathbf{r}|\cdot\cos\varphi$; in this case the work done is the product of the component of force $|\mathbf{F}|\cos\varphi$ along the direction of the path and the path length $|\mathbf{r}|$. In physical problems the components of force are usually the partial derivatives of a function $V$ called the potential. Work done is then the line integral of a total differential and depends only on the end-points of the path. The field of force is then called conservative.


Example 1: What is the work done in extending a spiral spring by $l$ units of length if the force acts in the direction of the spring? Denoting the spring constant by $D$, the force is $F = Dx$ and the work done is $W = \int_0^l F\,dx = D\int_0^l x\,dx = Dl^2/2$.


Example 2: To accelerate a body of mass $m$ from the velocity $v_1$ to the velocity $v_2$ requires work $W$; since $F = m\,\dfrac{dv}{dt}$, one obtains

$$W = \int F\,ds = m\int \frac{dv}{dt}\,ds = m\int_{v_1}^{v_2} v\,dv = (v_2^2 - v_1^2)\,m/2.$$

The work done is equal to the increase in kinetic energy.


Static moment. The static moment M of a point mass about an axis is defined to be the product 
of the distance / of the point mass from the axis and of the mass m. 


For the static moment $dM$ of an element of mass $dm$ of a continuous mass distribution, the
differential expression $dM = l \cdot dm$ holds, and the static moment of the entire mass is
obtained by integration. If $\rho$ is the constant density and $dV$ the volume element, then
$dm = \rho\,dV$. To define the static moments about each of the coordinate axes for the region
below the curve $y = f(x)$, one simply calculates the moments for a continuous mass distribution
of density $\rho = 1$ over a body of thickness $d = 1$ lying above the region (Fig.).

20.3-15 Static moment of a region

To calculate the moment about the y-axis of the region under the curve $y = f(x)$, one decomposes
the region into strips of width $\Delta x = dx$. The area of the strip, by the mean value theorem
of the integral calculus, is $f(\xi)\,dx$ for some intermediate value $\xi$ in $\Delta x$, and the
moment of the strip is therefore $dM = \xi f(\xi)\,dx$. Integration then gives the moment of the
region. To obtain the moment about the x-axis, one decomposes each of the above strips into
elements of breadth $\Delta y = dy$. This element has the moment $dM = \eta\,dy\,dx$, where $\eta$
is an intermediate value in $\Delta y$, and integration in the y-direction yields
$dM = \int_0^y \eta\,d\eta\,dx = (y^2/2)\,dx$ for the moment of the strip about the x-axis. Finally,
integration over $x$ yields the total moment of the region.


Static moment about the y-axis of the region below the curve $y = f(x)$ between $a$ and $b$:
$M_y = \int_a^b x y\,dx$

Static moment about the x-axis of the region below the curve $y = f(x)$ between $a$ and $b$:
$M_x = \frac{1}{2} \int_a^b y^2\,dx$


Similarly the static moment of a curve can be obtained by considering a uniform mass distribution
of density $\rho = 1$ along the curve; the moments about the x-axis and the y-axis are

$$M_x = \int_a^b y \sqrt{1 + y'^2}\,dx \quad\text{and}\quad M_y = \int_a^b x \sqrt{1 + y'^2}\,dx,$$


respectively. For a body of revolution about the x-axis, one can calculate the static moment with
respect to the plane through the origin perpendicular to the x-axis as

$$M = \pi \int_a^b x y^2\,dx.$$


Centre of mass. Every solid body can be regarded as a system of point masses, and there is always 
one point, the centre of mass, at which one can imagine the entire mass of the body to be concentrated. 
The static moment of a continuous mass distribution is equal to the static moment of the centre of
mass with respect to the same axis: $M = \int l\,dm = l_c m$. By using the appropriate static moments,
one can obtain the coordinates of the centre of mass for uniform distributions along a plane curve,
over a plane region, and a solid of revolution:


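$x_c = M_y/s$ and $y_c = M_x/s$; $x_c = M_y/A$ and $y_c = M_x/A$; $x_c = M/V$.

Coordinates $(x_c, y_c)$ of the centre of mass

a) uniform distribution on a plane curve:
$$x_c = \frac{\int_a^b x \sqrt{1 + y'^2}\,dx}{\int_a^b \sqrt{1 + y'^2}\,dx}, \qquad y_c = \frac{\int_a^b y \sqrt{1 + y'^2}\,dx}{\int_a^b \sqrt{1 + y'^2}\,dx}$$

b) uniform distribution below the curve $y = f(x)$:
$$x_c = \frac{\int_a^b x y\,dx}{\int_a^b y\,dx}, \qquad y_c = \frac{\frac{1}{2}\int_a^b y^2\,dx}{\int_a^b y\,dx}$$

c) uniform solid of revolution about the x-axis:
$$x_c = \frac{\int_a^b x y^2\,dx}{\int_a^b y^2\,dx}$$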


For the solid of revolution about the x-axis, the centre of mass lies on the x-axis, that is, $y_c = z_c = 0$.
Example: To calculate the coordinates of the centre of mass of the region below the curve
$y = f(x) = \cos x$ between $0$ and $\pi/2$. The area is $A = \int_0^{\pi/2} \cos x\,dx = 1$. The required
integrals can be found after an integration by parts in each case:
$\int_0^{\pi/2} x \cos x\,dx = \pi/2 - 1$, $\int_0^{\pi/2} \cos^2 x\,dx = \pi/4$.
The formulae for the coordinates of the centre of mass then lead to $x_c = \pi/2 - 1$, $y_c = \pi/8$.
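The example can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `integrate` and `centroid_below_curve` are chosen here for illustration, and the integrals are approximated with a composite midpoint rule.

```python
import math

def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the composite midpoint rule."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def centroid_below_curve(f, a, b):
    """Return (x_c, y_c) for the uniform region between y = f(x) >= 0 and the x-axis."""
    area = integrate(f, a, b)                           # A
    m_y = integrate(lambda x: x * f(x), a, b)           # static moment about the y-axis
    m_x = 0.5 * integrate(lambda x: f(x) ** 2, a, b)    # static moment about the x-axis
    return m_y / area, m_x / area

# Region below y = cos x between 0 and pi/2, as in the example above.
x_c, y_c = centroid_below_curve(math.cos, 0.0, math.pi / 2)
print(f"x_c = {x_c:.6f}, y_c = {y_c:.6f}")
```

The printed values should agree with $\pi/2 - 1$ and $\pi/8$ to within the quadrature error.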
The formulae lead to Pappus' rules. The generating region for the solid of revolution about the
x-axis has the static moment $M_x = (1/2) \int_a^b y^2\,dx$ and the ordinate of the centre of mass
is $y_c = M_x/A$. This gives, for the volume of the solid of revolution, the relation

$$V_x = \pi \int_a^b y^2\,dx = 2\pi \cdot \frac{1}{2} \int_a^b y^2\,dx = 2\pi M_x = 2\pi y_c A.$$


Pappus’ rule for the volume of a solid of revolution: the volume is the product of the area revolved 
and the length of the path described by its centre of mass. 
The surface of the solid of revolution is generated by revolving the curve $y = f(x)$ about the
x-axis; this curve has the static moment $M_x = \int_a^b y \sqrt{1 + y'^2}\,dx$ and its centre of
mass has the coordinate $y_c = M_x/s$. It follows that the surface area $S_x$ is given by

$$S_x = 2\pi \int_a^b y \sqrt{1 + y'^2}\,dx = 2\pi M_x = 2\pi y_c s.$$


Pappus’ rule for the area of a surface of revolution: the surface area is the product of the length 
of the generating curve and the length of the path described by its centre of mass. 
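Both rules can be tried on a semicircle revolved about the x-axis, which should give the volume and surface area of a sphere. This is a sketch under stated assumptions: the radius `R = 1.0` and the helper names are illustrative, and the integrals are approximated by a midpoint rule (which never evaluates the endpoints, where $y'$ is unbounded).

```python
import math

def integrate(f, a, b, n=20_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

R = 1.0  # radius of the generating semicircle (illustrative value)

def y(x):
    """Upper semicircle y = sqrt(R^2 - x^2)."""
    return math.sqrt(R * R - x * x)

def arc_element(x):
    """sqrt(1 + y'^2) for the semicircle, using y' = -x / y."""
    return math.sqrt(1.0 + (x / y(x)) ** 2)

# Region below the semicircle: area A, static moment M_x, centroid ordinate y_c.
area = integrate(y, -R, R)
m_x_region = 0.5 * integrate(lambda x: y(x) ** 2, -R, R)
y_c_region = m_x_region / area
volume = 2 * math.pi * y_c_region * area      # Pappus: V = 2*pi*y_c*A

# Semicircular arc: static moment M_x of the curve, then S = 2*pi*M_x = 2*pi*y_c*s.
m_x_curve = integrate(lambda x: y(x) * arc_element(x), -R, R)
surface = 2 * math.pi * m_x_curve

print(volume, 4 / 3 * math.pi * R ** 3)
print(surface, 4 * math.pi * R ** 2)
```

The centroid ordinate of the semicircular region comes out close to $4R/(3\pi)$, the familiar value.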


Moment of inertia. The kinetic energy $W$ of a body of mass $M$ and velocity $v$ is $W = v^2 M/2$.
If a rigid body rotates about a fixed axis $A$, its various portions have different velocities. If
$\omega$ denotes the constant angular velocity and $x$ the distance of the element of mass $dm$ from
the axis of rotation, then this element has the velocity $v = x\omega$ and the kinetic energy
$dW = (1/2)\,x^2 \omega^2\,dm$. The kinetic energy of the whole body is obtained by integration as
$W = (1/2)\,\omega^2 \int x^2\,dm$, where the integration is taken over all elements of mass.

20.3-16 Moment of inertia

If one compares the two expressions for kinetic energy, one notices that the mass $M$ has been
replaced by the integral $\int x^2\,dm$: this is called the axial moment of inertia $I_A$ with
respect to the axis of rotation $A$ (Fig.). If the moment of inertia is defined not with respect to
a reference axis $A$ but with respect to a reference point $P$, one obtains the polar moment of
inertia $I_p$.

An important relation between the polar moment of inertia $I_p$ with respect to the origin $O$ of a
rectangular Cartesian coordinate system and the axial moments of inertia $I_x$ and $I_y$ with respect
to the two coordinate axes can be obtained by using the relation $r^2 = x^2 + y^2$, where $r$, $x$, $y$
are the distances of a mass element from the origin and from the axes. Thus,

$$I_p = \int r^2\,dm = \int x^2\,dm + \int y^2\,dm = I_x + I_y.$$


Example 1: To find the moment of inertia of a thin straight rod of length $l$, cross-sectional
area $q$ and uniform density $\rho$, with respect to an axis passing through one end of the rod and at
right angles to it (Fig. 20.3-16). If $dx$ is an element of length of the rod, the corresponding element
of mass is $dm = \rho q\,dx$. Hence

$$I_A = \int x^2\,dm = \rho q \int_0^l x^2\,dx = q \rho l^3/3 = M l^2/3,$$

since the total mass of the rod is $M = \rho q l$.


Example 2: To find the moments of inertia of a thin circular plate of diameter $d$ with respect
to the centre and with respect to one of the lines through the centre (Fig. 20.3-16). To simplify
matters, it is assumed the mass per unit area is 1. First one calculates the polar moment of inertia $I_p$.
The mass $dm$ of the dark circular ring in the diagram, of radius $\varrho$ and width $d\varrho$, is
$dm = 2\pi\varrho\,d\varrho$, so that, with $r = d/2$,

$$I_p = \int \varrho^2\,dm = 2\pi \int_0^r \varrho^3\,d\varrho = \pi r^4/2 = \pi d^4/32.$$

For reasons of symmetry, $I_x = I_y$ and therefore $I_p = I_x + I_y = 2I_x$, $I_x = I_p/2$, so that
the axial moments are $I_x = I_y = \pi d^4/64$.
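The two examples can be reproduced by direct numerical integration of $\int x^2\,dm$. A minimal sketch; the numerical values for length, cross-section, density and diameter are illustrative assumptions, as are the function names.

```python
import math

def integrate(f, a, b, n=10_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

def rod_moment_about_end(length, area, density):
    """I_A = integral of x^2 dm with dm = density * area * dx, 0 <= x <= length."""
    return integrate(lambda x: x * x * density * area, 0.0, length)

def disc_polar_moment(diameter):
    """I_p = integral of rho^2 dm with dm = 2*pi*rho d(rho), unit mass per unit area."""
    return integrate(lambda rho: rho ** 2 * 2 * math.pi * rho, 0.0, diameter / 2)

length, area, density = 2.0, 0.5, 3.0      # illustrative rod data
mass = density * area * length
i_rod = rod_moment_about_end(length, area, density)

diameter = 2.0                             # illustrative disc diameter
i_p = disc_polar_moment(diameter)
i_x = i_p / 2                              # by symmetry I_x = I_y = I_p / 2

print(i_rod, mass * length ** 2 / 3)
print(i_p, math.pi * diameter ** 4 / 32)
print(i_x, math.pi * diameter ** 4 / 64)
```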




Steiner's theorem. Let $I_C$ be the moment of inertia of a body about an axis $C$ passing through
the centre of mass. The moment of inertia $I_A$ about an axis $A$ parallel to $C$ and at a distance $a$
from $C$ is evidently

$$I_A = \int_m (x + a)^2\,dm = \int_m (x^2 + 2xa + a^2)\,dm = I_C + 2a \int_m x\,dm + a^2 m.$$

But $x$ denotes distance from the axis $C$, so that the integral $\int_m x\,dm$ is the static moment
about this axis and is zero because the axis $C$ passes through the centre of mass.

Steiner's theorem. The moment of inertia $I_A$ of a body about an arbitrary axis $A$ is equal to its
moment of inertia $I_C$ about the axis $C$ through the centre of mass and parallel to $A$ plus the
product of its mass and the square of the distance from $A$ to $C$: $I_A = I_C + a^2 m$.
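As an illustration, Steiner's theorem can be checked for a thin rod by comparing a direct integration about a shifted axis with $I_C + a^2 m$. This is a sketch under stated assumptions: the rod length and line density are arbitrary illustrative values, and the integration uses a midpoint rule.

```python
def integrate(f, a, b, n=10_000):
    """Composite midpoint rule for the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

LENGTH = 2.0          # rod length (illustrative value)
LINE_DENSITY = 1.5    # mass per unit length (illustrative value)
MASS = LINE_DENSITY * LENGTH

def moment_about_parallel_axis(a):
    """Integral of (x + a)^2 dm, where x is measured from the centre of the rod
    and the axis lies at distance a from the centre of mass."""
    return integrate(lambda x: (x + a) ** 2 * LINE_DENSITY, -LENGTH / 2, LENGTH / 2)

i_c = moment_about_parallel_axis(0.0)       # axis through the centre of mass
a = LENGTH / 2                              # shift the axis to one end of the rod
i_a_direct = moment_about_parallel_axis(a)  # direct integration
i_a_steiner = i_c + a * a * MASS            # Steiner's theorem

print(i_a_direct, i_a_steiner, MASS * LENGTH ** 2 / 3)
```

With the axis moved to the end of the rod, both values should reproduce $M l^2/3$ from Example 1.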


20.4. Vector analysis 


In vector analysis one considers vector-valued functions of one or several variables and applies 
the concepts and methods of the differential and integral calculus. Its applications lie mainly in 
the fields of mathematical physics and of differential geometry. 


Fields 


Scalar fields. A scalar function $\varphi$ in space is called a scalar field if, in a given region, a scalar
$\varphi(x, y, z) = \varphi(\mathbf{r})$ is assigned to each point $P(x, y, z)$ with position vector
$\mathbf{r}$; for instance, temperature or density in a body are scalar fields. They can be visualized
through the level surfaces $\varphi(x, y, z) = \text{const}$ in space or the level curves
$\varphi(x, y) = \text{const}$ in the plane; for example, there are maps giving lines of constant height
above sea level (contours) and lines of constant temperature (isotherms). The function $\varphi$ changes
the more rapidly the closer the level surfaces or level curves are to each other.

Example: The level surfaces of the scalar field $\varphi = x^2 + y^2 + z^2 = r^2$ are spheres with centre
at the origin.

Vector fields. If a function $\mathbf{v} = \mathbf{v}(x, y, z) = \mathbf{v}(\mathbf{r})$ assigns a vector
$\mathbf{v}$ to each point in a region, then $\mathbf{v} = \mathbf{v}(\mathbf{r})$ describes a vector
field; for instance, fields of force $\mathbf{F}(\mathbf{r})$ or electric fields $\mathbf{E}(\mathbf{r})$
are vector fields. These can be visualized by attaching arrows to different points $\mathbf{r}$ whose
lengths and directions represent the corresponding vector $\mathbf{v}(\mathbf{r})$ (Fig.).

20.4-1 a) Scalar field, b) vector field


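One usually writes the vector functions of a vector field in the form

$$\mathbf{a} = \mathbf{a}(x, y, z, t) = u(x, y, z, t)\,\mathbf{i} + v(x, y, z, t)\,\mathbf{j} + w(x, y, z, t)\,\mathbf{k},$$

that is, the field vector $\mathbf{a}$ depends on the time $t$ and on the space coordinates $x$, $y$, $z$,
which may themselves be functions of $t$. Differentiation of this vector function is reduced to
differentiation of the scalar functions $u$, $v$, $w$ by the definition

$$d\mathbf{a} = du\,\mathbf{i} + dv\,\mathbf{j} + dw\,\mathbf{k},$$

where $du = \frac{\partial u}{\partial x}\,dx + \frac{\partial u}{\partial y}\,dy + \frac{\partial u}{\partial z}\,dz + \frac{\partial u}{\partial t}\,dt$,
with analogous expressions for $dv$ and $dw$. An equivalent definition is the following:

$$\frac{\partial \mathbf{a}}{\partial x} = \lim_{\Delta x \to 0} \frac{\mathbf{a}(x + \Delta x, y, z, t) - \mathbf{a}(x, y, z, t)}{\Delta x} = \frac{\partial u}{\partial x}\,\mathbf{i} + \frac{\partial v}{\partial x}\,\mathbf{j} + \frac{\partial w}{\partial x}\,\mathbf{k}$$

and analogously

$$\frac{\partial \mathbf{a}}{\partial y} = \frac{\partial u}{\partial y}\,\mathbf{i} + \frac{\partial v}{\partial y}\,\mathbf{j} + \frac{\partial w}{\partial y}\,\mathbf{k}, \quad \frac{\partial \mathbf{a}}{\partial z} = \frac{\partial u}{\partial z}\,\mathbf{i} + \frac{\partial v}{\partial z}\,\mathbf{j} + \frac{\partial w}{\partial z}\,\mathbf{k}, \quad \frac{\partial \mathbf{a}}{\partial t} = \frac{\partial u}{\partial t}\,\mathbf{i} + \frac{\partial v}{\partial t}\,\mathbf{j} + \frac{\partial w}{\partial t}\,\mathbf{k}.$$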


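That is to say, differentiation is carried out componentwise according to the usual rules. For example,
let $\mathbf{a}_1$ and $\mathbf{a}_2$ be vector functions and $\varphi$ a scalar function; then the
following rules for differentiating products hold:

1) $\frac{\partial}{\partial x}(\varphi\mathbf{a}) = \varphi\,\frac{\partial \mathbf{a}}{\partial x} + \frac{\partial \varphi}{\partial x}\,\mathbf{a}$, and similarly for $\frac{\partial}{\partial y}(\varphi\mathbf{a})$, $\frac{\partial}{\partial z}(\varphi\mathbf{a})$, $\frac{\partial}{\partial t}(\varphi\mathbf{a})$;

2) $\frac{\partial}{\partial x}(\mathbf{a}_1 \cdot \mathbf{a}_2) = \mathbf{a}_1 \cdot \frac{\partial \mathbf{a}_2}{\partial x} + \frac{\partial \mathbf{a}_1}{\partial x} \cdot \mathbf{a}_2$, and similarly for $\frac{\partial}{\partial y}(\mathbf{a}_1 \cdot \mathbf{a}_2)$, $\frac{\partial}{\partial z}(\mathbf{a}_1 \cdot \mathbf{a}_2)$, $\frac{\partial}{\partial t}(\mathbf{a}_1 \cdot \mathbf{a}_2)$;

3) $\frac{\partial}{\partial x}(\mathbf{a}_1 \times \mathbf{a}_2) = \mathbf{a}_1 \times \frac{\partial \mathbf{a}_2}{\partial x} + \frac{\partial \mathbf{a}_1}{\partial x} \times \mathbf{a}_2$, and similarly for $\frac{\partial}{\partial y}(\mathbf{a}_1 \times \mathbf{a}_2)$, $\frac{\partial}{\partial z}(\mathbf{a}_1 \times \mathbf{a}_2)$, $\frac{\partial}{\partial t}(\mathbf{a}_1 \times \mathbf{a}_2)$.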


In what follows, it will be assumed, to simplify matters, that all functions and the partial derivatives
that occur are continuous, so that the order of partial differentiations may be interchanged; for
instance $\frac{\partial^2 \varphi}{\partial x\,\partial y} = \frac{\partial^2 \varphi}{\partial y\,\partial x}$ (see Chapter 19.).

The most important special cases of vector functions are: 

1. The field vector $\mathbf{a}$ does not depend explicitly on the time variable $t$ and so has the form

$$\mathbf{a}(x, y, z) = u(x, y, z)\,\mathbf{i} + v(x, y, z)\,\mathbf{j} + w(x, y, z)\,\mathbf{k};$$

the field is then said to be time-independent.

2. The field depends on a scalar parameter $t$, not necessarily identical with the time variable;
thus $u = x(t)$, $v = y(t)$, $w = z(t)$, or

$$\mathbf{a} = \mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}.$$


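The position vector $\mathbf{r} = \mathbf{r}(t)$, as $t$ varies, describes a path in space; if now $t$ is
the time variable and $\mathbf{r}(t)$ the position of a particle, then the derivative

$$\frac{d\mathbf{r}}{dt} = \frac{dx}{dt}\,\mathbf{i} + \frac{dy}{dt}\,\mathbf{j} + \frac{dz}{dt}\,\mathbf{k} = \dot{x}\,\mathbf{i} + \dot{y}\,\mathbf{j} + \dot{z}\,\mathbf{k}$$

represents the velocity of the particle and $\frac{d^2\mathbf{r}}{dt^2}$ its acceleration. The vector
$\frac{d\mathbf{r}}{dt}$ is tangent to the path $\mathbf{r}(t)$ (Fig.). If $\mathbf{r}(t)$ has constant
length, $|\mathbf{r}| = \text{const}$, then also $\mathbf{r}^2 = \text{const}$ and hence
$\mathbf{r} \cdot \frac{d\mathbf{r}}{dt} + \frac{d\mathbf{r}}{dt} \cdot \mathbf{r} = 0$, so that
$\mathbf{r}$ and $\frac{d\mathbf{r}}{dt}$ are perpendicular.

20.4-2 Tangent vector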


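Gradient and potential

Gradient. Consider the scalar function $\varphi = \varphi(\mathbf{r}) = \varphi(x, y, z)$. If $\mathbf{r}$
is changed by $d\mathbf{r} = dx\,\mathbf{i} + dy\,\mathbf{j} + dz\,\mathbf{k}$, then the change $d\varphi$
in $\varphi$ is given by

$$d\varphi = \frac{\partial \varphi}{\partial x}\,dx + \frac{\partial \varphi}{\partial y}\,dy + \frac{\partial \varphi}{\partial z}\,dz.$$

This expression can be regarded as the scalar product of the vector $d\mathbf{r}$, representing a small
displacement in space, and a vector

$$\operatorname{grad}\varphi = \frac{\partial \varphi}{\partial x}\,\mathbf{i} + \frac{\partial \varphi}{\partial y}\,\mathbf{j} + \frac{\partial \varphi}{\partial z}\,\mathbf{k},$$

the gradient of $\varphi$. The change in the direction $d\mathbf{r}$ is then
$d\varphi = \operatorname{grad}\varphi \cdot d\mathbf{r}$. If in particular one chooses $d\mathbf{r}$ so
that it lies in a level surface $\varphi = \text{const}$, then $\varphi$ does not change, so that
$d\varphi = 0$; hence the gradient $\operatorname{grad}\varphi$ is perpendicular to the level surface
$\varphi = \text{const}$ (Fig.). Now suppose that $d\mathbf{r}$ is at an angle $\vartheta$ to
$\operatorname{grad}\varphi$; then
$d\varphi = \operatorname{grad}\varphi \cdot d\mathbf{r} = |\operatorname{grad}\varphi| \cdot |d\mathbf{r}| \cos\vartheta$,
and hence the function increases most rapidly when $\vartheta = 0$, that is, in the direction of the
gradient. If one also sets $|d\mathbf{r}| = ds$, then $d\varphi/ds = |\operatorname{grad}\varphi| \cos\vartheta$.

20.4-3 Level surfaces and gradient

The derivative of $\varphi$ in any direction is equal to the projection of the gradient onto this direction.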
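The statement can be illustrated numerically for the scalar field $\varphi = x^2 + y^2 + z^2$ used earlier. A minimal sketch: the gradient and the directional derivative are approximated by central differences, and the point and direction are arbitrary illustrative choices.

```python
import math

def phi(x, y, z):
    """Scalar field with spherical level surfaces."""
    return x * x + y * y + z * z

def gradient(f, p, h=1e-5):
    """Central-difference approximation of grad f at the point p."""
    g = []
    for i in range(3):
        fwd, bwd = list(p), list(p)
        fwd[i] += h
        bwd[i] -= h
        g.append((f(*fwd) - f(*bwd)) / (2 * h))
    return g

def directional_derivative(f, p, direction, h=1e-5):
    """Return (d f / d s along the normalised direction, the unit vector used)."""
    norm = math.sqrt(sum(c * c for c in direction))
    e = [c / norm for c in direction]
    fwd = [pc + h * ec for pc, ec in zip(p, e)]
    bwd = [pc - h * ec for pc, ec in zip(p, e)]
    return (f(*fwd) - f(*bwd)) / (2 * h), e

point = (1.0, 2.0, 2.0)
grad_phi = gradient(phi, point)                          # expected (2x, 2y, 2z)
slope, unit = directional_derivative(phi, point, (1.0, 1.0, 0.0))
projection = sum(g * e for g, e in zip(grad_phi, unit))  # grad(phi) . e

print(grad_phi, slope, projection)
```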


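Potential. The operation of forming the gradient led from a scalar field $\varphi = \varphi(\mathbf{r})$
to a vector field $\operatorname{grad}\varphi$. In general, one cannot go in the opposite direction, that
is, not every vector field is the gradient of a scalar field; vector fields
$\mathbf{a} = \frac{\partial \varphi}{\partial x}\,\mathbf{i} + \frac{\partial \varphi}{\partial y}\,\mathbf{j} + \frac{\partial \varphi}{\partial z}\,\mathbf{k}$
which are gradients are called conservative, and $\varphi$ is called the potential of $\mathbf{a}$.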




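Consider the line integral
$\int_{P_0}^{P} \mathbf{a} \cdot d\mathbf{r} = \int_{P_0}^{P} (u\,dx + v\,dy + w\,dz)$, and let the path of
integration be given in the parametric form
$\mathbf{r} = \mathbf{r}(t) = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$, so that the line
integral can be written as an ordinary integral:

$$\int_{t_0}^{t} \left(\mathbf{a} \cdot \frac{d\mathbf{r}}{dt}\right) dt = \int_{t_0}^{t} \left[u(t)\,\frac{dx}{dt} + v(t)\,\frac{dy}{dt} + w(t)\,\frac{dz}{dt}\right] dt.$$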
In general, this integral depends not only on the end-points $P_0$ and $P$ of the path of integration,
but also on the whole path. However, if the vector field
$\mathbf{a} = u(x, y, z)\,\mathbf{i} + v(x, y, z)\,\mathbf{j} + w(x, y, z)\,\mathbf{k}$ and the path of
integration $\mathbf{r} = x(t)\,\mathbf{i} + y(t)\,\mathbf{j} + z(t)\,\mathbf{k}$ are defined in a
simply-connected region $G$ and have continuous derivatives there, then
$\mathbf{a} = \operatorname{grad}\varphi$ is a necessary and sufficient condition for the line integral to
be independent of the path. Then the following theorem holds:

The line integral of a conservative vector field does not depend on the path of integration, and is equal
to the potential difference between the initial and final points of the path. Conversely, if the line
integral $\int \mathbf{a} \cdot d\mathbf{r}$ depends only on the end-points of the path, then $\mathbf{a}$
is the gradient of a potential $\varphi$.

An equivalent statement is:

In a simply-connected region $G$ the line integral $\oint \mathbf{a} \cdot d\mathbf{r}$ vanishes for every
closed path in $G$ if and only if $\mathbf{a} = \operatorname{grad}\varphi$ for some scalar function $\varphi$.

A necessary and sufficient condition for $\mathbf{a} = \operatorname{grad}\varphi$ in a simply-connected
region $G$ is that

$$\frac{\partial u}{\partial y} = \frac{\partial v}{\partial x}, \quad \frac{\partial v}{\partial z} = \frac{\partial w}{\partial y}, \quad \frac{\partial w}{\partial x} = \frac{\partial u}{\partial z},$$

that is, $\operatorname{curl}\mathbf{a} = 0$ (see Curl and Stokes' theorem).

Example: To calculate $\oint (-y\,dx + x\,dy)$ round the unit circle with centre at the origin. The
unit circle has the parametric representation $x = \cos t$, $y = \sin t$. Hence

$$\oint (-y\,dx + x\,dy) = \int_0^{2\pi} [-\sin t\,(-\sin t) + \cos t\,(\cos t)]\,dt = \int_0^{2\pi} [\sin^2 t + \cos^2 t]\,dt = 2\pi.$$

The integral does not vanish, that is, the vector field $\mathbf{a} = -y\,\mathbf{i} + x\,\mathbf{j}$ is not
conservative; this also follows from $\frac{\partial u}{\partial y} = -1$, whereas
$\frac{\partial v}{\partial x} = +1$.
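The parametrised form of the line integral lends itself to a direct numerical check. A minimal sketch, with illustrative function names: the closed integral is evaluated once for the field of the example and once for a gradient field (here $\operatorname{grad}(x^2 + y^2)$, chosen for comparison), using a midpoint rule in the parameter $t$.

```python
import math

def closed_line_integral(field, path, dpath, n=10_000):
    """Approximate the integral of (u dx/dt + v dy/dt) over 0 <= t <= 2*pi."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = path(t)
        dx, dy = dpath(t)
        u, v = field(x, y)
        total += (u * dx + v * dy) * h
    return total

def circle(t):
    return math.cos(t), math.sin(t)

def circle_tangent(t):
    return -math.sin(t), math.cos(t)

def rotational_field(x, y):
    """a = -y i + x j, the field of the example."""
    return -y, x

def gradient_field(x, y):
    """a = grad(x^2 + y^2) = 2x i + 2y j, a conservative field for comparison."""
    return 2 * x, 2 * y

circulation = closed_line_integral(rotational_field, circle, circle_tangent)
conservative = closed_line_integral(gradient_field, circle, circle_tangent)
print(circulation, conservative)
```

The first value should be close to $2\pi$ and the second close to zero.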


Divergence and theorem of Gauss 


Divergence. The divergence is a scalar field that can be derived from a vector field

$$\mathbf{a} = u(x, y, z)\,\mathbf{i} + v(x, y, z)\,\mathbf{j} + w(x, y, z)\,\mathbf{k}.$$

As an aid to interpretation one thinks of a field
$\bar{\mathbf{a}} = \bar{u}\,\mathbf{i} + \bar{v}\,\mathbf{j} + \bar{w}\,\mathbf{k}$ as the velocity field
of a fluid flow of density $\rho = \rho(x, y, z)$. One assumes that the flow of the fluid is steady, that
is, $\bar{\mathbf{a}}$ and $\rho$ do not depend explicitly on time. The x-component $\bar{u}$ represents
the distance per unit time in the x-direction covered by a fluid particle. If one considers a small cuboid
(Fig.), with edges parallel to the axes, then the volume of fluid entering across the face
$dA_1 = dy\,dz$ perpendicular to the x-axis in unit time is $dA_1\,\bar{u} = \bar{u}\,dy\,dz$, and
therefore the mass entering per unit time is $\rho\bar{u}\,dy\,dz$. The mass leaving the volume element
across the opposite face $dA_2 = dy\,dz$ is

$$\rho(x + dx, y, z)\,\bar{u}(x + dx, y, z)\,dy\,dz = \left[\rho\bar{u} + \frac{\partial(\rho\bar{u})}{\partial x}\,dx\right] dy\,dz;$$

the difference $\frac{\partial(\rho\bar{u})}{\partial x}\,dx\,dy\,dz$ gives the rate at which mass is lost
from the volume element across the faces $dA_1$ and $dA_2$. The total loss per unit time is obtained by
adding contributions from the other two pairs of opposite faces and is equal to

$$\left[\frac{\partial(\rho\bar{u})}{\partial x} + \frac{\partial(\rho\bar{v})}{\partial y} + \frac{\partial(\rho\bar{w})}{\partial z}\right] dx\,dy\,dz.$$

The loss of mass per unit of time and volume, the divergence of the vector field, is therefore given by

$$\operatorname{div}\mathbf{a} = \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}$$

after putting $\mathbf{a} = \rho\bar{\mathbf{a}}$, $u = \rho\bar{u}$, $v = \rho\bar{v}$, $w = \rho\bar{w}$.

If there is no loss or gain of mass, then $\operatorname{div}\mathbf{a} = 0$ and the field is called
source-free. Points where $\operatorname{div}\mathbf{a} > 0$ are called sources (more mass flows out than
in); points with $\operatorname{div}\mathbf{a} < 0$ are called sinks (more flows in than out).




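Theorem of Gauss. The total loss of mass from a finite region $G$ can be calculated from the volume
integral $\iiint_G \operatorname{div}\mathbf{a}\,d\tau$. This mass must have flowed out from $G$ across its
boundary surface $S$. If $d\mathbf{s} = \mathbf{n}\,d\sigma$ is a directed surface element, that is, a
vector in the direction of the outward normal $\mathbf{n}$ ($|\mathbf{n}| = 1$) of magnitude equal to the
area $d\sigma$ of the surface element, then the previous arguments show that the outward mass flow per
unit time across $d\sigma$ is $\mathbf{a} \cdot \mathbf{n}\,d\sigma$. The total outward flow across $S$ is
therefore equal to the surface integral
$\iint_S (\mathbf{a} \cdot \mathbf{n})\,d\sigma = \iint_S a_n\,d\sigma$. By equating the two expressions
for mass flows, one obtains the theorem of Gauss.

Theorem of Gauss:

$$\iiint_G \operatorname{div}\mathbf{a}\,d\tau = \iint_S \mathbf{a} \cdot \mathbf{n}\,d\sigma = \iint_S a_n\,d\sigma,$$

or, in components,

$$\iiint_G \left(\frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z}\right) d\tau = \iint_S [u \cos(x, \mathbf{n}) + v \cos(y, \mathbf{n}) + w \cos(z, \mathbf{n})]\,d\sigma.$$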


This theorem, so intuitively clear in the hydrodynamical example, is true quite generally for con- 
tinuously differentiable vector fields when G is a bounded closed region whose boundary S can be 
divided into pieces with continuously varying normals, apart from finitely many curves or points. 
The theorem of Gauss makes it possible to convert volume integrals into surface integrals; it also 
leads to the intuitively obvious result: 


If $\mathbf{a}$ is source-free in a region, that is, $\operatorname{div}\mathbf{a} = 0$, then the total
flow across the boundary of the region vanishes.
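The theorem of Gauss can be tested on a simple region. The sketch below is illustrative only: the field $\mathbf{a} = xy\,\mathbf{i} + yz\,\mathbf{j} + zx\,\mathbf{k}$ and the unit cube $0 \le x, y, z \le 1$ are assumptions made here, the divergence is approximated by central differences, and both integrals use a midpoint grid.

```python
def field(x, y, z):
    """Illustrative vector field a = (xy, yz, zx)."""
    return (x * y, y * z, z * x)

def divergence(a, x, y, z, h=1e-5):
    """Central-difference approximation of du/dx + dv/dy + dw/dz."""
    du = (a(x + h, y, z)[0] - a(x - h, y, z)[0]) / (2 * h)
    dv = (a(x, y + h, z)[1] - a(x, y - h, z)[1]) / (2 * h)
    dw = (a(x, y, z + h)[2] - a(x, y, z - h)[2]) / (2 * h)
    return du + dv + dw

def volume_integral_of_divergence(a, n=20):
    """Integral of div a over the unit cube, midpoint rule with n^3 cells."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    return sum(divergence(a, x, y, z) for x in pts for y in pts for z in pts) * h ** 3

def outward_flux(a, n=20):
    """Integral of a . n over the six faces of the unit cube (outward normals)."""
    h = 1.0 / n
    pts = [(i + 0.5) * h for i in range(n)]
    total = 0.0
    for s in pts:
        for t in pts:
            total += a(1.0, s, t)[0] - a(0.0, s, t)[0]   # faces x = 1 and x = 0
            total += a(s, 1.0, t)[1] - a(s, 0.0, t)[1]   # faces y = 1 and y = 0
            total += a(s, t, 1.0)[2] - a(s, t, 0.0)[2]   # faces z = 1 and z = 0
    return total * h * h

volume_side = volume_integral_of_divergence(field)
surface_side = outward_flux(field)
print(volume_side, surface_side)
```

For this field $\operatorname{div}\mathbf{a} = x + y + z$, so both sides should be close to $3/2$.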


Curl and Stokes’ theorem 


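Curl. The curl is an operation of differentiation, which converts the vector field
$\mathbf{a} = u\,\mathbf{i} + v\,\mathbf{j} + w\,\mathbf{k}$ into another vector field:

$$\operatorname{curl}\mathbf{a} = \left(\frac{\partial w}{\partial y} - \frac{\partial v}{\partial z}\right)\mathbf{i} + \left(\frac{\partial u}{\partial z} - \frac{\partial w}{\partial x}\right)\mathbf{j} + \left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right)\mathbf{k}.$$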


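Stokes' theorem. Suppose that $C$ is a closed, not necessarily plane, curve bounding a piece of surface
$S$. Suppose that $S$ has a continuously varying unit normal $\mathbf{n}$, except at finitely many curves
or points, and that $C$ has a continuously varying tangent, except at finitely many points. The sense of
direction for $C$ shall be that for which $S$ lies on the left if one looks at $S$ from the side indicated
by $\mathbf{n}$ (Fig.). Then Stokes' theorem holds for any continuously differentiable vector field.

20.4-5 Stokes' theorem

Stokes' theorem:

$$\oint_C \mathbf{a} \cdot d\mathbf{r} = \iint_S (\mathbf{n} \cdot \operatorname{curl}\mathbf{a})\,d\sigma = \iint_S \left[\left(\frac{\partial w}{\partial y} - \frac{\partial v}{\partial z}\right)\cos(x, \mathbf{n}) + \left(\frac{\partial u}{\partial z} - \frac{\partial w}{\partial x}\right)\cos(y, \mathbf{n}) + \left(\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right)\cos(z, \mathbf{n})\right] d\sigma$$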


According to this theorem the integral $\iint_S (\mathbf{n} \cdot \operatorname{curl}\mathbf{a})\,d\sigma$
depends only on the boundary curve $C$ and not on the form of the surface $S$ bounded by $C$. The line
integral $\oint_C \mathbf{a} \cdot d\mathbf{r}$ is called the circulation (or rotation) of $\mathbf{a}$
along $C$; it measures the strength of the rotational movement of a fluid element if the vector field is
thought of as describing a fluid flow. Stokes' theorem states that this circulation is equal to the total
flow of the normal component of the vector field $\operatorname{curl}\mathbf{a}$ across a surface $S$
spanned by $C$.

If one interprets $\mathbf{a}$ as a field of force, then $-\mathbf{a} \cdot d\mathbf{r}$ is the work done
against the force $\mathbf{a}$ along $d\mathbf{r}$, and $\oint \mathbf{a} \cdot d\mathbf{r}$ is the total
work done after a complete circuit round $C$. It is zero for every closed path $C$, so that the work is
independent of the path, only if $\operatorname{curl}\mathbf{a} = 0$. Fields for which
$\operatorname{curl}\mathbf{a} = 0$ are called irrotational. In a simply-connected region an irrotational
field $\mathbf{a}$ can always be represented as a gradient $\mathbf{a} = \operatorname{grad}\varphi$.
Hence $\operatorname{curl}(\operatorname{grad}\varphi) = 0$, a relationship that can also be verified
directly.
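Stokes' theorem can be illustrated on the unit disc in the plane $z = 0$ with normal $\mathbf{n} = \mathbf{k}$. This is a sketch under stated assumptions: the plane field $\mathbf{a} = -y^3\,\mathbf{i} + x^3\,\mathbf{j}$ is chosen here only for illustration, the curl component is approximated by central differences, and both integrals use midpoint rules (polar coordinates for the surface integral).

```python
import math

def field(x, y):
    """Illustrative plane field a = -y^3 i + x^3 j (third component zero)."""
    return (-y ** 3, x ** 3)

def curl_z(a, x, y, h=1e-4):
    """Central-difference approximation of dv/dx - du/dy, the k-component of curl a."""
    dv_dx = (a(x + h, y)[1] - a(x - h, y)[1]) / (2 * h)
    du_dy = (a(x, y + h)[0] - a(x, y - h)[0]) / (2 * h)
    return dv_dx - du_dy

def circulation(a, n=2_000):
    """Line integral of a . dr round the unit circle, traversed anticlockwise."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x, y = math.cos(t), math.sin(t)
        u, v = a(x, y)
        total += (u * -math.sin(t) + v * math.cos(t)) * h
    return total

def curl_flux(a, n_r=400, n_t=100):
    """Integral of (n . curl a) over the unit disc, with d(sigma) = r dr dt."""
    dr, dt = 1.0 / n_r, 2 * math.pi / n_t
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_t):
            t = (j + 0.5) * dt
            total += curl_z(a, r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

line_value = circulation(field)
surface_value = curl_flux(field)
print(line_value, surface_value)
```

For this field the k-component of the curl is $3(x^2 + y^2)$, and both integrals should be close to $3\pi/2$.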




The operator nabla, rules of calculation 


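The three differential operators grad, div, curl can be written in terms of a single operator, the
nabla operator, introduced by HAMILTON and denoted by $\nabla$. The name nabla is that of a Hebrew
stringed instrument whose shape resembles that of the symbol.

The operator is defined as

$$\nabla = \mathbf{i}\,\frac{\partial}{\partial x} + \mathbf{j}\,\frac{\partial}{\partial y} + \mathbf{k}\,\frac{\partial}{\partial z}.$$

If one interprets the 'product' $\frac{\partial}{\partial x}\,\varphi$ as
$\frac{\partial \varphi}{\partial x}$, one can write $\nabla\varphi = \operatorname{grad}\varphi$. The
scalar product gives $\nabla \cdot \mathbf{a} = \operatorname{div}\mathbf{a}$, and the vector product
gives $\nabla \times \mathbf{a} = \operatorname{curl}\mathbf{a}$.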
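Another differential operator, introduced by LAPLACE, denoted by $\Delta$ and called delta, is defined
for scalar fields $\varphi$ by

$$\Delta\varphi = \operatorname{div}\operatorname{grad}\varphi = \nabla \cdot (\nabla\varphi) = \frac{\partial^2 \varphi}{\partial x^2} + \frac{\partial^2 \varphi}{\partial y^2} + \frac{\partial^2 \varphi}{\partial z^2},$$

but for vector fields $\mathbf{a}(x, y, z)$ by

$$\Delta\mathbf{a} = \operatorname{grad}\operatorname{div}\mathbf{a} - \operatorname{curl}\operatorname{curl}\mathbf{a} = \nabla(\nabla \cdot \mathbf{a}) - \nabla \times (\nabla \times \mathbf{a}) = \frac{\partial^2 \mathbf{a}}{\partial x^2} + \frac{\partial^2 \mathbf{a}}{\partial y^2} + \frac{\partial^2 \mathbf{a}}{\partial z^2}.$$

The following identities hold:

$\operatorname{grad}(\varphi_1\varphi_2) = \varphi_1 \operatorname{grad}\varphi_2 + \varphi_2 \operatorname{grad}\varphi_1$, $\quad \operatorname{div}(\varphi\mathbf{a}) = \varphi \operatorname{div}\mathbf{a} + \mathbf{a} \cdot \operatorname{grad}\varphi$,

$\operatorname{curl}(\varphi\mathbf{a}) = \varphi \operatorname{curl}\mathbf{a} - \mathbf{a} \times \operatorname{grad}\varphi$, $\quad \operatorname{curl}\operatorname{grad}\varphi = 0$,

$\operatorname{div}(\mathbf{a}_1 \times \mathbf{a}_2) = \mathbf{a}_2 \cdot \operatorname{curl}\mathbf{a}_1 - \mathbf{a}_1 \cdot \operatorname{curl}\mathbf{a}_2$, $\quad \operatorname{div}\operatorname{curl}\mathbf{a} = 0$.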
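Two of these identities, $\operatorname{curl}\operatorname{grad}\varphi = 0$ and $\operatorname{div}\operatorname{curl}\mathbf{a} = 0$, can be spot-checked numerically. The sketch below is not a proof: the test fields, the evaluation point and the step size `H` are arbitrary assumptions, and all derivatives are nested central differences, so the results are zero only up to rounding error.

```python
import math

H = 1e-3  # finite-difference step (illustrative choice)

def partial(f, i, p):
    """Central-difference partial derivative of the scalar function f(p) in coordinate i."""
    fwd, bwd = list(p), list(p)
    fwd[i] += H
    bwd[i] -= H
    return (f(fwd) - f(bwd)) / (2 * H)

def grad(phi):
    """Components of grad phi as a list of three scalar functions."""
    return [lambda p, i=i: partial(phi, i, p) for i in range(3)]

def curl(a):
    """Components of curl a for a = [u, v, w] given as scalar functions."""
    u, v, w = a
    return [
        lambda p: partial(w, 1, p) - partial(v, 2, p),
        lambda p: partial(u, 2, p) - partial(w, 0, p),
        lambda p: partial(v, 0, p) - partial(u, 1, p),
    ]

def div(a):
    """div a as a scalar function."""
    return lambda p: sum(partial(a[i], i, p) for i in range(3))

# Arbitrary smooth test fields (assumptions made for this illustration).
phi = lambda p: math.sin(p[0]) * p[1] ** 2 + math.exp(p[2]) * p[0]
a = [
    lambda p: p[0] * p[1] * p[2],
    lambda p: math.sin(p[0] * p[2]),
    lambda p: p[1] ** 2 + math.cos(p[0]),
]

point = [0.3, -0.7, 1.1]
curl_grad = [component(point) for component in curl(grad(phi))]
div_curl = div(curl(a))(point)
print(curl_grad, div_curl)
```

Central-difference operators in different coordinates commute, which is why the discrete results vanish apart from rounding, mirroring the interchange of partial derivatives assumed in the text.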


Finally, it should at least be mentioned that gradient, divergence, and curl of a field are objects 
that are independent of the coordinate system used. One says that these quantities are invariant 
under coordinate transformations. 


21. Series of functions 


21.1. Series of functions
21.2. Power series
    Convergence of power series
    Important properties of power series
    Taylor series
    Series for special functions
    Approximations
    Geometrical applications of Taylor's theorem
    Taylor's theorem for several variables
21.3. Trigonometric series and harmonic analysis
    Trigonometric series
    Harmonic analysis and harmonic synthesis


Modern mathematics cannot do without the theory and applications of series of functions, which 
are very important in the development of analysis and the theory of functions. It was already pointed 
out in Chapter 20. that some functions can be integrated only by using their expansion in a series. 
Such expansions are also often useful in practical applications. They can be used to investigate 
properties of a function when only a few of its values are known, to calculate approximate values of 
functions, and to give rapid and reliable estimates for the accuracy of methods of calculation. 

In Chapter 18. the properties of infinite series with constant terms are treated. Now the terms
of a series are functions of some variable; of special importance are power series, whose $n$th term is a
function of the form $a_n x^n$, and Fourier series, whose general term has the form
$a_n \cos nx + b_n \sin nx$.


21.1. Series of functions 


Properties of a series of constant terms were derived by considering sequences of numbers. These
arguments will now be extended by considering expressions $F(x) = \sum_{n=0}^{\infty} f_n(x)$, which have
the following interpretation:

1. For each natural number $n = 0, 1, 2, \ldots$ a function of the sequence
$f_0(x), f_1(x), \ldots, f_n(x), \ldots$ is given; each of the functions is defined for $x$ ranging over an
interval $I$, that is, for each $x$ in this domain of definition it assumes a unique value of its range.

2. The sequence $F_n(x)$ of partial sums $F_n(x) = f_0(x) + f_1(x) + \cdots + f_n(x)$,
$n = 0, 1, 2, \ldots$, consists of functions that are defined, for each $n$, in the interval $I$.

3. For each $x$ in $I$ the sequence $F_n(x)$ tends to a limit, which is denoted by
$F(x) = \lim_{n \to \infty} F_n(x)$.

This limit function exists in the interval $I$, which is called the convergence interval. The difference
$R_n(x) = F(x) - F_n(x)$ between the limit function $F(x)$ and the approximation $F_n(x)$ is called the
remainder and must tend to zero, as $n \to \infty$, if convergence is to take place.


Uniform convergence. In the series of functions

$$F(x) = x^2 + x^2(1 - x^2) + x^2(1 - x^2)^2 + \cdots = \sum_{n=0}^{\infty} x^2(1 - x^2)^n,$$

the terms $f_n(x) = x^2(1 - x^2)^n$ are continuous. The sequence of partial sums
$F_0(x) = x^2$, $F_1(x) = x^2 + x^2(1 - x^2)$, ... converges for all $x \ne 0$ in the interval
$-1 < x < 1$ because the series $F(x) = x^2[1 + (1 - x^2) + (1 - x^2)^2 + \cdots]$ is a geometric series
with the ratio $1 - x^2 = q < 1$ and its sum is $F(x) = x^2\{1/[1 - (1 - x^2)]\} = 1$. On the other hand,
$F(0) = 0$. Thus, the limit function $F(x)$, unlike the functions $f_n(x)$, is not continuous at $x = 0$.

This raises the question under what conditions properties, such as continuity or differentiability,
of the terms $f_n(x)$ of the series can be transferred to the sum function $F(x)$. The diagram of the
partial sum curves for the series $F(x) = \sum x^2(1 - x^2)^n$ gives a hint. For $n > 10$, the curves can
hardly be separated from each other when $|x| > 0.6$ but for $x = 0.2$, for instance, they differ quite
appreciably (Fig.). That is to say: the index $N$ beyond which the size of the remainder $R_n(x)$ lies
below a given positive number $\varepsilon$ depends, in general, on the choice of $x$ within the interval
$I$. These indices $N(\varepsilon, x)$, in the present example, increase indefinitely as $x \to 0$ for any
given $\varepsilon > 0$. If, however, one can find an $N$ that does not depend on $x$, one speaks of
uniform convergence of the function series $F(x)$. This is always the case if the set of numbers
$N(\varepsilon, x)$ is bounded above whenever $\varepsilon$ is fixed and $x$ varies over $I$. It will be
shown in what follows that in this case $F(x)$ is continuous, and also that the series $F(x)$ can be
integrated 'term-by-term' and can be differentiated in this way if the functions $f_n(x)$ are all
differentiable over $I$ and the series resulting from term-by-term differentiation is uniformly
convergent. On the other hand, the series $\sum x^2(1 - x^2)^n$ considered earlier is convergent, but not
uniformly convergent, in the interval $-1 < x < 1$.
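The dependence of the index $N(\varepsilon, x)$ on $x$ can be made visible by computing it for a few values of $x$. A minimal sketch, assuming $0 < |x| < 1$; the function names are illustrative, and the remainder is obtained from the limit function $F(x) = 1$ for $x \ne 0$.

```python
def partial_sum(x, n):
    """F_n(x) = sum over k = 0..n of x^2 (1 - x^2)^k."""
    return sum(x * x * (1 - x * x) ** k for k in range(n + 1))

def limit_function(x):
    """F(x) = 1 for x != 0 and F(0) = 0."""
    return 0.0 if x == 0 else 1.0

def first_index_below(eps, x):
    """Smallest n with |R_n(x)| < eps; for 0 < |x| < 1 the remainders then keep shrinking."""
    n, total = 0, x * x
    while abs(limit_function(x) - total) >= eps:
        n += 1
        total += x * x * (1 - x * x) ** n
    return n

eps = 0.01
indices = [first_index_below(eps, x) for x in (0.5, 0.2, 0.1, 0.05)]
for x, n in zip((0.5, 0.2, 0.1, 0.05), indices):
    print(f"x = {x}: remainder below {eps} from n = {n}")
```

The indices grow without bound as $x$ approaches 0, so no single $N(\varepsilon)$ serves the whole interval, which is the failure of uniform convergence described above.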


21.1-1 Approximations $F_n(x)$ to $F(x) = \sum_{n=0}^{\infty} x^2(1 - x^2)^n$ for $n = 0, 1, 5, 10, 20, 100$


A function series $F(x) = \sum_{n=0}^{\infty} f_n(x)$ is called uniformly convergent in an interval $I$
if, for each given $\varepsilon > 0$, there exists an $N = N(\varepsilon)$, depending only on
$\varepsilon$ but not on $x$, such that
$|R_n(x)| = |f_{n+1}(x) + f_{n+2}(x) + \cdots| < \varepsilon$ for every $x$ in $I$ provided that $n > N$.


The concept of uniform convergence was introduced by Karl WEIERSTRASS (1815-1897) and 
others; it can be extended, as can the following criterion, to the complex case. 


The Weierstrass majorant criterion ('M-test'). If each function $f_n(x)$ in the series
$F(x) = \sum_{n=0}^{\infty} f_n(x)$ is bounded, with $|f_n(x)| \le M_n$ for all $x$ in $I$, and if the
series $\sum_{n=0}^{\infty} M_n$ converges, then the series $F(x) = \sum_{n=0}^{\infty} f_n(x)$ is uniformly
convergent in the interval $I$. The series $\sum M_n$ is then called a convergent majorant for
$\sum f_n(x)$.

Example 1: The series $\sum_{n=1}^{\infty} \sin(nx)/n^2$ is uniformly convergent for all $x$, because
$|\sin(nx)/n^2| \le 1/n^2$ for all $x$ and the majorant series $\sum_{n=1}^{\infty} 1/n^2$ converges.
Example 2: The geometric series $x + x^2 + x^3 + \cdots$ converges in the interval $-1 < x < 1$.
For a fixed $x_0$ with $0 < x_0 < 1$ and given $\varepsilon > 0$ one can always find an $N(\varepsilon)$
such that for $n > N(\varepsilon)$

$$|R_n(x_0)| = |x_0^{n+1} + x_0^{n+2} + \cdots| = x_0^{n+1}/(1 - x_0) < \varepsilon.$$

But if $x_0$ increases towards $+1$, then $|R_n|$ increases indefinitely for any fixed
$n > N(\varepsilon)$, that is, $\lim_{x_0 \to 1} x_0^{n+1}/(1 - x_0) = \infty$. This shows that the
geometric series converges uniformly in any closed portion $|x| \le x_0 < 1$ of the convergence
interval $-1 < x < 1$, but does not converge uniformly in the whole open convergence interval.


It can be shown that instead of the remainder $R_n(x)$ one may consider an arbitrary section
$F_{n+k}(x) - F_n(x)$ of the series; the condition for uniform convergence then reads

$$|F_{n+k}(x) - F_n(x)| = |f_{n+1}(x) + f_{n+2}(x) + \cdots + f_{n+k}(x)| < \varepsilon$$

for all $x$ in $I$, all $n > N(\varepsilon)$ and all $k \ge 1$.


Limits of a series of functions. Suppose that the series $F(x) = \sum_{n=0}^{\infty} f_n(x)$ converges
uniformly in the interval $a \le x < x_0$. Then, for given $\varepsilon > 0$ one can find an index $n_1$
such that for all $x$ in the interval, all $n > n_1$ and all $k \ge 1$,

$$|f_{n+1}(x) + f_{n+2}(x) + \cdots + f_{n+k}(x)| < \varepsilon.$$

Suppose, in addition, that each function $f_n(x)$ has a left-hand limit $a_n$ as $x \to x_0 - 0$; then it
may be substituted in the inequality, giving $|a_{n+1} + a_{n+2} + \cdots + a_{n+k}| \le \varepsilon$, but
this means that the series $\sum a_n$ formed from the limits converges. If its sum is $s$ and its partial
sums are $s_n$, an index $n_2$ may be chosen so that $|s_m - s| < \varepsilon/3$ for all $m > n_2$ and
also $|R_m(x)| < \varepsilon/3$. This can be used to show that the limit function $F(x)$ has the limit $s$
as $x \to x_0 - 0$. Indeed, for the chosen but fixed $m$ and all $x$ in $a \le x < x_0$

$$|F(x) - s| = |(F_m(x) - s_m) + (s_m - s) + R_m(x)| < |F_m(x) - s_m| + \varepsilon/3 + \varepsilon/3.$$

But here the function $F_m(x)$, being the sum of a fixed number of functions $f_n(x)$, has the limit $s_m$
as $x \to x_0 - 0$; this means that for a positive $\delta < (x_0 - a)$ a subinterval
$x_0 - \delta < x < x_0$ can be found such that $|F_m(x) - s_m| < \varepsilon/3$ and hence
$|F(x) - s| < \varepsilon$ for all such $x$. The sum $s$ is therefore the limit of $F(x)$ as
$x \to x_0 - 0$. The result can be summarized in the following form:

$$\lim_{x \to x_0 - 0} \left[\sum_{n=0}^{\infty} f_n(x)\right] = \sum_{n=0}^{\infty} \left[\lim_{x \to x_0 - 0} f_n(x)\right],$$


which means that in a uniformly convergent series of functions passages to a limit can be carried 
out term-by-term. 


Analogous arguments hold for right-hand limits. If the functions $f_n(x)$ are continuous at $x_0$,
then their left-hand and right-hand limits are equal to their values $f_n(x_0)$, that is, the sum function
$F(x)$ is continuous at $x_0$.

If the series $F(x) = \sum f_n(x)$ is uniformly convergent in an interval $I$ and if the terms $f_n(x)$ are
continuous at $x = x_0$, then $F(x)$ is also continuous at $x = x_0$.

Differentiation and integration term-by-term. If the functions $f_n(x)$ are differentiable in $I$, if the
series $\sum f_n'(x)$ formed with the derivatives is uniformly convergent in $I$, and if $\sum f_n(x)$
converges for at least one $x = x_0$ in $I$, then this series converges uniformly for all $x$ in $I$, and
the derivative of its sum $F(x) = \sum f_n(x)$ is the sum of the differentiated series, that is,
$F'(x) = \sum f_n'(x)$. The detailed proof of these statements is omitted; they are often stated in the
less precise form:

A series of functions may be differentiated term-by-term if the
resulting series is uniformly convergent.

Although he did not have the concept of uniform convergence, ABEL (1802-1829) gave the
following examples to show that termwise differentiation need not always lead to correct
answers:

1. The series $\sum \sin(nx)/n$ converges for all real $x$, but the series obtained by termwise
differentiation is $\sum \cos nx$, which diverges for all $x$.

2. The series $\sum f_n(x) = \sum \sin(nx)/n^2$ converges uniformly for all $x$, but the differentiated
series $\sum f_n'(x) = \sum \cos(nx)/n$ diverges for $x = 0$.


If the terms $f_n(x)$ of a series are integrable on an interval $I$ and if the series is uniformly
convergent in $I$, then the series obtained by termwise integration also converges in $I$ and represents
the integral $\int F(x)\,dx$ of the limit function $F(x) = \sum f_n(x)$, that is,
$\int F(x)\,dx = \sum \int f_n(x)\,dx$.


A uniformly convergent series of functions may be integrated term-by-term. 


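Arc length and circumference of an ellipse can be obtained by termwise integration of the series
for the element of arc. For the arc $s$ (Fig.) of the ellipse, let $x = a \sin\varphi$,
$y = b \cos\varphi$, so that $dx = a \cos\varphi\,d\varphi$, $dy = -b \sin\varphi\,d\varphi$, and let
$\varepsilon^2 = 1 - b^2/a^2$ ($\varepsilon$ is called the eccentricity). Then

$$s = \int \sqrt{dx^2 + dy^2} = a \int_0^{\varphi} \sqrt{1 - \varepsilon^2 \sin^2\varphi}\,d\varphi,$$

where use has been made of the fact that
$a^2 \cos^2\varphi + b^2 \sin^2\varphi = a^2 - (a^2 - b^2) \sin^2\varphi = a^2(1 - \varepsilon^2 \sin^2\varphi)$.

Parametric representation of the ellipse (figure)

Since $|\varepsilon| < 1$, the square root may be expanded in a uniformly convergent binomial series

$$\sqrt{1 - \varepsilon^2 \sin^2\varphi} = 1 - \frac{\varepsilon^2}{2} \sin^2\varphi - \frac{\varepsilon^4}{2 \cdot 4} \sin^4\varphi - \frac{1 \cdot 3}{2 \cdot 4 \cdot 6}\,\varepsilon^6 \sin^6\varphi - \cdots$$

Term-by-term integration leads to

$$s = a \left(\varphi - \frac{\varepsilon^2}{2} \int_0^{\varphi} \sin^2\varphi\,d\varphi - \frac{\varepsilon^4}{2 \cdot 4} \int_0^{\varphi} \sin^4\varphi\,d\varphi - \cdots\right),$$

which makes it possible to calculate the arc length $s$ for any angle $\varphi$. To find the
circumference, put $\varphi = \pi/2$ and calculate the length of a quarter of the circumference, using the
recurrence formulae in Chapter 20. for the derivation of Wallis' product, to obtain for the whole
circumference:

$$2\pi a \left[1 - \left(\frac{1}{2}\right)^2 \varepsilon^2 - \left(\frac{1 \cdot 3}{2 \cdot 4}\right)^2 \frac{\varepsilon^4}{3} - \left(\frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6}\right)^2 \frac{\varepsilon^6}{5} - \cdots\right]$$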

To estimate the error $\Delta$ caused by using the approximation formula
$\pi[3(a + b)/2 - \sqrt{ab}]$ for the circumference, one first uses the relations
$(a + b)/2 = (a/2)[1 + \sqrt{1 - \varepsilon^2}]$, $\sqrt{ab} = a \sqrt[4]{1 - \varepsilon^2}$, to expand
these expressions in binomial series with remainders $R_4'$ and $R_4''$ for which upper estimates $r'$ and
$r''$ can be found. From these series one can then find a series for $\pi[3(a + b)/2 - \sqrt{ab}]$, which
differs from the exact series only by terms in $\varepsilon^8$ and beyond. If $r$ is an upper estimate for
the accuracy of the exact series, then one gets the error estimate $\Delta < r + 3r' + r''$, and if one
carries through the calculations, it turns out that $\Delta < 0.4\,\varepsilon^8/(1 - \varepsilon^2)$.
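The series for the circumference can be compared with a direct numerical integration of the arc-length integral and with the approximation $\pi[3(a + b)/2 - \sqrt{ab}]$. This is a sketch under stated assumptions: the semi-axes $a = 2$, $b = 1$, the number of series terms and the function names are illustrative choices, and the coefficient $[(1 \cdot 3 \cdots (2k-1))/(2 \cdot 4 \cdots 2k)]^2/(2k - 1)$ of $\varepsilon^{2k}$ is the one that results from combining the binomial coefficients with the Wallis integrals.

```python
import math

def circumference_series(a, b, terms=60):
    """2*pi*a*[1 - sum_k c_k^2 * eps^(2k) / (2k - 1)], c_k = (1*3*...*(2k-1)) / (2*4*...*2k)."""
    eps2 = 1 - (b / a) ** 2
    total = 1.0
    coeff = 1.0
    for k in range(1, terms + 1):
        coeff *= (2 * k - 1) / (2 * k)
        total -= coeff ** 2 * eps2 ** k / (2 * k - 1)
    return 2 * math.pi * a * total

def circumference_quadrature(a, b, n=20_000):
    """Four times the quarter arc a * integral of sqrt(1 - eps^2 sin^2(phi)), midpoint rule."""
    eps2 = 1 - (b / a) ** 2
    h = (math.pi / 2) / n
    quarter = a * h * sum(
        math.sqrt(1 - eps2 * math.sin((i + 0.5) * h) ** 2) for i in range(n)
    )
    return 4 * quarter

def circumference_approximation(a, b):
    """Approximation pi * [3(a + b)/2 - sqrt(ab)] discussed in the text."""
    return math.pi * (1.5 * (a + b) - math.sqrt(a * b))

a_axis, b_axis = 2.0, 1.0   # illustrative semi-axes with a >= b
series_value = circumference_series(a_axis, b_axis)
quadrature_value = circumference_quadrature(a_axis, b_axis)
approx_value = circumference_approximation(a_axis, b_axis)
print(series_value, quadrature_value, approx_value)
```

For a circle ($a = b$) all three expressions reduce to $2\pi a$; for more eccentric ellipses the approximation drifts away from the series value, in line with the $\varepsilon^8$ behaviour of the error.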


21.2. Power series 


Power series are special series of functions, in which the functions are powers of the variable
multiplied by a coefficient, $f_n(x) = a_n x^n$. The partial sums
$F_n(x) = a_0 + a_1 x + \cdots + a_n x^n$ are polynomials, which are defined for all $x$. The range of
convergence of the power series
$F(x) = \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + a_2 x^2 + \cdots$ must be examined in each particular
case. It can happen that the series is always convergent, that is, convergent for all $x$, or is never
convergent except when $x = 0$.




Examples: 1. The series $F(x) = \sum_{n=1}^{\infty} n^n x^n = x + 4x^2 + 27x^3 + 256x^4 + \cdots$ is never
convergent.

2. The series $F(x) = \sum_{n=1}^{\infty} \frac{x^n}{n!} = x + \frac{x^2}{2} + \frac{x^3}{6} + \frac{x^4}{24} + \cdots$
is always convergent.


Convergence of power series 


For each fixed value x = x9 the terms of the power series can be regarded as constants so that 
the results in Chapter 18. can be used. In particular, the concept of absolute convergence, 
that is, convergence of the series of absolute values, is applicable to power series. It can be proved, 
though the proof will not be given here, that the power series $\sum a_n x^n$ converges absolutely whenever
$|x| < |x_1|$ if the series $\sum a_n x_1^n$ converges.


The radius of convergence of power series. A positive number r is called the radius of convergence
of a power series if the series converges for every x with |x| < r, but diverges for |x| > r; the
interval from -r to +r is the range or interval of convergence (Fig.). One may put $r = \infty$ for an
always convergent series and r = 0 for a never convergent series.

21.2-1 Convergence interval of a power series

Theorem of Abel. For every power series that is neither always convergent nor never convergent,
there exists an r > 0 such that the series converges for |x| < r and diverges for |x| > r.
The Cauchy-Hadamard formula. A formula for finding the radius of convergence was stated in
1821 by CAUCHY (1789-1857) but attracted no attention. It was only rediscovered 70 years
later by HADAMARD (1865-1963). One considers the upper limit $\mu = \overline{\lim}_{n \to \infty} \sqrt[n]{|a_n|}$ of the sequence

$|a_1|,\ \sqrt{|a_2|},\ \sqrt[3]{|a_3|},\ \ldots,\ \sqrt[n]{|a_n|},\ \ldots,$

that is, the number $\mu$ with the property that for every $\varepsilon > 0$ infinitely many terms of the sequence
are greater than $\mu - \varepsilon$, but only finitely many are greater than $\mu + \varepsilon$.
If $\mu$ is finite and positive, $0 < \mu < +\infty$, then $1/\mu$ is also finite and positive, and one can find
$x_1$ and $\varrho$ such that $|x_1| < \varrho < 1/\mu$, so that $1/\varrho > \mu$. This means that $\sqrt[n]{|a_n|} < 1/\varrho$ or
$\sqrt[n]{|a_n x_1^n|} < |x_1|/\varrho < 1$ for all $n > N_1$. The power series therefore converges absolutely at $x_1$.
On the other hand, if $|x_2| > 1/\mu$, then $\sqrt[n]{|a_n|} > 1/|x_2|$ or $|a_n x_2^n| > 1$ for
infinitely many n, so that the series diverges at $x_2$. Thus,
the number $r = 1/\mu$ is the radius of convergence.
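The root criterion can be tried out numerically. The following is only a rough sketch: the upper limit is replaced by the value of $\sqrt[n]{|a_n|}$ at one large index n, which is a heuristic and not a proof, and the helper name `root_test_radius` is invented for this illustration. Logarithms of the coefficients are used so that very large or very small coefficients do not overflow.

```python
import math

def root_test_radius(log_abs_coeff, n):
    """Heuristic estimate of r = 1/mu, taking mu ~ |a_n|**(1/n) at one large n.

    log_abs_coeff(n) must return ln|a_n|.
    """
    return math.exp(-log_abs_coeff(n) / n)

# a_n = 1/n**2: radius of convergence 1
r1 = root_test_radius(lambda n: -2.0 * math.log(n), 10_000)
# a_n = 2**n: radius of convergence 1/2
r2 = root_test_radius(lambda n: n * math.log(2.0), 10_000)
# a_n = 1/n!: always convergent, so the estimate keeps growing with n
r3 = root_test_radius(lambda n: -math.lgamma(n + 1), 10_000)
print(r1, r2, r3)
```

For the first two series the estimates lie close to 1 and 1/2; for the third the estimate is large and increases with n, in keeping with $r = \infty$.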

The following result, stated without proof, sometimes
makes it possible to use the sequence $(|a_{n+1}/a_n|)$ rather than $(\sqrt[n]{|a_n|})$.

If the sequence $\{|a_{n+1}/a_n|\}$ converges to a limit, then the sequence $\{\sqrt[n]{|a_n|}\}$ also converges and to
the same limit.


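Example: The power series $\sum_{n=1}^{\infty} x^n$, $\sum_{n=1}^{\infty} x^n/n$, $\sum_{n=1}^{\infty} x^n/n^2$, ..., $\sum_{n=1}^{\infty} x^n/n^p$ (p > 0 fixed)
all have the same radius of convergence r = 1. It suffices to calculate, for each p = 0, 1, 2, ...,
the following limit: $\lim_{n \to \infty} |a_{n+1}/a_n| = \lim_{n \to \infty} |n^p/(n + 1)^p| = \lim_{n \to \infty} [1 - 1/(n + 1)]^p = 1$.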
It is not possible to make any generally valid assertions about the behaviour of a power series
at the ends x = +r and x = -r of the interval of convergence; a separate investigation must be
made in each particular case. For instance, in the first three series of the preceding example one
finds that:

1. the series $\sum_{n=1}^{\infty} x^n$ diverges for x = -1 and x = +1;
2. the series $\sum_{n=1}^{\infty} x^n/n$ converges for x = -1 and diverges for x = +1;
3. the series $\sum_{n=1}^{\infty} x^n/n^2$ converges for x = -1 and x = +1.


484 21. Series of functions 


If a power series $\sum_{n=0}^{\infty} a_n x^n$ has the radius of convergence r, then it converges absolutely for any x
with |x| < r.
Uniform convergence of power series. A theorem due to ABEL states: 
A power series converges uniformly in every closed interval that lies entirely inside the interval 
of convergence. 
According to this theorem all results on series of functions obtained on the assumption of uniform 
convergence are valid for power series. Hence in every closed interval inside the interval of
convergence $f(x) = \sum a_n x^n$ is a continuous function whose integral can be obtained by term-by-term
integration. Its derivative can be obtained by term-by-term differentiation, as will be shown later.

Power series with a complex variable. For power series with complex coefficients and a complex 
variable, the interval of convergence is replaced by a circular disc, whose radius is again called the 
radius of convergence (see Chapter 23.). 


Important properties of power series 
Identity theorem for power series. The power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$ is a continuous function inside
its convergence interval |x| < r, and in particular at x = 0. If the power series $g(x) = \sum_{n=0}^{\infty} b_n x^n$ is
defined in the same interval, and hence continuous there, and if there is a sequence $x_k$ with
infinitely many non-zero terms and x = 0 as an accumulation point, then it follows from $f(x_k) = g(x_k)$
for all $x_k$ and from $\lim_{k \to \infty} f(x_k) = a_0$, $\lim_{k \to \infty} g(x_k) = b_0$, that $a_0 = b_0$. Since $x_k \neq 0$, one can now consider
two new functions

$f_1(x_k) = (f(x_k) - a_0)/x_k = a_1 + a_2 x_k + a_3 x_k^2 + \cdots,$

$g_1(x_k) = (g(x_k) - b_0)/x_k = b_1 + b_2 x_k + b_3 x_k^2 + \cdots,$

for which again $f_1(x_k) = g_1(x_k)$, so that one obtains $a_1 = b_1$ letting $k \to \infty$. The procedure can be
repeated to obtain $a_2 = b_2$, and by induction it follows that $a_n = b_n$ for all n, so that the two power
series are identical.


If the power series $\sum a_n x^n$ and $\sum b_n x^n$ converge for |x| < r and if their sums coincide on a sequence
of points $x_k$ with $x_k \neq 0$ and $x_k \to 0$, then the series are identical, that is, $a_n = b_n$ for all n.


The identity theorem also holds for power series of the form $\sum_{n=0}^{\infty} a_n (x - x_0)^n$. If a function f(x)
can be represented, in a neighbourhood of $x_0$, by such a power series, then this representation is
unique: if two methods of calculation lead to two power series representing a given function, then
the coefficients of corresponding powers must be equal. The method of equating coefficients, which
was derived in Chapter 5., is therefore applicable to power series.


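Example: For arbitrary real numbers $\alpha$ and $\beta$, $(1 + x)^{\alpha} (1 + x)^{\beta} = (1 + x)^{\alpha + \beta}$. In the domain
of convergence |x| < 1 each factor can be represented by the binomial series

$(1 + x)^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n, \quad (1 + x)^{\beta} = \sum_{n=0}^{\infty} \binom{\beta}{n} x^n, \quad (1 + x)^{\alpha + \beta} = \sum_{n=0}^{\infty} \binom{\alpha + \beta}{n} x^n.$

If one now uses the theorem about multiplication of power series, one obtains

$\sum_{n=0}^{\infty} x^n \left[ \binom{\alpha}{0}\binom{\beta}{n} + \binom{\alpha}{1}\binom{\beta}{n-1} + \cdots + \binom{\alpha}{n}\binom{\beta}{0} \right] = \sum_{n=0}^{\infty} \binom{\alpha + \beta}{n} x^n.$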


Comparison of coefficients leads to a surprisingly simple proof of the addition theorem for binomial 
coefficients. 


Transformation to a new centre. All the results found for power series remain valid if one uses
$(x - x_0)$ instead of x as a variable. The function $f(x) = \sum a_n (x - x_0)^n$ is continuous inside the
interval $|x - x_0| < r$. Now if $x_1$ lies in this interval, it can be taken as the centre for a new expansion
$f(x) = \sum_{k=0}^{\infty} b_k (x - x_1)^k$ of the same function f(x).

Its radius of convergence is at least $r_1 = r - |x_1 - x_0|$. For each point x in the interval
$|x - x_1| < r_1$ the relation $|x_1 - x_0| + |x - x_1| < r$ holds (Fig.). Therefore if one substitutes
$x - x_0 = (x_1 - x_0) + (x - x_1)$ in the series in powers of $x - x_0$, then not only is this series
$f(x) = \sum_{n=0}^{\infty} a_n [(x_1 - x_0) + (x - x_1)]^n$ absolutely convergent, but also the series
$\sum_{n=0}^{\infty} |a_n| (|x_1 - x_0| + |x - x_1|)^n$ converges.

21.2-2 Transformation of a power series to a new centre

Under these assumptions the major rearrangement theorem holds (see Chapter 18.):
If one expands each term $[(x_1 - x_0) + (x - x_1)]^n$ by the binomial theorem for n = 0, 1, 2, ... and
collects together powers of $(x - x_1)$ as follows:

$f(x) = a_0 (x_1 - x_0)^0$
$\quad + a_1 (x_1 - x_0)^1 + a_1 \binom{1}{1} (x_1 - x_0)^0 (x - x_1)$
$\quad + a_2 (x_1 - x_0)^2 + a_2 \binom{2}{1} (x_1 - x_0)^1 (x - x_1) + a_2 \binom{2}{2} (x_1 - x_0)^0 (x - x_1)^2$
$\quad + a_3 (x_1 - x_0)^3 + a_3 \binom{3}{1} (x_1 - x_0)^2 (x - x_1) + a_3 \binom{3}{2} (x_1 - x_0)^1 (x - x_1)^2 + \cdots$
$\quad + \cdots$
$\quad + a_n (x_1 - x_0)^n + a_n \binom{n}{1} (x_1 - x_0)^{n-1} (x - x_1) + a_n \binom{n}{2} (x_1 - x_0)^{n-2} (x - x_1)^2 + \cdots$
$\quad + \cdots$
$f(x) = b_0 + b_1 (x - x_1) + b_2 (x - x_1)^2 + \cdots,$

where $b_0 = \sum_{n=0}^{\infty} \binom{n}{0} a_n (x_1 - x_0)^n$, $b_1 = \sum_{n=0}^{\infty} \binom{n+1}{1} a_{n+1} (x_1 - x_0)^n$, $b_2 = \sum_{n=0}^{\infty} \binom{n+2}{2} a_{n+2} (x_1 - x_0)^n$, ...,
then each column contains an absolutely convergent series, the column sums form an absolutely
convergent series, and for each x in the interval $|x - x_1| < r_1$ the function value f(x) is given by
$\sum_{k=0}^{\infty} b_k (x - x_1)^k$. Hence $f(x) = \sum_{k=0}^{\infty} b_k (x - x_1)^k$, with $b_k = \sum_{n=0}^{\infty} \binom{n+k}{k} a_{n+k} (x_1 - x_0)^n$.
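The formula for the new coefficients, $b_k = \sum_n \binom{n+k}{k} a_{n+k} (x_1 - x_0)^n$, can be checked on the geometric series $1/(1 - x)$, for which every $a_n = 1$ and the exact value is $b_k = 1/(1 - x_1)^{k+1}$. The sketch below assumes a truncated coefficient list, so the infinite sums are only approximated; the function name `recentre` is chosen for this illustration.

```python
from math import comb

def recentre(a, shift, num_terms):
    """Approximate b_k of sum b_k (x - x1)**k from a_n of sum a_n (x - x0)**n.

    a is a truncated list of the a_n and shift = x1 - x0; each b_k is the
    finite sum of comb(n + k, k) * a[n + k] * shift**n over the available n.
    """
    return [
        sum(comb(n + k, k) * a[n + k] * shift**n for n in range(len(a) - k))
        for k in range(num_terms)
    ]

# Geometric series 1/(1 - x): a_n = 1, r = 1, new centre x1 = 0.25
a = [1.0] * 200
b = recentre(a, 0.25, 4)
exact = [(1 / (1 - 0.25)) ** (k + 1) for k in range(4)]
print(b)
print(exact)
```

The shift 0.25 lies well inside the interval of convergence, so 200 terms are far more than needed; closer to the boundary more terms would be required.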


Term-by-term differentiation of power series. The transformation just discussed serves to obtain
power series expansions centred on an arbitrary point $x_1$ in the interval of convergence $|x - x_0| < r$
of the original power series (centred on $x_0$). The representation $f(x) = \sum_{k=0}^{\infty} b_k (x - x_1)^k$ shows that
the function f(x) is differentiable at $x = x_1$ because $f(x_1) = b_0$, and so

$[f(x) - f(x_1)]/(x - x_1) = b_1 + b_2 (x - x_1) + b_3 (x - x_1)^2 + \cdots$

and $\lim_{x \to x_1} [f(x) - f(x_1)]/(x - x_1) = f'(x_1) = b_1 = \sum_{n=0}^{\infty} \binom{n+1}{1} a_{n+1} (x_1 - x_0)^n$.

This holds for every $x_1$ inside the convergence interval, and therefore

$f'(x) = \sum_{n=0}^{\infty} (n + 1)\, a_{n+1} (x - x_0)^n.$

This is precisely the series obtained by term-by-term differentiation of the series $f(x) = \sum a_n (x - x_0)^n$.
This procedure can be repeated, and the next step leads to

$[f'(x) - f'(x_1)]/(x - x_1) = 2 b_2 + 3 b_3 (x - x_1) + \cdots$

and

$\frac{1}{2!} f''(x_1) = b_2$, where $b_2 = \sum_{n=0}^{\infty} \binom{n+2}{2} a_{n+2} (x_1 - x_0)^n$,

this series again being absolutely convergent. Proof by induction leads to the theorem:


A function represented by a power series is differentiable arbitrarily often at every interior point
of its interval of convergence. The derivatives may be calculated by term-by-term differentiation of
the power series.


Sum, difference and product of two power series. For any point that lies inside the convergence
interval of each of two power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$ and $g(x) = \sum_{n=0}^{\infty} b_n x^n$, each series converges
absolutely. From theorems in Chapter 18. one can deduce that the sum, difference, and product of
the functions f(x) and g(x) may be represented by power series with appropriate coefficients.


For each x that belongs to the domains of convergence of each of the power series $f(x) = \sum_{n=0}^{\infty} a_n x^n$
and $g(x) = \sum_{n=0}^{\infty} b_n x^n$ one has $f(x) \pm g(x) = \sum_{n=0}^{\infty} (a_n \pm b_n) x^n$.
Further, $f(x) \cdot g(x) = \sum_{n=0}^{\infty} (a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0) x^n$, where this series converges absolutely.


The coefficients in the product series can be found as diagonal sums in the array of products or 
by the method of sliding strips (see Chapter 18.). 


Example: The geometric series $1/(1 - x) = 1 + x + x^2 + x^3 + x^4 + \cdots$ converges for |x| < 1.
As one can see from the position of the sliding strip for the third or fourth term of the product,
multiplication of the series by itself leads successively to

$1/(1 - x)^2 = 1 + 2x + 3x^2 + 4x^3 + \cdots$ and $1/(1 - x)^3 = 1 + 3x + 6x^2 + 10x^3 + \cdots$


Further examples concerning powers of the sine and cosine series will be given after these series 
have been derived. 
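The product rule for coefficients, $c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0$, is easy to try on the geometric series. The following short sketch (the function name `cauchy_product` is chosen here for illustration) works with truncated coefficient lists, so only the first few coefficients of each product are produced.

```python
def cauchy_product(a, b):
    """First min(len(a), len(b)) coefficients of the product of two power series.

    c_n = a_0*b_n + a_1*b_(n-1) + ... + a_n*b_0
    """
    n_terms = min(len(a), len(b))
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(n_terms)]

geometric = [1] * 6                            # 1/(1 - x)
square = cauchy_product(geometric, geometric)  # 1/(1 - x)**2
cube = cauchy_product(square, geometric)       # 1/(1 - x)**3
print(square)  # [1, 2, 3, 4, 5, 6]
print(cube)    # [1, 3, 6, 10, 15, 21]
```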


Substitution of one power series into another. If in a composite function (see Chapter 19.), for
instance, in $y = f[\varphi(x)]$, the inner function $z = \varphi(x) = \sum a_n x^n$ is represented by a power series
and if, for x inside the domain of convergence, this series assumes values z that lie in the domain
of convergence of a power series $f(z) = \sum b_k z^k$ which represents the function y = f(z), then this
defines a composite function $y = F(x) = f[\varphi(x)]$. One may then ask whether this function F(x) can
be represented by a power series $F(x) = \sum_{n=0}^{\infty} c_n x^n$, and how the coefficients $c_n$ can be found in terms
of the coefficients $a_n$ and $b_k$. Now

$F(x) = b_0 + b_1 (a_0 + a_1 x + \cdots) + b_2 (a_0 + a_1 x + \cdots)^2 + \cdots$

and the powers $z^k$ of $z = \sum a_n x^n$ can, by multiplication of power series, be written as power series
$z^k = a_{k0} + a_{k1} x + \cdots + a_{kn} x^n + \cdots$.
If one now substitutes these expressions in the series $f(z) = \sum b_k z^k$ and collects terms involving
the same power of x to obtain a power series $\sum c_n x^n$, it can be proved that this series converges
absolutely and represents the function $F(x) = f[\varphi(x)]$.


Division by a power series. The task of dividing a power series $\sum b_n x^n$ by a power series $\sum a_n x^n$
can be reduced to finding a product, provided that $1/\sum a_n x^n$ can be represented by a power series.
If one assumes that $\sum a_n x^n = a_0 + (a_1 x + a_2 x^2 + \cdots) = a_0 + z$ has the radius of convergence
r > 0 and that $a_0$ is not zero, one can find part of the range of convergence in which
$|z| = |a_1 x + a_2 x^2 + \cdots| < |a_0|$, and then the reciprocal of $\sum a_n x^n$ can be expanded in a geometric series

$\frac{1}{a_0 + z} = \frac{1}{a_0} \cdot \frac{1}{1 + z/a_0} = \frac{1}{a_0} - \frac{z}{a_0^2} + \frac{z^2}{a_0^3} - + \cdots,$

within which $z = a_1 x + a_2 x^2 + \cdots$ also converges. By substituting the power series for z in the
geometric series and collecting terms, as was done earlier, one obtains a power series $\sum c_n x^n$, which
converges absolutely and represents the function $1/\sum a_n x^n$. Once one knows conditions under
which the power series $\sum c_n x^n$ exists, it is easier to find the coefficients $c_n$ from the identity
$\sum a_n x^n \cdot \sum c_n x^n = 1$ and comparison of coefficients. This leads to a system of equations from which
the unknowns $c_0, c_1, c_2, \ldots$ can be calculated step-by-step:

$a_0 c_0 = 1, \quad a_0 c_1 + a_1 c_0 = 0, \quad \ldots, \quad a_0 c_n + a_1 c_{n-1} + \cdots + a_n c_0 = 0, \quad \ldots$

More generally, if $\sum b_n x^n$ is to be divided by $\sum a_n x^n$, the identity $\sum b_n x^n = \sum a_n x^n \cdot \sum c_n x^n$ and
comparison of coefficients leads to equations to determine the $c_n$.
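The step-by-step solution for the $c_n$ can be sketched in a few lines. The example uses exact rational arithmetic and, as an illustration, divides the sine series by the cosine series; the function name `divide_series` is an assumption of this sketch, and the coefficient lists are truncated after eight terms.

```python
from fractions import Fraction
from math import factorial

def divide_series(b, a):
    """Coefficients c_n with (sum a_n x^n) * (sum c_n x^n) = sum b_n x^n, a[0] != 0."""
    c = []
    for n in range(min(len(a), len(b))):
        # b_n = a_0 c_n + a_1 c_(n-1) + ... + a_n c_0, solved for c_n
        s = sum(a[i] * c[n - i] for i in range(1, n + 1))
        c.append((b[n] - s) / a[0])
    return c

N = 8
sin_c = [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 == 1 else Fraction(0)
         for n in range(N)]
cos_c = [Fraction((-1) ** (n // 2), factorial(n)) if n % 2 == 0 else Fraction(0)
         for n in range(N)]
tan_c = divide_series(sin_c, cos_c)
print(tan_c[1], tan_c[3], tan_c[5], tan_c[7])  # 1 1/3 2/15 17/315
```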


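Example: The series

$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - + \cdots, \quad \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - + \cdots,$

which converge in an x-interval to be derived later, can be used to find a power series for
$\tan x = \sin x/\cos x$ by division. The conditions for the existence of the power series $\sum c_n x^n$ imply
that division is possible only in that part of the domain $|x| < r = \infty$ of the cosine series in which
$\cos x \neq 0$, that is, in the interval $|x| < \pi/2$. The identity for finding coefficients reads
$\sin x = \cos x \cdot \sum c_n x^n$.

The position of the sliding strips for one coefficient is shown in the accompanying figure. One obtains

$c_0 = 0, \quad c_1 = 1, \quad c_2 - \frac{c_0}{2!} = 0, \quad c_3 - \frac{c_1}{2!} = -\frac{1}{3!}, \quad c_4 - \frac{c_2}{2!} + \frac{c_0}{4!} = 0, \quad c_5 - \frac{c_3}{2!} + \frac{c_1}{4!} = \frac{1}{5!}, \ \ldots$

and, step-by-step,

$c_0 = c_2 = c_4 = \cdots = 0, \quad c_1 = 1, \quad c_3 = \frac{1}{2!} - \frac{1}{3!} = \frac{1}{3}, \quad c_5 = \frac{1}{5!} + \frac{c_3}{2!} - \frac{c_1}{4!} = \frac{2}{15}, \quad c_7 = \frac{17}{315}, \ \ldots,$

that is, the expansion, valid for $|x| < \pi/2$: $\tan x = x + \frac{x^3}{3} + \frac{2x^5}{15} + \frac{17x^7}{315} + \cdots$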

Bernoulli numbers. If one uses the expansion, to be derived later, $e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$
for the exponential function, then the function

$\varphi(x) = \frac{x}{e^x - 1} = \frac{1}{1 + x/2! + x^2/3! + \cdots} = B_0 + B_1 x/1! + B_2 x^2/2! + \cdots$

satisfies the conditions that permit the division, hence the expansion in a power series. As indicated
in the above equation, the coefficients are put into the form $B_n/n!$. The numbers $B_n$ are called
Bernoulli numbers and can be calculated from the identity
$1 = (1 + x/2! + x^2/3! + \cdots)\,[B_0 + (B_1/1!)\,x + (B_2/2!)\,x^2 + \cdots]$.
Comparison of coefficients (Fig.) leads to the relations
$B_0 = 1$, $B_0/2! + B_1 = 0$,
$B_0/3! + B_1/(1!\,2!) + B_2/2! = 0$,
$B_0/4! + B_1/(1!\,3!) + B_2/(2!\,2!) + B_3/3! = 0$, ...,
or $B_0 = 1$, $2B_1 + B_0 = 0$,
$3B_2 + 3B_1 + B_0 = 0$,
$4B_3 + 6B_2 + 4B_1 + B_0 = 0$,
$5B_4 + 10B_3 + 10B_2 + 5B_1 + B_0 = 0$, ...
This leads, step-by-step, to
$B_0 = 1$, $B_1 = -1/2$, $B_2 = 1/6$, $B_3 = 0$, $B_4 = -1/30$, ...
From n = 3 onwards all $B_n$ with odd index n are zero.
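The relations above all have the form $\binom{n+1}{0} B_0 + \binom{n+1}{1} B_1 + \cdots + \binom{n+1}{n} B_n = 0$ for n = 1, 2, 3, ..., so each new Bernoulli number follows from the earlier ones. A minimal sketch with exact fractions (the function name `bernoulli_numbers` is chosen for this illustration):

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """B_0 ... B_(count-1) from sum_{k=0}^{n} comb(n+1, k) * B_k = 0 for n >= 1."""
    B = [Fraction(1)]
    for n in range(1, count):
        s = sum(comb(n + 1, k) * B[k] for k in range(n))
        B.append(-s / (n + 1))   # comb(n+1, n) = n + 1 multiplies B_n
    return B

B = bernoulli_numbers(9)
print([str(b) for b in B])
```

The list reproduces $B_1 = -1/2$, $B_2 = 1/6$, $B_4 = -1/30$, and the vanishing of the odd-indexed numbers from n = 3 onwards.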


Inversion theorem for power series. Under suitable conditions of monotonicity, a function y = f(x)
has an inverse function $x = \varphi(y)$ (see Chapter 5.). For power series an analogous theorem,
which will not be proved here, can be derived.


For a power series $y = f(x) = a_1 x + a_2 x^2 + a_3 x^3 + \cdots$ with radius of convergence r and $a_1 \neq 0$
there exists exactly one power series $x = \varphi(y) = b_1 y + b_2 y^2 + b_3 y^3 + \cdots$ that is convergent in
a neighbourhood of y = 0 and such that $y = f[\varphi(y)]$.


Once it has been shown that the power series $x = b_1 y + b_2 y^2 + \cdots$ has a positive radius of
convergence $r_1$, its coefficients can be found by substituting this series in each term of the given power
series $y = a_1 x + a_2 x^2 + \cdots$ and equating coefficients of powers of y. This determines the
coefficients uniquely, so that there is only one power series expansion for $x = \varphi(y)$.


Example: From the power series $y = \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - + \cdots$ the coefficients $b_n$ in
$x = \operatorname{Arcsin} y = b_1 y + b_2 y^2 + b_3 y^3 + \cdots$ can be obtained by substitution as follows:

$y = b_1 y + b_2 y^2 + b_3 y^3 + b_4 y^4 + b_5 y^5 + \cdots$
$\quad - (1/6)\,[\,b_1^3 y^3 + 3 b_1^2 b_2 y^4 + (3 b_1^2 b_3 + 3 b_1 b_2^2)\, y^5 + \cdots]$
$\quad + (1/120)\,[\,b_1^5 y^5 + \cdots]$,

so that
$1 = b_1$, $0 = b_2$, $0 = b_3 - b_1^3/6$, $0 = b_4 - b_1^2 b_2/2$,
$0 = b_5 - b_1 b_2^2/2 - b_1^2 b_3/2 + b_1^5/120$, ...
This leads step-by-step to
$b_1 = 1$, $b_2 = 0$, $b_3 = 1/6$, $b_4 = 0$, $b_5 = 3/40$, ...
or $x = \operatorname{Arcsin} y = y + y^3/6 + 3y^5/40 + \cdots$

Taylor series 


A power series $\sum a_n x^n$ with positive radius of convergence r defines a function $f(x) = \sum a_n x^n$
which is a continuous function of x for |x| < r. Repeated differentiation term-by-term of the series
gives the derivatives of arbitrary order of the function f(x). On the other hand, if a function f(x),
such as $\sin x$, $\sqrt{1 + x^2}$ or $\arctan x$, is given, it still remains to be shown that this function can be
expanded in a convergent power series and how the coefficients in this expansion can be determined.
This problem was solved by Brook TAYLOR (1685-1731) and Colin MACLAURIN (1698-1746).


If the given function f(x) can be expanded at all as a power series, $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots
+ a_n x^n + \cdots$, then f(x) must be differentiable arbitrarily often. Differentiation term-by-term, and
then setting x = 0 leads successively to

$f'(x) = a_1 + 2 a_2 x + \cdots + n a_n x^{n-1} + \cdots$, so that $f'(0) = a_1$,
$f''(x) = 2 a_2 + 3 \cdot 2\, a_3 x + \cdots + n(n - 1)\, a_n x^{n-2} + \cdots$, so that $f''(0) = 2!\, a_2$,
$f'''(x) = 3 \cdot 2 \cdot 1\, a_3 + \cdots + n(n - 1)(n - 2)\, a_n x^{n-3} + \cdots$, so that $f'''(0) = 3!\, a_3$,
$f^{(n)}(x) = n!\, a_n + (n + 1) \cdots 2\, a_{n+1} x + \cdots$, so that $f^{(n)}(0) = n!\, a_n$.

The power series, provided that it converges, then takes the form of a so-called MACLAURIN
series $f(x) = f(0) + (f'(0)/1!)\, x + (f''(0)/2!)\, x^2 + \cdots$. Similar considerations hold for a power
series $\sum a_n (x - x_0)^n$ with centre $x_0$ and lead to the so-called TAYLOR series

$f(x) = f(x_0) + (f'(x_0)/1!)\,(x - x_0) + (f''(x_0)/2!)\,(x - x_0)^2 + \cdots$

If one replaces x by $x_0 + h$, one obtains

$f(x_0 + h) = f(x_0) + (f'(x_0)/1!)\, h + (f''(x_0)/2!)\, h^2 + \cdots$
Taylor's theorem. To investigate the convergence of these series one introduces the partial sums
and the remainder $R_n$ (see Taylor's form in Chapter 18.):
$f(x_0 + h) = f(x_0) + (f'(x_0)/1!)\, h + \cdots + (f^{(n)}(x_0)/n!)\, h^n + R_n$.
The remainder $R_n$ represents the difference between the given function and an approximating
function, and can be estimated in terms of the (n + 1)th derivative of f(x). This form of the remainder
usually shows at once that it has the limit zero as $n \to \infty$, so that the series converges to f(x).


Taylor's theorem. If the function f(x) has a continuous nth derivative $f^{(n)}(x)$ in the closed interval
from $x_0$ to $x_0 + h$, and if its (n + 1)th derivative exists at least inside this interval, then the remainder
$R_n$ in $f(x_0 + h) = f(x_0) + (f'(x_0)/1!)\, h + (f''(x_0)/2!)\, h^2 + \cdots + (f^{(n)}(x_0)/n!)\, h^n + R_n$
can be written in

a) Lagrange's form: there exists a number $\vartheta$ with $0 < \vartheta < 1$ such that

$R_n = \frac{h^{n+1}}{(n + 1)!}\, f^{(n+1)}(x_0 + \vartheta h)$, or

b) Cauchy's form: there exists a number $\vartheta'$ with $0 < \vartheta' < 1$ such that

$R_n = \frac{h^{n+1}}{n!}\, (1 - \vartheta')^n\, f^{(n+1)}(x_0 + \vartheta' h)$.

To establish these forms of the remainder one uses an extension of the mean value theorem of
the differential calculus, which states that for two continuous functions F(x) and $\varphi(x)$ that are
continuously differentiable inside the interval $[x_0, x_0 + h]$ with $\varphi'(x) \neq 0$ there exists a number $\vartheta$
with $0 < \vartheta < 1$ such that

$\frac{F(x_0 + h) - F(x_0)}{\varphi(x_0 + h) - \varphi(x_0)} = \frac{F'(x_0 + \vartheta h)}{\varphi'(x_0 + \vartheta h)}.$

Choosing the function $\varphi(x)$ as
$\varphi(x) = (x_0 + h - x)^{n+1}$, one has $\varphi(x_0) = h^{n+1}$, $\varphi(x_0 + h) = 0$,
and $\varphi'(x) = -(n + 1)\,(x_0 + h - x)^n$.



The function F(x) is obtained from the remainder

$R_n = f(x_0 + h) - f(x_0) - h f'(x_0)/1! - \cdots - h^n f^{(n)}(x_0)/n!$

by putting $x_1$ for $(x_0 + h)$ and $(x_1 - x_0)$ for h and then making $x_0$ a variable x. This function

$F(x) = f(x_1) - f(x) - (x_1 - x)\, f'(x)/1! - (x_1 - x)^2 f''(x)/2! - \cdots - (x_1 - x)^n f^{(n)}(x)/n!$

is continuous in the interval $[x_0, x_0 + h]$, differentiable in its interior, and takes the values
$F(x_0) = R_n$, $F(x_0 + h) = 0$, $F'(x) = -(x_1 - x)^n f^{(n+1)}(x)/n!$.
Consequently the generalized mean value theorem yields

$\frac{-R_n}{-h^{n+1}} = \frac{-[h^n (1 - \vartheta)^n/n!]\, f^{(n+1)}(x_0 + \vartheta h)}{-(n + 1)\, h^n (1 - \vartheta)^n}$

or

$R_n = [h^{n+1}/(n + 1)!]\, f^{(n+1)}(x_0 + \vartheta h),$

that is, Lagrange's form of the remainder.
Different choices of the auxiliary function $\varphi(x)$ lead to other forms of the remainder. Cauchy's
form is obtained by choosing $\varphi(x) = x_0 + h - x$.
Remainder in the Maclaurin series. Taylor's theorem also holds for this series. The remainder
takes the forms $R_n = [x^{n+1}/(n + 1)!]\, f^{(n+1)}(\vartheta x)$ (Lagrange), $R_n = [x^{n+1}/n!]\, (1 - \vartheta')^n f^{(n+1)}(\vartheta' x)$
(Cauchy).


Trigonometric functions. The functions sin x and cos x have the following derivatives at x = 0:

sin x                                              cos x
$f(0) = f^{(4k)}(0) = \sin 0 = 0$                  $f(0) = f^{(4k)}(0) = \cos 0 = 1$
$f'(0) = f^{(4k+1)}(0) = \cos 0 = 1$               $f'(0) = f^{(4k+1)}(0) = -\sin 0 = 0$
$f''(0) = f^{(4k+2)}(0) = -\sin 0 = 0$             $f''(0) = f^{(4k+2)}(0) = -\cos 0 = -1$
$f'''(0) = f^{(4k+3)}(0) = -\cos 0 = -1$           $f'''(0) = f^{(4k+3)}(0) = \sin 0 = 0$

Putting n = 2m one obtains for arbitrary x and with $0 < \vartheta < 1$ or $0 < \vartheta' < 1$

$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + - \cdots + (-1)^{m-1} \frac{x^{2m-1}}{(2m - 1)!} + (-1)^m \frac{x^{2m+1}}{(2m + 1)!} \cos(\vartheta x),$

$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + - \cdots + (-1)^{m-1} \frac{x^{2m-2}}{(2m - 2)!} + (-1)^m \frac{x^{2m}}{(2m)!} \cos(\vartheta' x).$


Both remainders tend to zero as $m \to \infty$ for any x, so that both series always converge.
Multiplication of these series by themselves leads to series for powers of the sine and cosine
functions. The addition theorems can also be used to obtain these, for example,

$\sin^2 x = \frac{1}{2}(1 - \cos 2x) = \frac{1}{2} \left[ \frac{(2x)^2}{2!} - \frac{(2x)^4}{4!} + \frac{(2x)^6}{6!} - + \cdots \right].$
From sin x/cos x a series for tan x has already been obtained by division. Series for 1/cos x, x/sin x 
and x cot x can be obtained similarly by division. 
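Since $|\cos(\vartheta x)| \le 1$, the Lagrange remainder of the sine series after the term in $x^{2m-1}$ is at most $|x|^{2m+1}/(2m + 1)!$ in absolute value. The following sketch compares this bound with the actual error; the function name `sin_taylor` is an assumption of this illustration, and the library sine is used only as a reference value.

```python
import math

def sin_taylor(x, m):
    """Partial sum x - x^3/3! + ... up to x^(2m-1), with the bound |x|^(2m+1)/(2m+1)!."""
    total = sum((-1) ** j * x ** (2 * j + 1) / math.factorial(2 * j + 1)
                for j in range(m))
    bound = abs(x) ** (2 * m + 1) / math.factorial(2 * m + 1)
    return total, bound

approx, bound = sin_taylor(1.0, 5)
error = abs(math.sin(1.0) - approx)
print(approx, bound, error)
```

For x = 1 and m = 5 the bound is 1/11!, so five terms already give about seven correct decimal places.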


Exponential and hyperbolic functions. Since all derivatives of the function $e^x$ are equal to $e^x$ and
therefore are 1 at x = 0, one obtains

$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \frac{x^{n+1}}{(n + 1)!}\, e^{\vartheta x}$ with $0 < \vartheta < 1$.

For the general exponential $a^x = e^{x \ln a}$ an analogous series, valid for all x, can be obtained.

The Taylor series can also be used to obtain the addition theorem for the exponential function,
because

$e^{x_0 + h} = e^{x_0} + e^{x_0} h/1! + e^{x_0} h^2/2! + \cdots$

gives $e^{x_0 + h} = e^{x_0}\,(1 + h/1! + h^2/2! + \cdots) = e^{x_0} e^{h}$.

The definitions $\sinh x = (e^x - e^{-x})/2$ and $\cosh x = (e^x + e^{-x})/2$, together with the power series
for the exponential function, lead to power series for these hyperbolic functions. The series for
$\tanh x = \sinh x/\cosh x$ and $\coth x = \cosh x/\sinh x$ can then be obtained by division of the power
series occurring in the numerator and denominator.


Logarithm. For the logarithmic function $\ln x$, expanded about the point 1 so as to give $\ln(1 + x)$,
one has f(1) = 0 and $\frac{d^n \ln x}{dx^n} = (-1)^{n-1} (n - 1)!/x^n$, so that $f^{(n)}(1)/n! = (-1)^{n-1}/n$.
Taylor's theorem therefore yields

$\ln(1 + x) = x - x^2/2 + x^3/3 - + \cdots + (-1)^{n-1} x^n/n + (-1)^n x^{n+1}/[(n + 1)(1 + \vartheta x)^{n+1}],$

where $0 < \vartheta < 1$. The remainder term tends to zero as $n \to \infty$ when 0 < x < 1. The same series
can be derived, for the interval |x| < 1, by termwise integration of the geometric series
$1/(1 + x) = 1 - x + x^2 - x^3 + - \cdots$.

This series is usually unsuitable for calculations because it converges too slowly unless x is very
small.


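From $\ln(1 - x) = -x - x^2/2 - x^3/3 - \cdots$ and
$\ln[(1 + x)/(1 - x)] = \ln(1 + x) - \ln(1 - x)$ one
obtains a series which converges when |x| < 1;
for $\xi > 1$, $1/\xi = x < 1$ one obtains the second formula:

$\ln \frac{1 + x}{1 - x} = 2 \left[ x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots \right]$, |x| < 1,

$\ln \frac{\xi + 1}{\xi - 1} = 2 \left[ \frac{1}{\xi} + \frac{1}{3\xi^3} + \frac{1}{5\xi^5} + \cdots \right]$, $\xi > 1$.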


To calculate logarithms by using series, it is desirable to combine rapidly convergent series. Thus,

$\ln 2 = 7 \ln(10/9) - 2 \ln(25/24) + 3 \ln(81/80)$, because $\frac{10^7 \cdot 24^2 \cdot 81^3}{9^7 \cdot 25^2 \cdot 80^3} = 2$.

The most slowly convergent series involved here is

$\ln \frac{10}{9} = -\ln \left( 1 - \frac{1}{10} \right) = \frac{1}{10} + \frac{1}{2 \cdot 100} + \frac{1}{3 \cdot 1000} + \cdots$

Similarly

$\ln 3 = 11 \ln(10/9) - 3 \ln(25/24) + 5 \ln(81/80)$,

$\ln 5 = 16 \ln(10/9) - 4 \ln(25/24) + 7 \ln(81/80)$.
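The three combinations can be evaluated with a handful of terms of the logarithmic series. The sketch below uses exact fractions for the partial sums, writes each logarithm as $-\ln(1 - 1/q)$ with q = 10, 25 and 81, and compares the results with the library logarithm; the helper names and the chosen numbers of terms are assumptions of this illustration.

```python
from fractions import Fraction
import math

def ln_one_plus(x, terms):
    """Partial sum of ln(1 + x) = x - x^2/2 + x^3/3 - ... in exact rationals."""
    return sum(Fraction((-1) ** (n - 1)) * x ** n / n for n in range(1, terms + 1))

def ln_ratio(q, terms):
    """ln(q/(q - 1)) = -ln(1 - 1/q), from the first `terms` terms of the series."""
    return -ln_one_plus(Fraction(-1, q), terms)

a = ln_ratio(10, 15)   # ln(10/9), the most slowly convergent of the three
b = ln_ratio(25, 12)   # ln(25/24)
c = ln_ratio(81, 12)   # ln(81/80)

ln2 = float(7 * a - 2 * b + 3 * c)
ln3 = float(11 * a - 3 * b + 5 * c)
ln5 = float(16 * a - 4 * b + 7 * c)
print(ln2, ln3, ln5)
```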


It may have been the diversity of mathematical methods available that led GAUSS to remark:
‘There is a kind of poetry in the calculation of logarithmic tables’. 


Binomial series. For positive integral m,
$f(x) = (1 + x)^m = 1 + \binom{m}{1} x + \binom{m}{2} x^2 + \cdots + \binom{m}{m} x^m$.
If m is not a positive integer, the function $f(x) = (1 + x)^m$ can be expanded in a Maclaurin series,
convergent for |x| < 1. Here f(0) = 1, $f'(0) = m$, $f''(0) = m(m - 1)$, ..., $f^{(n)}(0) = \binom{m}{n} \cdot n!$.


This series was discovered by NEWTON in 1676, but derived correctly by EULER only about 100 years
later. It is useful for calculating approximate values of roots and powers with arbitrary exponents.
For m = 1/2, 1/3, -1/2 and -1/3, and |x| < 1 one obtains:


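$\sqrt{1 + x} = 1 + \frac{x}{2} - \frac{x^2}{8} + \frac{x^3}{16} - + \cdots$

$\sqrt[3]{1 + x} = 1 + \frac{x}{3} - \frac{x^2}{9} + \frac{5x^3}{81} - + \cdots$

$\frac{1}{\sqrt{1 + x}} = 1 - \frac{x}{2} + \frac{3x^2}{8} - \frac{5x^3}{16} + - \cdots$

$\frac{1}{\sqrt[3]{1 + x}} = 1 - \frac{x}{3} + \frac{2x^2}{9} - \frac{14x^3}{81} + - \cdots$


Powers of the geometric series, that is, negative integral powers of (1 - x), can be obtained
very simply by differentiation for |x| < 1:


$1/(1 - x) = 1 + x + x^2 + x^3 + x^4 + \cdots + x^n + \cdots$
$1/(1 - x)^2 = 1 + 2x + 3x^2 + 4x^3 + \cdots + (n + 1)\, x^n + \cdots$
$1/(1 - x)^3 = 1 + 3x + 6x^2 + 10x^3 + \cdots + (1/2)(n + 1)(n + 2)\, x^n + \cdots$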


Inverse trigonometric and hyperbolic functions. Since the first derivatives of these functions are
simple algebraic functions, which can be expanded as binomial series, expansions for the functions
themselves can be obtained by termwise integration. For example, $d(\arctan x)/dx = 1/(1 + x^2)
= 1 - x^2 + x^4 - x^6 + - \cdots$, |x| < 1, leads to $\int dx/(1 + x^2) = x - x^3/3 + x^5/5 - x^7/7 + - \cdots$,
and the constant of integration is c = arctan 0 = 0. Since $\operatorname{arccot} x = \pi/2 - \arctan x$, this also
leads to an expansion for arccot x. Integration of the binomial expansion for $1/\sqrt{1 - x^2}$ leads
to a power series for arcsin x, and also for $\arccos x = \pi/2 - \arcsin x$. Analogous results hold for
inverse hyperbolic functions, but not all the resulting series are power series.


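$\arcsin x = x + \frac{1}{2} \cdot \frac{x^3}{3} + \frac{1 \cdot 3}{2 \cdot 4} \cdot \frac{x^5}{5} + \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6} \cdot \frac{x^7}{7} + \cdots$, r = 1
$\arccos x = \frac{\pi}{2} - x - \frac{1}{2} \cdot \frac{x^3}{3} - \frac{1 \cdot 3}{2 \cdot 4} \cdot \frac{x^5}{5} - \cdots$, r = 1
$\arctan x = x - x^3/3 + x^5/5 - x^7/7 + - \cdots$, r = 1
$\operatorname{arccot} x = \pi/2 - x + x^3/3 - x^5/5 + x^7/7 - + \cdots$, r = 1
$\operatorname{Arsinh} x = x - \frac{1}{2} \cdot \frac{x^3}{3} + \frac{1 \cdot 3}{2 \cdot 4} \cdot \frac{x^5}{5} - + \cdots$, r = 1
$\operatorname{Arcosh} x = \pm \left[ \ln(2x) - \frac{1}{2 \cdot 2x^2} - \frac{1 \cdot 3}{2 \cdot 4 \cdot 4x^4} - \cdots \right]$, x > 1
$\operatorname{Artanh} x = x + x^3/3 + x^5/5 + x^7/7 + \cdots + x^{2n+1}/(2n + 1) + \cdots$, r = 1
$\operatorname{Arcoth} x = 1/x + 1/(3x^3) + 1/(5x^5) + 1/(7x^7) + \cdots$, |x| > 1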


Series for special functions 


Examples for integrals that can be evaluated explicitly only by expansions in series are Gauss’s 
error integral, the cosine and sine integrals, and the logarithmic integral. 


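Example: The following functions are represented by convergent series, obtained by termwise
integration of a uniformly convergent series for the integrand.

Gauss's error integral, $r = \infty$, $\lim_{x \to \infty} \varphi(x) = 1$:
$\varphi(x) = \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2}\, dt = \frac{2}{\sqrt{\pi}} \left[ x - \frac{x^3}{1! \cdot 3} + \frac{x^5}{2! \cdot 5} - \frac{x^7}{3! \cdot 7} + - \cdots \right]$

Sine integral, $r = \infty$:
$\operatorname{Si}(x) = \int_0^x \frac{\sin t}{t}\, dt = x - \frac{x^3}{3! \cdot 3} + \frac{x^5}{5! \cdot 5} - \frac{x^7}{7! \cdot 7} + - \cdots$

Cosine integral, $r = \infty$:
$\operatorname{Ci}(x) = C + \ln x - \frac{x^2}{2! \cdot 2} + \frac{x^4}{4! \cdot 4} - + \cdots$, where C = 0.5772... is Euler's constant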


Approximations 


Examples of applications of Taylor’s theorem. Taylor’s theorem is often used for calculating 
approximate values of a function f(x). The remainder may be used to decide how many terms of 
the series are needed to achieve a prescribed accuracy and to give a bound for the error when a 
fixed number of terms is used and the variable is confined to a specific range.

Calculation of the number e. Taylor's theorem applied to $e^x$ at x = 1 gives
$e = 1 + 1 + 1/2! + \cdots + 1/n! + e^{\vartheta}/(n + 1)!$, $0 < \vartheta < 1$. If e is to be calculated correctly
to 7 decimal places it can be decided quickly how n should be chosen to attain the required
accuracy. The remainder satisfies the inequality $1/(n + 1)! < R_n < 3/(n + 1)!$ because
$e^0 = 1 < e^{\vartheta} < e^1 < 3$. To avoid rounding errors it is best
to require that $R_n < 10^{-8}$. Then n must satisfy $3/(n + 1)! < 10^{-8}$ or $(n + 1)! > 3 \cdot 10^8$.
Since $12! \approx 4.8 \cdot 10^8$ it suffices to take n = 11. Then $1 + 1 + 1/2! + \cdots + 1/11!
= 2.718281826\ldots$ The remainder term can be estimated
from $1/12! \approx 2 \cdot 10^{-9}$ and $3/12! \approx 6 \cdot 10^{-9}$. Thus, one obtains
the inequality 2.718281828... < e < 2.718281832 and e,
correct to 7 decimal places, is given by e = 2.7182818...

21.2-3 The number e, base of the natural logarithms.
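The enclosure of e can be reproduced with exact fractions, which sidesteps the rounding question for the moment. The sketch below takes the partial sum up to 1/n! and adds the two remainder bounds 1/(n + 1)! and 3/(n + 1)!; the function name `e_bounds` is chosen for this illustration.

```python
from fractions import Fraction
from math import factorial
import math

def e_bounds(n):
    """Enclose e by 1 + 1 + 1/2! + ... + 1/n! plus 1/(n+1)! and plus 3/(n+1)!."""
    partial = sum(Fraction(1, factorial(k)) for k in range(n + 1))
    low = partial + Fraction(1, factorial(n + 1))
    high = partial + Fraction(3, factorial(n + 1))
    return low, high

lower, upper = e_bounds(11)
print(float(lower), float(upper), math.e)
```

With n = 11 the two bounds differ by 2/12!, which is less than $5 \cdot 10^{-9}$.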

If one carries out this calculation, one realizes that not very much effort would be needed to
increase the accuracy. However, one encounters a problem that often arises in practical calculations,
the problem of rounding errors. The individual terms of a series used for numerical calculation
are almost always periodic decimals, which have to be truncated and rounded off. This rounding
off process may make it impossible to predict with absolute certainty the accuracy reached. In
practice it has been found reliable to carry one or two decimal places
more than required in the final result, and also to note for each term
whether the rounding is downwards or upwards, so that the maximum
possible rounding error can be stated before the final result is rounded
off. To illustrate this point, here is the calculation of $e^{-0.1}$ using
Taylor's theorem: $e^{-0.1} = 1 - 0.1 + 0.005 - \cdots + R_n$, where
$R_n = (-0.1)^{n+1} \cdot e^{-0.1 \vartheta}/(n + 1)!$ with $0 < \vartheta < 1$. Note that $0.905 < e^{-0.1 \vartheta} < 1$. The
first four terms of the series lead to the following calculation:

  1.000 000 000
 -0.100 000 000
 +0.005 000 000
 -0.000 166 667
  -------------
  0.904 833 333

For $R_3$ the inequality $0.000004167 > R_3 > 0.000003770$ holds, so that $e^{-0.1}$ satisfies
$0.904837103 < e^{-0.1} < 0.904837500$. Therefore
$e^{-0.1} = 0.904837\ldots$ has been determined correct to six places. The most that could have been
expected from using only four terms, together with the remainder, is accuracy to four or perhaps
five places; the accuracy of the calculation, with an error of at most four units in the seventh place,
is surprising. The use of Taylor's theorem always leads to favourable results when $f^{(n+1)}(x_0)$ and
$f^{(n+1)}(x_0 + h)$ do not differ much and the function $f^{(n+1)}(x)$ is monotone in the interval from $x_0$ to
$x_0 + h$.
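The enclosure of $e^{-0.1}$ obtained from four terms and the two bounds for $R_3$ can be repeated in a few lines. This is only a floating-point sketch, so it does not track rounding of the individual terms in the way described above; the variable names are chosen for this illustration.

```python
import math

x = -0.1
# First four terms: 1 - 0.1 + 0.005 - 0.000166667
partial = sum(x ** k / math.factorial(k) for k in range(4))
# R_3 = (0.1)**4 * e**(-0.1*theta) / 4!  with  0.905 < e**(-0.1*theta) < 1
r3_max = 0.1 ** 4 / math.factorial(4)
r3_min = 0.905 * r3_max
low, high = partial + r3_min, partial + r3_max
print(round(low, 9), round(high, 9), math.exp(x))
```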

Calculation of the number $\pi$. The arctan series can be used for the calculation of $\pi$. It reads
$\arctan x = x - x^3/3 + x^5/5 - x^7/7 + - \cdots$. In this series, the remainder $R_n$ cannot be used
immediately in the form given in Taylor's theorem. Although the function f(x) = arctan x has derivatives
of all orders, the way in which successive derivatives are formed is rather complicated and can
hardly be used to give an explicit formula. However, one can obtain the following formula for the
remainder by direct use of the mean value theorem:


$\arctan x = x - x^3/3 + x^5/5 - + \cdots + (-1)^{k-1} x^{2k-1}/(2k - 1)
+ (-1)^k x^{2k+1}/[(2k + 1)(1 + \vartheta x^2)]$, $0 < \vartheta < 1$.


For x = 1, this yields
$\pi/4 = 1 - 1/3 + 1/5 - + \cdots + (-1)^{k-1}/(2k - 1) + (-1)^k/[(2k + 1)(1 + \vartheta)]$, $0 < \vartheta < 1$.

This equation was found by James GREGORY (1638-1675) and by LEIBNIZ and is therefore called
the Gregory-Leibniz equation. The remainder lies between 1/[2(2k + 1)] and 1/(2k + 1), so that
one can see that the formula is not very suitable for practical calculation of $\pi$. One would need
100000 terms for calculating $\pi$ to 5 decimal places. One therefore tries to find more suitable values
of the independent variable to substitute in the arctan series. For instance, $\arctan(1/\sqrt{3}) = \pi/6$
leads to

$\frac{\pi}{6} = \frac{1}{\sqrt{3}} \left[ 1 - \frac{1}{3 \cdot 3} + \frac{1}{5 \cdot 3^2} - + \cdots + \frac{(-1)^k}{(2k + 1) \cdot 3^k} \cdot \frac{1}{1 + \vartheta/3} \right]$, $0 < \vartheta < 1$.
Combination of several arctan series with different arguments ultimately leads to particularly
convenient formulae, of which some are quoted, which have been used to calculate $\pi$ by means of
electronic computers:


GAUSS $\pi = 48 \arctan(1/18) + 32 \arctan(1/57) - 20 \arctan(1/239)$;
STORMER (1896) $\pi = 24 \arctan(1/8) + 8 \arctan(1/57) + 4 \arctan(1/239)$.


The last two formulae were used in 1961 for the calculation of $\pi$ to 100265 decimal places. Two
machines calculated in the binary system, as a check each by a different formula. Gauss's formula
required 4 hours and 22 minutes, Stormer's 8 hours and 43 minutes. The transfer to the decimal
system took 42 minutes; the print-out occupies 20 pages.
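Both formulae can be tried with ordinary integer arithmetic. The sketch below is not a reconstruction of the 1961 programs; it simply scales everything by a power of ten, sums the arctan series for the reciprocal arguments with integer division, and carries ten guard digits to absorb the truncation errors. The function names `arctan_inv` and `pi_digits` are assumptions of this illustration.

```python
def arctan_inv(q, scale):
    """Integer approximation of scale * arctan(1/q), summed from the arctan series."""
    term = scale // q           # scale / q**(2k+1), starting with k = 0
    total, k = 0, 0
    while term:
        piece = term // (2 * k + 1)
        total += piece if k % 2 == 0 else -piece
        term //= q * q
        k += 1
    return total

def pi_digits(digits, guard=10):
    """pi * 10**digits (truncated), once by each of the two formulae."""
    scale = 10 ** (digits + guard)
    gauss = (48 * arctan_inv(18, scale) + 32 * arctan_inv(57, scale)
             - 20 * arctan_inv(239, scale))
    stormer = (24 * arctan_inv(8, scale) + 8 * arctan_inv(57, scale)
               + 4 * arctan_inv(239, scale))
    return gauss // 10 ** guard, stormer // 10 ** guard

g, s = pi_digits(30)
print(g)
print(s)
```

Agreement of the two results serves the same purpose as the two-machine check described above: an error in one formula or its evaluation would show up as a discrepancy.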

In practice such 'accuracy' is without significance. A knowledge of $\pi$ to 14 decimal places suffices
to calculate the circumference of a circle of radius 6400 km (the radius of the earth!) with an error 
of less than 0.001 mm. For such a bound of the error to be meaningful the radius would have to be 
determined with a similar accuracy; but in the present state of measuring technique this cannot 
be achieved by any means. 


The use of the binomial series. The binomial series is often used for approximate calculation of
roots. From Taylor's theorem,

$(1 + x)^{\alpha} = 1 + \binom{\alpha}{1} x + \binom{\alpha}{2} x^2 + \cdots + \binom{\alpha}{n} x^n + \binom{\alpha}{n+1} x^{n+1} (1 + \vartheta x)^{\alpha - n - 1}$

with $0 < \vartheta < 1$ and |x| < 1.
For instance, to calculate $\sqrt[3]{999}$ to 12 places, one has to put in the formula above x = -1/1000
and $\alpha = 1/3$, because $\sqrt[3]{999} = 10\,(1 - 1/1000)^{1/3}$. To reach the required accuracy it is enough to
take n = 2 because the values of the (n + 1)th derivative at $x_0 = 0$ and $x_0 + h = -0.001$ differ
by very little. The remainder $R_2$ is negative here, since $\binom{1/3}{3} = 5/81$ and $x^3 < 0$; one has
$1/(1.62 \times 10^{10}) < |R_2| < 1.003/(1.62 \times 10^{10})$ and so

$9.996665554936 < \sqrt[3]{999} < 9.996665554939$, or $\sqrt[3]{999} = 9.99666555493\ldots$
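The enclosure of the cube root of 999 can be checked with exact fractions. In the sketch below the sum $1 + x/3 - x^2/9$ is formed for x = -1/1000, and the remainder $R_2 = \binom{1/3}{3} x^3 (1 + \vartheta x)^{-8/3}$ is bracketed using the fact that the last factor lies between 1 and 1.003; because $x^3$ is negative, the remainder lowers the sum. The variable names are chosen for this illustration.

```python
from fractions import Fraction

x = Fraction(-1, 1000)
s2 = 1 + x / 3 - x ** 2 / 9            # 1 + C(1/3,1) x + C(1/3,2) x^2
c3 = Fraction(5, 81) * x ** 3          # C(1/3,3) x^3, negative for x < 0
# R_2 = c3 * (1 + theta*x)**(-8/3), and that factor lies between 1 and 1.003
upper = 10 * (s2 + c3)
lower = 10 * (s2 + c3 * Fraction(1003, 1000))
print(float(lower), float(upper))
```

Cubing the two exact bounds and comparing with 999 confirms the enclosure without any reference to a library root function.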


This degree of accuracy is hard to attain with many-figure tables, which shows the power of the 
method; greater accuracy would involve comparatively little extra effort. 

If the binomial series is to be used for the approximate calculation of roots, it is necessary to bring the argument into the form $1 + \delta$, where in general $\delta$ should not exceed 0.1. To do this, it is often necessary to use a number of devices which are best illustrated by examples. If $\sqrt[n]{a}$ is to be found and if there is an integer $b$ such that $b^n \approx a$, one needs to do no more than to write

$$\sqrt[n]{a} = \sqrt[n]{b^n \cdot (a/b^n)} = b \sqrt[n]{1 + (a - b^n)/b^n};$$

for instance, $\sqrt[5]{33} = 2 \sqrt[5]{1 + 1/32}$. If this method is inappropriate, it may be possible to find a rational number whose $n$th power is near to $a$. A classical example is $\sqrt{2}$. Here $\sqrt{2} \approx 1.4 = 7/5$, so that one can write

$$\sqrt{2} = \sqrt{\frac{49}{25} \cdot \frac{50}{49}} = \frac{7}{5} \sqrt{\frac{50}{49}} = \frac{7}{5} \sqrt{1 + \frac{1}{49}}.$$

Similarly $\sqrt[3]{92} \approx 4.5 = 9/2$, so that one can write

$$\sqrt[3]{92} = \sqrt[3]{\frac{729}{8} \cdot \frac{736}{729}} = \frac{9}{2} \sqrt[3]{\frac{736}{729}} = \frac{9}{2} \sqrt[3]{1 + \frac{7}{729}}.$$


These examples indicate how the binomial series may be used to calculate roots very accurately. 
The series can also be used for any fractional exponent, and is particularly useful when 4-, 5- or 
even 7-place tables do not give enough accuracy. 
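How quickly the series converges after such a reduction can be seen from a short sketch. The function name and the choice of three terms after the leading 1 are assumptions made for this illustration.

```python
import math


def binomial_series(a: float, delta: float, terms: int) -> float:
    """Partial sum 1 + C(a,1)*delta + ... + C(a,terms)*delta**terms of (1 + delta)**a."""
    total, term = 1.0, 1.0
    for k in range(1, terms + 1):
        term *= (a - (k - 1)) / k * delta   # C(a,k) delta^k from the previous term
        total += term
    return total


# sqrt(2) = (7/5) * sqrt(1 + 1/49) and cbrt(92) = (9/2) * cbrt(1 + 7/729)
sqrt2 = 7 / 5 * binomial_series(1 / 2, 1 / 49, 3)
cbrt92 = 9 / 2 * binomial_series(1 / 3, 7 / 729, 3)

print(sqrt2, math.sqrt(2))
print(cbrt92, cbrt92**3)
```

With $\delta$ of the order of 0.01 to 0.02, each further term gains roughly two decimal places.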

Approximations. In rough calculations it has proved advantageous to use the first few terms of 
a power series expansion. There are familiar applications of this in science and technology. To 
neglect the square and higher powers of small quantities is quite common, for example in thermo- 
dynamics, when the cubic coefficient of expansion is set to be three times the linear coefficient. 
Frequently sin x for small angles x is replaced by x. The table that follows gives a survey of the more 
frequently used formulae and their range of validity. The cited values of |x| should not be exceeded 
if the error in using the approximation is not to exceed 0.001 or 0.01. It will be seen that for practical 
purposes it is frequently unnecessary to use tables of function values, since it is then often sufficient 
to keep the error below 0.1% or 1%. In that case it is, for example, permissible to replace arcsin x by x in the range up to 10°; this leads to a remarkable saving of effort. It is also possible to dispense with searching for tables of rare special functions if suitable approximate formulae can be derived from expansions in series; approximate calculation of integrals can also often be made from suitable approximation formulae quickly and with good accuracy.


Examples:

1. $\sqrt[4]{258.3} = \sqrt[4]{256\,(1 + 2.3/256)} = 4 \sqrt[4]{1 + 2.3/256} \approx 4 \left(1 + \frac{2.3}{4 \times 256}\right) = 4.009$. The exact value to five places is 4.00895.

2. $e^{-0.1} \approx 1 - 0.1 + 0.1^2/2 = 0.905$ (exact value 0.90484 ...).

3. $e^{-0.023} \approx 1 - 0.023 = 0.977$ (exact value 0.977262 ...).


Frequently used approximations. The figure at the bottom gives the relation between angles x in degrees and x in rad.


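The numerical columns give the bound on |x| below which the error of the approximation stays under $10^{-3}$ or $10^{-2}$.

Function | 1st approximation | $10^{-3}$ | $10^{-2}$ | 2nd approximation | $10^{-3}$ | $10^{-2}$
$1/(1 + x)$ | $1 - x$ | 0.031 | 0.099 | $1 - x + x^2$ | 0.096 | 0.20
$1/(1 + x)^2$ | $1 - 2x$ | 0.018 | 0.055 | $1 - 2x + 3x^2$ | 0.063 | 0.12
$1/(1 + x)^3$ | $1 - 3x$ | 0.012 | 0.039 | $1 - 3x + 6x^2$ | 0.046 | 0.095
$\sqrt{1 + x}$ | $1 + x/2$ | 0.087 | 0.25 | $1 + x/2 - x^2/8$ | 0.25 | 0.48
$\sqrt[3]{1 + x}$ | $1 + x/3$ | 0.095 | 0.27 | $1 + x/3 - x^2/9$ | 0.25 | 0.47
$\sqrt[4]{1 + x}$ | $1 + x/4$ | 0.10 | 0.29 | $1 + x/4 - 3x^2/32$ | 0.24 | 0.49
$1/\sqrt{1 + x}$ | $1 - x/2$ | 0.050 | 0.15 | $1 - x/2 + 3x^2/8$ | 0.14 | 0.28
$1/\sqrt[3]{1 + x}$ | $1 - x/3$ | 0.065 | 0.19 | $1 - x/3 + 2x^2/9$ | 0.17 | 0.34
$(1 + x)/(1 - x)$ | $1 + 2x$ | 0.022 | 0.068 | $1 + 2x + 2x^2$ | 0.077 | 0.16
$[(1 + x)/(1 - x)]^2$ | $1 + 4x$ | 0.011 | 0.034 | $1 + 4x + 8x^2$ | 0.043 | 0.090
$\sqrt{(1 + x)/(1 - x)}$ | $1 + x$ | 0.043 | 0.13 | $1 + x + x^2/2$ | 0.12 | 0.25
$\sin x$ | $x$ | 0.18 | 0.39 | $x - x^3/6$ | 0.63 | 1.04
$\sin^2 x$ | $0$ | 0.031 | 0.10 | $x^2$ | 0.23 | 0.41
$\cos x$ | $1$ | 0.044 | 0.14 | $1 - x^2/2$ | 0.39 | 0.70
$\cos^2 x$ | $1$ | 0.031 | 0.10 | $1 - x^2$ | 0.23 | 0.42
$\tan x$ | $x$ | 0.14 | 0.30 | $x + x^3/3$ | 0.38 | 0.58
$\arcsin x$ | $x$ | 0.18 | 0.38 | $x + x^3/6$ | 0.42 | 0.63
$\arccos x$ | $\pi/2 - x$ | 0.18 | 0.38 | $\pi/2 - x - x^3/6$ | 0.42 | 0.63
$\arctan x$ | $x$ | 0.14 | 0.31 | $x - x^3/3$ | 0.35 | 0.57
$\operatorname{arccot} x$ | $\pi/2 - x$ | 0.14 | 0.31 | $\pi/2 - x + x^3/3$ | 0.35 | 0.57
$e^x$ | $1 + x$ | 0.044 | 0.13 | $1 + x + x^2/2$ | 0.17 | 0.38
$\ln(1 + x)$ | $x$ | 0.044 | 0.14 | $x - x^2/2$ | 0.14 | 0.33
$\lg(1 + x)$ | $0.4343x$ | 0.069 | 0.23 | $0.4343x - 0.2171x^2$ | 0.20 | 0.45
$\sinh x$ | $x$ | 0.18 | 0.39 | $x + x^3/6$ | 0.65 | 1.03
$\cosh x$ | $1$ | 0.044 | 0.14 | $1 + x^2/2$ | 0.39 | 0.70
$\tanh x$ | $x$ | 0.14 | 0.31 | $x - x^3/3$ | 0.38 | 0.61
$\sinh^{-1} x$ | $x$ | 0.18 | 0.40 | $x - x^3/6$ | 0.43 | 0.70
$\tanh^{-1} x$ | $x$ | 0.14 | 0.30 | $x + x^3/3$ | 0.37 | 0.52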

[Figure: scale relating angles x in degrees (0 to 10) to x in rad (0 to 0.18)]
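Individual entries of such a table can be recomputed numerically. The sketch below is an illustration only: the function name, the bisection search and the assumption that the error grows monotonically with |x| on the search interval are not taken from the text. It looks for the largest r such that the absolute error stays within the tolerance at both x = r and x = -r.

```python
import math


def validity_limit(f, approx, tol, upper=0.9, iterations=60):
    """Largest r in [0, upper] with |f(x) - approx(x)| <= tol at x = +r and x = -r.

    Assumes the worst error increases with r and exceeds tol at r = upper.
    """
    def worst_error(r):
        return max(abs(f(r) - approx(r)), abs(f(-r) - approx(-r)))

    lo, hi = 0.0, upper
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if worst_error(mid) <= tol:
            lo = mid
        else:
            hi = mid
    return lo


sin_limit = validity_limit(math.sin, lambda x: x, 1e-3)
recip_limit = validity_limit(lambda x: 1 / (1 + x), lambda x: 1 - x, 1e-3)
cos_limit = validity_limit(math.cos, lambda x: 1 - x * x / 2, 1e-3)

print(sin_limit, recip_limit, cos_limit)
```

The three results can be compared with the tabulated bounds for sin x, 1/(1 + x) and the second approximation of cos x.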


Geometrical applications of Taylor’s theorem 


Osculating parabola. Suppose that a curve has the equation $y = a_0 + a_1 x + a_2 x^2 + \cdots$ in a given coordinate system. If one introduces a new coordinate system in which the x-axis is tangent to the curve at a point P and the y-axis is along the normal at the same point, then for the curve referred to the new coordinate system the following values hold at the point P: $x = 0$, $f(x) = 0$, $f'(x) = 0$. The curvature $\kappa$ is given, in general, by $\kappa = f''/(1 + f'^2)^{3/2}$, and therefore $f''(0) = \kappa$. It follows that $f(0) = a_0 = 0$, $f'(0) = a_1 = 0$, $f''(0) = 2a_2 = \kappa$, and that the equation of the curve in the new coordinates is $f(x) = (1/2) \kappa x^2 + \cdots$. The curve near P is therefore approximated well by the parabola $g(x) = (1/2) \kappa x^2$, the osculating parabola. This is a second order approximation (Fig.).

21.2-4 Osculating parabola for the function $y = 1 - \cos x$

21.2-5 Determination of the length of a circular arc


Determination of the length of a circular arc. The arc length s of a circular arc with central angle $\alpha$ can be found from the approximation $s \approx (8b - a)/3$, where a is the chord of this arc and b the chord for half the arc (Fig.). The exact value is $s = r\alpha$. Since $a = 2r \sin(\alpha/2)$, $b = 2r \sin(\alpha/4)$, the series expansion for sin x leads to

$$\frac{8b - a}{3} = \frac{2r}{3}\left[8\left(\frac{\alpha}{4} - \frac{(\alpha/4)^3}{3!} + \frac{(\alpha/4)^5}{5!} - \frac{(\alpha/4)^7}{7!} + \cdots\right) - \left(\frac{\alpha}{2} - \frac{(\alpha/2)^3}{3!} + \frac{(\alpha/2)^5}{5!} - \frac{(\alpha/2)^7}{7!} + \cdots\right)\right] = \frac{2r}{3}\left[\frac{3\alpha}{2} - \frac{3\alpha^5}{15360} + \cdots\right]$$

or

$$\frac{8b - a}{3} = r\alpha - \frac{r\alpha^5}{7680}\,[1 - \cdots].$$

If the term in square brackets in the last expression is replaced by 1, the error is increased, so that the error in the approximation is at most $r\alpha^5/7680$. For r = 1 unit and $\alpha = 30°$ the error is about $5 \cdot 10^{-6}$ units.

Bending of a beam. The bending moment M(x) of a beam is $M(x) = EJ\kappa$, where E is the modulus of elasticity, J the moment of inertia of the cross section, and $\kappa$ the curvature of the centre line of the beam at the position x (Fig.). In practice the angle $\alpha$ is very small, and one may expand the denominator in the equation $\kappa = y''/(1 + y'^2)^{3/2}$ in powers of $y'^2 = \tan^2\alpha$. This gives $\kappa = y''(x)\,[1 - (3/2)\,y'(x)^2 + (15/8)\,y'(x)^4 - \cdots]$. For a slightly bent beam, a first approximation is $\kappa = y''(x)$ and one obtains the differential equation $\dfrac{d^2y}{dx^2} = \dfrac{M(x)}{EJ}$ for the bending of a beam.


21.2-6 Bending of a beam


Taylor’s theorem for several variables 


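Taylor’s theorem can also be stated for functions of several variables. For a function f(x, y) of two variables the case n = 0 reads:

$$f(x_0 + h, y_0 + k) = f(x_0, y_0) + h f_x(x_0 + \vartheta h, y_0 + \vartheta k) + k f_y(x_0 + \vartheta h, y_0 + \vartheta k), \quad 0 < \vartheta < 1.$$

This is simply the mean value theorem for functions of two variables. The case n = 1 reads, with $0 < \vartheta < 1$,

$$f(x_0 + h, y_0 + k) = f(x_0, y_0) + h f_x(x_0, y_0) + k f_y(x_0, y_0) + \frac{1}{2!}\left[h^2 f_{xx}(x_0 + \vartheta h, y_0 + \vartheta k) + 2hk f_{xy}(x_0 + \vartheta h, y_0 + \vartheta k) + k^2 f_{yy}(x_0 + \vartheta h, y_0 + \vartheta k)\right].$$

The space required for higher values of n increases rapidly with n; for instance, n = 5 requires 28 different terms, and each of the highest derivatives has 6 indices. It is therefore helpful to use a shorthand symbolic notation by first writing

$$h f_x(x_0, y_0) + k f_y(x_0, y_0) = \left(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}\right) f(x_0, y_0)$$

and then using powers of the symbolic operator $\left(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y}\right)$ to write down the terms of higher order.

The quantity $\vartheta$ takes the same value in all terms of the remainder, and depends on n, $x_0$, $y_0$, h and k. The same symbolism can be utilized for functions of three or more variables. For three variables one obtains

$$f(x_0 + h, y_0 + k, z_0 + l) = f(x_0, y_0, z_0) + \sum_{\nu=1}^{n} \frac{1}{\nu!} \left(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} + l \frac{\partial}{\partial z}\right)^{\nu} f(x_0, y_0, z_0) + \frac{1}{(n+1)!} \left(h \frac{\partial}{\partial x} + k \frac{\partial}{\partial y} + l \frac{\partial}{\partial z}\right)^{n+1} f(x_0 + \vartheta h, y_0 + \vartheta k, z_0 + \vartheta l).$$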

An extension of Newton’s method. If one wishes to solve the pair of equations $f(x, y) = 0$, $g(x, y) = 0$ and if approximate values $x_0$, $y_0$ are known, one puts $f(x_0 + h, y_0 + k) = 0$, $g(x_0 + h, y_0 + k) = 0$, and expands by Taylor’s theorem. For n = 0 this gives

$$f(x_0, y_0) + h f_x(x_0 + \vartheta_1 h, y_0 + \vartheta_1 k) + k f_y(x_0 + \vartheta_1 h, y_0 + \vartheta_1 k) = 0,$$
$$g(x_0, y_0) + h g_x(x_0 + \vartheta_2 h, y_0 + \vartheta_2 k) + k g_y(x_0 + \vartheta_2 h, y_0 + \vartheta_2 k) = 0.$$

If one puts $\vartheta_1 = 0$, $\vartheta_2 = 0$ as an approximation, one commits an error, and instead of the exact values h and k one only obtains approximate values $h_1$ and $k_1$ from which new approximations $x_1 = x_0 + h_1$, $y_1 = y_0 + k_1$ can be calculated. The procedure can be repeated if necessary. For $h_1$ and $k_1$ one obtains

$$h_1 = -\left[\frac{f g_y - g f_y}{f_x g_y - f_y g_x}\right], \qquad k_1 = \left[\frac{f g_x - g f_x}{f_x g_y - f_y g_x}\right],$$

to be evaluated at $x = x_0$, $y = y_0$.


Example: To solve the system of equations $f(x, y) = x^2 + y - 2 = 0$, $g(x, y) = xy - 2 = 0$. Approximate values are $x_0 = -1.8$ and $y_0 = -1.1$. For these values one gets $h_1 = 0.031$, $k_1 = -0.030$, and the new approximation is $x_1 = -1.769$, $y_1 = -1.130$.
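The step can be written out as a small routine. This is a sketch with invented function names; it assumes the example system is $f = x^2 + y - 2$, $g = xy - 2$, which reproduces the corrections $h_1$ and $k_1$ quoted in the example, and it supplies the partial derivatives by hand.

```python
def newton_step(f, g, fx, fy, gx, gy, x, y):
    """One correction (h, k) from the linearised system, all terms taken at (x, y)."""
    det = fx(x, y) * gy(x, y) - fy(x, y) * gx(x, y)
    h = -(f(x, y) * gy(x, y) - g(x, y) * fy(x, y)) / det
    k = (f(x, y) * gx(x, y) - g(x, y) * fx(x, y)) / det
    return h, k


f = lambda x, y: x**2 + y - 2
g = lambda x, y: x * y - 2
fx = lambda x, y: 2 * x      # partial derivatives of f
fy = lambda x, y: 1.0
gx = lambda x, y: y          # partial derivatives of g
gy = lambda x, y: x

x0, y0 = -1.8, -1.1
h1, k1 = newton_step(f, g, fx, fy, gx, gy, x0, y0)
x1, y1 = x0 + h1, y0 + k1
print(round(h1, 3), round(k1, 3), round(x1, 3), round(y1, 3))

# Repeating the procedure refines the approximation further.
xs, ys = x1, y1
for _ in range(5):
    h, k = newton_step(f, g, fx, fy, gx, gy, xs, ys)
    xs, ys = xs + h, ys + k
print(xs, ys)
```

A vanishing determinant $f_x g_y - f_y g_x$ would make the step undefined; the sketch does not guard against that case.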


21.3. Trigonometric series and harmonic analysis 


The development of the theory of trigonometric series began with the publication, in 1822, of 
the book ‘Théorie analytique de la chaleur’ by Joseph DE FOURIER (1768-1830). His researches,
extending over several years, have led to the development of an extensive theory for the series that 
now bear his name and are of great importance in mathematics, science and technology. Its basic 
idea is to represent periodic functions by series of particular (trigonometric) periodic functions. 

To investigate periodic motions Fourier series are used in acoustics, electrodynamics, optics, 
thermodynamics etc. In electrical engineering problems such as the frequency behaviour of switching 
elements or the transfer of impulses can be solved by means of Fourier series. Prediction of the tides 
is important for navigation; since they are periodic phenomena, one utilizes Fourier series and 
constructs mechanical instruments, the tide predictors and waterlevel predictors, for all important 
harbours. Today there is hardly a branch of physics, mathematics, or technology in which Fourier 
series are not used. 


Trigonometric series 


Series of functions $\sum_{n=0}^{\infty} f_n(x)$ in which the general term is $f_n(x) = a_n \cos nx + b_n \sin nx$, with constant coefficients $a_n$ and $b_n$, are called trigonometric series. If this series converges in an interval of length $2\pi$, then, since the trigonometric functions are periodic, it converges for all x and represents a periodic function f(x). But this function is not necessarily continuous; indeed, it often has discontinuities between which it is given by different formulae (Fig.). On the other hand, if the series converges uniformly, then its sum f(x) is continuous. In this case a connection can be established between the coefficients $a_n$, $b_n$ and the sum function f(x).

21.3-1 Graph of a function representable by its Fourier series

Multiplication of the series

$$f(x) = \sum_{n=0}^{\infty} f_n(x) = \sum_{n=0}^{\infty} (a_n \cos nx + b_n \sin nx)$$

by the bounded factors cos px or sin px, where p is a non-negative integer, does not disturb uniform convergence, so that one may calculate

$$\int_0^{2\pi} f(x) \cos px \, dx \quad\text{and}\quad \int_0^{2\pi} f(x) \sin px \, dx$$

by termwise integration of the series $\sum f_n(x) \cos px$ or $\sum f_n(x) \sin px$. These integrations involve the integrals over the interval $(0, 2\pi)$ of the functions cos nx cos px, sin nx cos px, cos nx sin px, sin nx sin px. One finds by partial integration that these integrals have the value 0 when $n \neq p$; for p = n they are

$$\int_0^{2\pi} \cos^2 nx \, dx = \int_0^{2\pi} \sin^2 nx \, dx = \pi \quad\text{for } n > 0,$$

$$\int_0^{2\pi} \cos^2 nx \, dx = 2\pi, \quad \int_0^{2\pi} \sin^2 nx \, dx = 0 \quad\text{for } n = 0.$$

Because of the exceptional behaviour of n = 0, it has now become conventional to write the trigonometric series as

$$f(x) = \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$$

so that the coefficients can be written for all $n \geq 0$ thus (Euler-Fourier formulae):

$$a_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos nx \, dx, \qquad b_n = \frac{1}{\pi} \int_0^{2\pi} f(x) \sin nx \, dx.$$

Fourier series. One may well ask what functions f(x) can be represented by trigonometric series. If f(x) is integrable, one can at least use the Euler-Fourier formulae to calculate the numbers $a_n$ and $b_n$ and then write down the formal series $\tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx)$.

One calls this the Fourier series of f(x), and $a_n$, $b_n$ the Fourier coefficients of the function f(x). However, it may happen either that the Fourier series of f(x) does not converge at all, or that it converges, but that its sum is not equal to f(x); this can occur even if f(x) is continuous. Also it is conceivable that f(x) has other representations by a trigonometric series.

However, if the Fourier series of a continuous function f(x) turns out to be uniformly convergent, then its sum must be f(x), and f(x) has no other representation by a uniformly convergent trigonometric series. This is only a sufficient condition: the problem of finding necessary and sufficient conditions for the convergence of the Fourier series of f(x) is still not completely settled.

Since the terms $f_n(x)$ of the Fourier series are periodic functions of period $2\pi$, the sum function also has the period $2\pi$, so that it makes sense to consider Fourier series for periodic functions of period $2\pi$. From a function of period $2l$, one can obtain a function of period $2\pi$ by replacing the variable x by $\pi x/l$. If one needs the Fourier expansion for a function f(x) defined in some interval I of length $2\pi$, it is appropriate to extend the function outside this interval by requiring that $f(x + 2k\pi) = f(x)$ ($x \in I$; k an integer) so that the extended function has the period $2\pi$.


Dirichlet’s condition. A further sufficient condition for the convergence to f(x) of the Fourier series of f(x) is due to DIRICHLET (1805-1859). It suffices for practical purposes and covers a wide class of functions including functions of the type described in the picture below. The functions f(x) may, without loss of generality, be assumed to be periodic of period $2\pi$.


Suppose that f(x) is a periodic function of period $2\pi$ and is defined and bounded for $0 \leq x < 2\pi$, and suppose that the interval $(0, 2\pi)$ can be split into finitely many subintervals in each of which the function is continuous and monotonic. Then the Fourier series of f(x) converges at each point of continuity $x_0$ to $f(x_0)$, and at a point of jump discontinuity $x^*$ to the mean value $\tfrac{1}{2}\left[\lim_{x \to x^* - 0} f(x) + \lim_{x \to x^* + 0} f(x)\right]$ of its left and right limiting values.


Hence, if one prescribes at the points of ‘jump’ $x^*$ with $\lim_{x \to x^* - 0} f(x) \neq \lim_{x \to x^* + 0} f(x)$ that $f(x^*) = \tfrac{1}{2}\left[\lim_{x \to x^* - 0} f(x) + \lim_{x \to x^* + 0} f(x)\right]$, then the Fourier series of f(x) converges to f(x) at all points in the domain of definition. The requirement that the interval $(0, 2\pi)$ can be split into finitely many subintervals in each of which f(x) is continuous and monotonic means that the function has only finitely many discontinuities and only finitely many extrema.




Example: Let f(x) be given by $f(x) = 1$ for $0 < x < \pi$, $f(x) = -1$ for $\pi < x < 2\pi$, $f(x + 2k\pi) = f(x)$, $k = \pm 1, \pm 2, \pm 3, \ldots$ At the jumps, let $f(0) = f(k\pi) = 0$. Dirichlet’s condition obviously holds (Fig.).

The integrations in the Euler-Fourier formulae lead to $a_n = 0$ for all n, $b_{2n} = 0$ ($n = 1, 2, \ldots$), $b_{2n+1} = 4/[(2n + 1)\pi]$ if $n = 0, 1, 2, \ldots$
The Fourier series is

$$f(x) = \frac{4}{\pi}\left[\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right].$$


21.3-2 Rectangular curve
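The coefficients of the rectangular curve can also be obtained by evaluating the Euler-Fourier integrals $a_n = \frac{1}{\pi}\int_0^{2\pi} f(x) \cos nx \, dx$, $b_n = \frac{1}{\pi}\int_0^{2\pi} f(x) \sin nx \, dx$ numerically. The sketch below uses a plain midpoint rule; the function names and the number of sample points are choices made for this illustration.

```python
import math


def rectangular(x: float) -> float:
    """Rectangular curve: +1 on (0, pi), -1 on (pi, 2*pi), 0 at the jumps, period 2*pi."""
    x = x % (2 * math.pi)
    if x == 0 or x == math.pi:
        return 0.0
    return 1.0 if x < math.pi else -1.0


def fourier_coefficients(f, n: int, samples: int = 20000):
    """Midpoint-rule approximation of a_n and b_n over the interval (0, 2*pi)."""
    step = 2 * math.pi / samples
    a_n = b_n = 0.0
    for i in range(samples):
        x = (i + 0.5) * step
        a_n += f(x) * math.cos(n * x)
        b_n += f(x) * math.sin(n * x)
    return a_n * step / math.pi, b_n * step / math.pi


for n in range(1, 6):
    a_n, b_n = fourier_coefficients(rectangular, n)
    print(n, round(a_n, 6), round(b_n, 6))
```

The odd sine coefficients come out close to $4/(n\pi)$, while the cosine coefficients and the even sine coefficients are numerically zero.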


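1. Rectangular impulse of the first kind:
$$f(x) = \frac{4a}{\pi}\left[\frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots\right].$$

2. Rectangular impulse of the second kind:
$$f(x) = \frac{2a}{\pi}\left[\frac{c}{2} + \frac{\sin c}{1} \cos x + \frac{\sin 2c}{2} \cos 2x + \frac{\sin 3c}{3} \cos 3x + \cdots\right].$$

3. Rectangular curve: $f(\pi/2) = f(3\pi/2) = \cdots = 0$,
$$f(x) = \frac{4a}{\pi}\left[\cos x - \frac{\cos 3x}{3} + \frac{\cos 5x}{5} - + \cdots\right].$$

4. Sawtooth curve: $f(0) = f(2\pi) = \cdots = 0$,
$$f(x) = -\frac{2a}{\pi}\left[\frac{\sin x}{1} + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots\right].$$

5. Triangular curve:
$$f(x) = \frac{8a}{\pi^2}\left[\frac{\sin x}{1^2} - \frac{\sin 3x}{3^2} + \frac{\sin 5x}{5^2} - + \cdots\right].$$

6. Triangular impulse:
$$f(x) = \frac{ac}{2\pi} + \frac{2a}{\pi c}\left[\frac{1 - \cos c}{1^2} \cos x + \frac{1 - \cos 2c}{2^2} \cos 2x + \frac{1 - \cos 3c}{3^2} \cos 3x + \cdots\right].$$

7. Alternating current rectified in one direction, half waves of a cosine curve:
$$f(x) = \frac{a}{\pi}\left[1 + \frac{\pi}{2} \cos x + \frac{2}{1 \cdot 3} \cos 2x - \frac{2}{3 \cdot 5} \cos 4x + \frac{2}{5 \cdot 7} \cos 6x - + \cdots\right].$$

8. Alternating current rectified in two directions: $f(x) = a\,|\cos x|$,
$$f(x) = \frac{2a}{\pi}\left[1 + \frac{2}{1 \cdot 3} \cos 2x - \frac{2}{3 \cdot 5} \cos 4x + \frac{2}{5 \cdot 7} \cos 6x - + \cdots\right].$$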


Harmonic analysis and harmonic synthesis 


Harmonic analysis. This is the determination of the Fourier coefficients $a_0, a_1, a_2, \ldots, b_1, b_2, \ldots$ In technology it is frequently used to analyse periodic phenomena. An oscillation is split up by harmonic analysis into a sum of pure sine oscillations (harmonic oscillations) and a constant part. Apart from the fundamental oscillation there occur the so-called ‘harmonics’ whose frequency is twice, three times etc. the fundamental frequency. As a rule, the phase of an individual harmonic is shifted by comparison with the fundamental oscillation. One can always set $a_n \cos nx + b_n \sin nx = c_n \cos(nx - \varphi_n)$; this leads to $a_n = c_n \cos\varphi_n$ and $b_n = c_n \sin\varphi_n$ and hence to $c_n = \sqrt{a_n^2 + b_n^2}$, $\varphi_n = \arctan(b_n/a_n)$.
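Converting a pair of coefficients into amplitude and phase takes one line each. In the sketch below (names chosen for the illustration) the phase is computed with `math.atan2` instead of a plain arctangent, because $\arctan(b_n/a_n)$ fixes the phase only up to a multiple of $\pi$ and fails for $a_n = 0$; this is a choice made here, not something stated in the text.

```python
import math


def amplitude_phase(a_n: float, b_n: float):
    """Return (c_n, phi_n) with a_n*cos(nx) + b_n*sin(nx) = c_n*cos(nx - phi_n)."""
    c_n = math.hypot(a_n, b_n)        # sqrt(a_n**2 + b_n**2)
    phi_n = math.atan2(b_n, a_n)      # quadrant-aware version of arctan(b_n / a_n)
    return c_n, phi_n


n, a_n, b_n = 3, -1.0, 1.0
c_n, phi_n = amplitude_phase(a_n, b_n)
for x in (0.0, 0.4, 1.3, 2.9):
    left = a_n * math.cos(n * x) + b_n * math.sin(n * x)
    right = c_n * math.cos(n * x - phi_n)
    print(round(left, 12), round(right, 12))
```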

The process of setting up the Fourier coefficients for the rectangular curve is an example of 
harmonic analysis. 

Much labour can be saved in harmonic analysis if one observes certain symmetry properties of 
the function f(x) to be analysed: 


In the Fourier expansion of an even function $f(x) = f(-x)$ all the sine terms are absent, that is, all the $b_n = 0$. For an odd function $f(x) = -f(-x)$ all the cosine terms are absent, that is, all the $a_n = 0$ (including $a_0$). For a function with the property $f(x + \pi) = -f(x)$ the absolute term is $a_0 = 0$, and only coefficients with an odd index occur ($a_2 = a_4 = \cdots = b_2 = b_4 = \cdots = 0$).


If one looks for a best possible approximation to a periodic function f(x) by a finite sum $\Phi_n(x)$ of sine and cosine functions, $\Phi_n(x) = \sum_{j=0}^{n} (a_j \cos jx + b_j \sin jx)$, one chooses by analogy to the method of least squares the integral $\frac{1}{2\pi} \int_0^{2\pi} [f(x) - \Phi_n(x)]^2 \, dx$ as a measure for the difference $f(x) - \Phi_n(x)$. This assumes its minimum when the $a_j$ and $b_j$ are the Fourier coefficients of the function f(x). This is another important property of the Fourier coefficients.


Harmonic synthesis. This is the inverse process to harmonic analysis. The individual pure oscil- 
lations are added and yield a resultant. The figure shows the first three terms of the Fourier ex- 
pansion of the rectangular curve and the sum curve y allows a comparison with the original curve yp. 


21.3-4 Graphical 
representation of the har- 
monic synthesis of a 
rectangular curve 


Approximate calculation of the Fourier coefficients. In practice the functions to be expanded in a Fourier series are frequently not given by an analytic expression. As a rule they are curves drawn by a measuring instrument equipped with a pen, such as the tangential force diagram of a piston engine, the diagram for the distribution of pressure in a pump, the recording of mechanical or electrical oscillations etc. In these cases the Fourier analysis is also possible. The integrals in the Euler-Fourier formulae are then calculated approximately. For this purpose the interval is divided into a large number 2m of equal parts (Fig.). It is advantageous to choose the number of parts as a multiple of 4 and to use the values 12, 24, 36, 72, ..., because such a division makes it possible to utilize the symmetry properties of the sine and cosine functions, and this saves calculating labour. After fixing a coordinate system the function values at the places $x_0, x_1, x_2, \ldots, x_{2m-1}$ are measured; they are denoted by $y_0, y_1, y_2, \ldots, y_{2m-1}$. Then

$$a_0 = \frac{1}{2m} \sum_{i=0}^{2m-1} y_i, \qquad a_m = \frac{1}{2m} \sum_{i=0}^{2m-1} y_i \cos(i\pi),$$

$$a_n = \frac{1}{m} \sum_{i=0}^{2m-1} y_i \cos\frac{n i \pi}{m}, \qquad b_n = \frac{1}{m} \sum_{i=0}^{2m-1} y_i \sin\frac{n i \pi}{m} \qquad\text{for } n = 1, 2, \ldots, (m - 1).$$


21.3-5 Fourier analysis of a curve given empirically


If one chooses 2m = 24, one obtains the 24 coefficients $a_0, a_1, a_2, \ldots, a_{12}, b_1, b_2, \ldots, b_{11}$. The resulting function $a_0 + \sum_{n=1}^{11} (a_n \cos nx + b_n \sin nx) + a_{12} \cos 12x = f(x)$ has the values $f(x_i) = y_i$ at the places $x_i$ ($i = 0, 1, \ldots, 23$).
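The sums for the approximate coefficients translate directly into a short program. The sketch below assumes equally spaced places $x_i = i\pi/m$ and the convention that $a_0$ and $a_m$ carry the factor $1/(2m)$ while the other coefficients carry $1/m$; the function names and the sample ordinates are invented for the illustration.

```python
import math


def harmonic_analysis(y):
    """Coefficients a[0..m], b[0..m] from 2m ordinates y[i] taken at x_i = i*pi/m."""
    two_m = len(y)
    m = two_m // 2
    a = [0.0] * (m + 1)
    b = [0.0] * (m + 1)               # b[0] and b[m] stay zero
    a[0] = sum(y) / two_m
    a[m] = sum(y_i * math.cos(i * math.pi) for i, y_i in enumerate(y)) / two_m
    for n in range(1, m):
        a[n] = sum(y_i * math.cos(n * i * math.pi / m) for i, y_i in enumerate(y)) / m
        b[n] = sum(y_i * math.sin(n * i * math.pi / m) for i, y_i in enumerate(y)) / m
    return a, b


def synthesize(a, b, x):
    """Evaluate a[0] + sum_{n<m}(a[n] cos nx + b[n] sin nx) + a[m] cos mx."""
    m = len(a) - 1
    total = a[0] + a[m] * math.cos(m * x)
    for n in range(1, m):
        total += a[n] * math.cos(n * x) + b[n] * math.sin(n * x)
    return total


m = 12                                                # 2m = 24 ordinates
xs = [i * math.pi / m for i in range(2 * m)]
ys = [math.exp(math.sin(x)) + 0.3 * x for x in xs]    # stand-in for measured values
a, b = harmonic_analysis(ys)
print(max(abs(synthesize(a, b, x) - y) for x, y in zip(xs, ys)))
```

The printed maximum deviation shows that the synthesized function passes through all 24 ordinates up to rounding error. The direct sums need on the order of $(2m)^2$ products, which is the labour the symmetry devices mentioned above are meant to reduce.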

The amount of work to be done in harmonic analysis is considerable. With the help of an electric
calculating machine and special techniques a trained operator can carry out a harmonic analysis 
with 12 points in about half an hour, with 24 points in about 2 hours, with 36 points in about 6 hours 
and with 72 points in about 16 hours. Without resorting to the special techniques one has to form 
for 72 points about 5000 products, which have to be combined in 72 sums. An electric computer 
of medium speed performs the calculations for 36 points in about 2 minutes. The time required to 
print out the result is usually larger than the calculating time. 


Harmonic analysers. The large amount of time required for the Fourier analysis of curves has 
led to the development of mechanical tools and devices. One operates with them as with a planimeter. 
The given curve is traced with a moving pen, and the value of a Fourier coefficient or a value propor- 
tional to it can be read off the calculating works. Instruments of this kind are called harmonic 
analysers. 


22. Ordinary differential equations 


22.1. Preliminary survey
    Basic concepts
    Differential equations and geometry
22.2. Elementarily integrable types
    Special types of elementarily integrable equations of the first order
    Integration of an arbitrary differential equation of the first order
    Linear differential equations of higher order
22.3. Further considerations
    Integration procedures in practice
    Glances at the theory


Many problems of higher analysis presuppose a knowledge of ordinary differential equations; 
for example, problems of potential theory, of the calculus of variations, of theoretical physics and 
of partial differential equations (see Chapter 37.). Beyond this, a wide field of applications is opened 
up by ordinary differential equations; for example, the calculation of pendulum oscillations, satel- 
lite trajectories, load carrying wings, dams, earthquake tremors, heat propagation, speeds of 
chemical reactions and of radioactive decay, as well as calculations in electrotechnology and ship 
building. Only differential equations for real variables and real-valued functions will be examined 
here and, renouncing full mathematical rigour, methods of solution will be given that occur fre- 
quently in practice. A first glance will also be given at typical problems in this field, at its vast and 
often difficult theory. 




22.1. Preliminary survey 


Basic concepts 


Differential equation. If a relation exists between a function of one or more variables and some of its derivatives, in the form of an equation in which the independent variables can also occur, then one speaks of a differential equation. Every function that satisfies the differential equation is called a solution or an integral; for example, the differential equation $\left(\frac{dy}{dx}\right)^2 + y^2 = 1$ has the solution $y = \sin x$, since substitution gives the identity $\cos^2 x + \sin^2 x = 1$, which holds for all x. Conversely, for a function $z = f(x, y)$ of the two independent variables x and y one can set up a differential equation that has, for example, the solution $z = xy$. Because $\frac{\partial z}{\partial x} = y$, $\frac{\partial z}{\partial y} = x$ in this case, $z = xy$ satisfies the differential equation $\frac{\partial z}{\partial x}\, y + \frac{\partial z}{\partial y}\, x = x^2 + y^2$.

If the functions occurring in the differential equation depend on only one independent variable, 
and thus also derivatives with respect to only one variable occur, then one speaks of an ordinary 
differential equation. To these belong, for example, 

dy d7y ; 
— = cos x, = 3 "3 _ y’xy = 0. 
dx dx? +y xy, y yxy 


On the other hand, if the required functions depend on several independent variables and ac- 
cordingly partial derivatives occur, one speaks of partial differential equations. Examples are 


2 2 2 
0*z te ee 3G and 0°z 0°z 0z 
dx Oy 


ax ax? t ay2 PY By! 
it is required to find functions z = f(x, y) of x and y. Only ordinary differential equations will be 
dealt with in the following. 

Order and degree of a differential equation. The order of a differential equation is defined as the highest order of the derivatives contained in it. A differential equation of the nth order can be expressed in the form $F(x, y, y', y'', \ldots, y^{(n)}) = 0$, where F denotes a function of the arguments in the bracket. In particular, $y' = f(x, y)$ is the general explicit and $F(x, y, y') = 0$ the general implicit differential equation of the first order. If F is a polynomial function of the arguments $y, y', \ldots, y^{(n)}$, then its degree is equal to that of the differential equation; the dependence upon x plays no part in this. However, in the case of the differential equation $y' = x + \sin y'$ one cannot speak of a degree.

[Table: examples of differential equations, among them $y' = x + \sin y'$]

Differential equations of the first degree, or linear differential equations, are particularly important 
for applications. In these the unknown function and its derivatives occur only to the first power 
and also not multiplied together. Consequently, the general linear differential equation of the nth order has the form $f + f_0 y + f_1 y' + f_2 y'' + \cdots + f_n y^{(n)} = 0$, where $f, f_0, f_1, \ldots, f_n$ denote given functions of x.

The integral of a differential equation. If the equation $F(x, y, y', \ldots, y^{(n)}) = 0$, after the substitution of a function $y = \varphi(x)$ and its derivatives $y', y'', \ldots, y^{(n)}$, becomes an identity in x valid for all x in an interval, then $y = \varphi(x)$ is called a solution or integral; the process of obtaining it is called integration, and the graph of $y = \varphi(x)$ in the x, y-plane is an integral curve. The solutions are often
not elementary functions or even closed forms of these functions. On the contrary, certain non- 
elementary functions that are important for applications are defined precisely as solutions of 
special types of differential equations. For example, in 1785 in investigating the force of 
attraction of an ellipsoid at a point outside it, LEGENDRE hit upon a differential equation still called 
after him, whose solution is represented by the Legendre polynomials. It is often sufficient, 
without insisting on the complete solution, to determine the analytic properties of a solution 
in a neighbourhood of a point x9 and to investigate the shape of the integral curves, the uniqueness 
of the solution, or other questions. Finally, existence theorems are concerned with the properties 
of a differential equation from which it can be deduced with certainty that solutions exist at all. 


Preliminary survey of the nature of the integrals of differential equations. One distinguishes between 
the general integral and particular and singular integrals. The nature of the solution can be sum- 
marized crudely and somewhat imprecisely as follows: 


The general integral of a differential equation of the nth order contains exactly n arbitrary constants $C_1, C_2, \ldots, C_n$; it is determined only to within these constants.




Correspondingly, in the integral calculus one obtains as solution of the differential equation $y' = f(x)$ the integral $y = \int f(x)\,dx + C$. If one assigns to the $C_1, C_2, \ldots, C_n$ arbitrary fixed numerical values, then one obtains a particular integral. Consequently, all the particular integrals are, so to speak, contained in the general integral.


Example: The differential equation $y'^2 + y^2 = 1$ has $y = \sin(x + C)$ as its general integral. For $C = \pi/2$ one obtains the particular integral $y = \sin(x + \pi/2) = \cos x$. By substituting in the differential equation one can easily see that it is, in fact, a solution.


Besides the general integral and particular integrals, a differential equation may also have singular 
integrals, which usually correspond to certain discontinuities of the given equation. Singular integrals 
cannot be obtained from the general integral by a choice of the constants. For example, the differential equation $y'^2 + y^2 = 1$ already mentioned has the singular integral $y = \pm 1$, as one can see by differentiation and substitution.
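The substitution test for $y'^2 + y^2 = 1$ can be imitated numerically. The sketch below uses a hypothetical helper `residual` and approximates $y'$ by a central difference, so the residual of a true solution is only approximately zero.

```python
import math


def residual(y, x: float, h: float = 1e-5) -> float:
    """Value of y'(x)**2 + y(x)**2 - 1, with y' taken as a central difference."""
    dy = (y(x + h) - y(x - h)) / (2 * h)
    return dy**2 + y(x) ** 2 - 1


general = lambda x, c=0.7: math.sin(x + c)   # member of the general integral
singular_plus = lambda x: 1.0                # singular integral y = +1
singular_minus = lambda x: -1.0              # singular integral y = -1
not_a_solution = lambda x: 0.5               # a constant that fails the test

for name, y in [("sin(x + C)", general), ("y = 1", singular_plus),
                ("y = -1", singular_minus), ("y = 0.5", not_a_solution)]:
    print(name, residual(y, 0.4))
```

The constants $y = \pm 1$ pass the test although no choice of C in $\sin(x + C)$ produces them, which is what distinguishes them as singular integrals.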


Example: The second order differential equation $y'' + y = 0$ has the general integral $y = C_1 \sin x + C_2 \cos x$. By suitable choice of the constants $C_1$ and $C_2$ one obtains the particular integrals $y = 0$, $y = \cos x$, $y = 2 \cos x$, $y = \sin x$, $y = 7 \sin x$. Singular integrals do not exist.


Differential equations and geometry 


The direction field of a differential equation of the first order. In the implicit form F(x, y, y’) = 0, 
and more particularly in the explicit form y’ = f(x, y), a differential equation assigns to the points 
of the x, y-plane for which f(x, y) is defined, a value p = y’ = f(x, y) of the derivative of the required 
function y(x), which gives the direction of the tangent to the curve representing the function y(x). 
The direction field of the differential equation of the first order arises in this way. The number triple 
x, y, p is called a line element; the point (x, y) is its carrier. At least an approximate idea of the course 
of the integral curves of a first order differential equation can be obtained with the help of the direc- 
tion field, in which the direction of the tangent at the point (x, y) is marked by a short line (Fig.). 
Geometrically expressed, the problem of the integration of the differential equation of the first 
order consists of finding all curves that fit the direction field, that is, have a tangent at every point 
and contain only those line elements that agree with the values given by y’ = f(x, y). 


22.1-1 Direction field of the differential equation $y' = y/x$

22.1-2 Direction field of the differential equation $y' = -x/y$
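One way to see how a curve "fits" a direction field is to step along successive line elements. The sketch below does this with the simplest possible rule (Euler's polygon method, brought in here as an illustration and not described in the text); the function name and the step size are assumptions.

```python
def follow_direction_field(slope, x: float, y: float, step: float, steps: int):
    """Trace an approximate integral curve by moving along the line element at each point."""
    points = [(x, y)]
    for _ in range(steps):
        y += step * slope(x, y)   # advance along the tangent direction p = f(x, y)
        x += step
        points.append((x, y))
    return points


# Direction field of y' = y/x, starting at the point (1, 2).
curve = follow_direction_field(lambda x, y: y / x, 1.0, 2.0, 0.1, 20)
for x, y in curve[::5]:
    print(round(x, 3), round(y, 3))
```

For $y' = y/x$ the traced points stay on the straight line $y = 2x$ through the origin, because the line elements along such a line all point along it. For a field such as $y' = -x/y$, whose integral curves are circles, the same stepping rule only approximates the curve and drifts outward as the step size grows.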


Differential equation and family of curves. The result stated above, that the solution of a differential 
equation of the first order contains one arbitrary constant, can be interpreted geometrically; the 
solution consists of a one-parameter family of curves. The converse also holds: A one-parameter 
family of curves $y = \varphi(x, C)$ is represented analytically by a differential equation of the first order. This is obtained by eliminating C from the system of equations $y = \varphi(x, C)$; $y' = \varphi'(x, C)$.

Example: The family of all straight lines through the origin has the equation $y = Cx$. Then $y' = C$. From this the differential equation $y = y'x$, or $y' = y/x$ is obtained (see Fig. 22.1-1).


An n-parameter family of curves can be represented analytically by a differential equation of the nth order. Conversely, the general solution of a differential equation of the nth order represents an n-parameter family of curves.


22.1-3 Direction field of the differential equation $y' = x$

22.1-4 Direction field of the differential equation $y' = x + y$


The second part of this theorem clearly follows immediately from the nature of the general 
integral of a differential equation of the nth order. On the other hand, from the equation of a family 
of curves that contains parameters, the corresponding differential equation can be found: one 
differentiates the equation of the family sufficiently often until one succeeds in eliminating the 
parameters from the original equation and the equations obtained from it by differentiation, and 
in obtaining a differential equation free of parameters. 


Example 1: $y = C_1 x + C_2$ is the two-parameter equation of the family of all straight lines in
the plane not parallel to the y-axis. By differentiating twice, $y'' = 0$; an elimination is not necessary.
The differential equation states, in fact, that it is concerned with all curves whose curvature
is everywhere zero; these are precisely the straight lines.

Example 2: The family of all circles of fixed radius $a$ has the equation $(x - C_1)^2 + (y - C_2)^2 = a^2$.
By differentiation one obtains $x - C_1 + (y - C_2) y' = 0$, and a second differentiation gives
$1 + y'^2 + (y - C_2) y'' = 0$. From these one obtains by elimination $C_2 = (1/y'')(1 + y'^2 + y y'')$
and $C_1 = x - (1 + y'^2)(y'/y'')$, and then by substitution the differential equation
$y''^2 a^2 = (1 + y'^2)^3$.


All curves of the family are contained among the solutions of the corresponding differential
equation. However, it may very well happen that the solutions of the differential equation contain
additional curves that do not belong to the original family; for example, the family of curves
$y = C_1 x + C_2$, $C_1 > 0$, consisting of all straight lines with positive slope, leads to the differential
equation $y'' = 0$; but among the solutions of this equation are not only all straight lines with
positive slope, but also all those with negative slope.


Singular solutions, envelopes of families of curves. The family of all circles $y^2 + (x - C)^2 = 1$
of radius 1 whose centres lie on the x-axis satisfies the differential equation $y^2 y'^2 + y^2 - 1 = 0$,
because $yy' + x - C = 0$, $C = x + yy'$ (Fig.). It is also satisfied by the functions $y = 1$ and $y = -1$,


22.1-5 The family of all circles of radius 1 whose centres lie on the x-axis
22.1-6 Composite solution curve which fits the direction field of the differential equation $y^2 y'^2 + y^2 - 1 = 0$




which are not contained in the general integral (x — C)? + y? = 1, but represent singular solutions. 
Geometrically they are the tangents to the family of circles and fit the direction field given by the 
differential equation, even though they are not contained in the family of circles. From the line 
elements of the direction field additional curves can be constructed that likewise represent solutions. 
One of these infinitely many curves is drawn in red in the figure. 

Family of tangents to the parabola $y = x^2$ (Fig.). The equation of the tangent to the parabola
$y = x^2$ at the point $(x_0, y_0)$ is $y + y_0 = 2x x_0$. Because $y_0 = x_0^2$ and $x_0$ can be regarded
as parameter $C$, one obtains the equation $y = 2Cx - C^2$ for the family. From $y' = 2C$, $C = y'/2$,
the differential equation of the family is given by $y = xy' - y'^2/4$. The envelope of this family,
which touches every curve of the family, is clearly the parabola $y = x^2$ itself. It is not contained
in the general solution $y = 2Cx - C^2$ of the differential equation $y = xy' - y'^2/4$, but it satisfies
it and is the singular solution of the differential equation.
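
The statement that both the tangents and the parabola satisfy the same differential equation can be checked numerically. The following is a minimal sketch in Python and is not part of the original text; the function names and the sample values of x and C are illustrative assumptions.

```python
def ode_residual(x, y, dy):
    """Residual of y = x*y' - y'**2/4; it is zero when (x, y, y') satisfies the equation."""
    return x * dy - dy ** 2 / 4.0 - y


def tangent(c, x):
    """Value and slope at x of the family member y = 2*c*x - c**2."""
    return 2.0 * c * x - c ** 2, 2.0 * c


def envelope(x):
    """Value and slope at x of the parabola y = x**2."""
    return x ** 2, 2.0 * x


SAMPLE_X = [-2.0, -0.5, 0.0, 1.0, 3.0]  # arbitrary sample abscissae

# Largest residual over several tangents (members of the general solution) ...
worst_tangent = max(
    abs(ode_residual(x, *tangent(c, x))) for c in (-1.0, 0.5, 2.0) for x in SAMPLE_X
)
# ... and over the envelope (the singular solution).
worst_envelope = max(abs(ode_residual(x, *envelope(x))) for x in SAMPLE_X)

print(worst_tangent, worst_envelope)
```

Both residuals vanish up to rounding, which agrees with the parabola being a solution although it is not obtained from $y = 2Cx - C^2$ for any value of $C$.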


Family of tangents to the parabola $y = x^2$

The envelope of a family of curves is always a solution of the differential equation of the family.


From this fact follows also a method, stated here without proof, of finding the envelope of a
one-parameter family of curves when it exists. If one knows the general solution $\Phi(x, y, C) = 0$
of the differential equation corresponding to the family, then one eliminates the parameter $C$ from
the equation $\Phi(x, y, C) = 0$ and the equation $\partial\Phi(x, y, C)/\partial C = 0$ obtained by differentiating it
partially with respect to $C$. Occasionally this procedure also yields other curves, besides the
envelope, that have geometrical significance for the family of curves; for example, the cusp locus
for the family of cycloids (Fig.) or the node locus in the case in which the individual curves of
the family intersect themselves.

Isoclines, orthogonal trajectories. Points of a direction field of a differential equation of the
first order having the same field direction lie on a curve called an isocline. The equation of an
isocline is obtained by substituting $y' = \text{constant} = a$ in the equation $y' = f(x, y)$. From the
isoclines one can obtain a picture of the direction field and hence of the solution curves of a
differential equation; for example, the isoclines with $y' = a$ of the differential equation
$(x + y) y' + x - y = 0$ satisfy the equation $y = x$ for $a = 0$, $x = 0$ for $a = 1$, $y = -x$ for
$a = \infty$ and $y = 0$ for $a = -1$. The solution curves form a so-called vortex (Fig.).

22.1-9 Solution curves of the differential equation $(x + y) y' + x - y = 0$, obtained by the method of isoclines




In geometry and above all in physics the problem often arises of finding the family of orthogonal
trajectories of a family of curves. These are the curves that cut every curve of the first family at
right angles. The lines of force of a magnetic or electric dipole form the family of field lines that
are cut orthogonally by the equipotential lines (Fig.). Analytically one obtains the differential
equation of the orthogonal trajectories by replacing $y'$ by $-1/y'$ in the differential equation
$y' = f(x, y)$ belonging to the family of curves $g(x, y, C) = 0$. This method is based on the fact
that the product of the slopes of two orthogonal curves is $-1$.


22.1-10 Lines of force of a dipole cutting the equipotential lines at right angles. One family of curves 
consists of the orthogonal trajectories of the other 


Example: The orthogonal trajectories of the family of parabolas $y^2 = -2(x + C)$, whose
differential equation is $yy' = -1$, satisfy the differential equation $y' = y$ with the general solution
$y = Ce^x$. The family of exponential curves forms the family of orthogonal trajectories of the
family of parabolas, and vice versa.
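
The orthogonality of the two families can be illustrated numerically. The sketch below is an addition to the text and rests on assumptions: the helper names are hypothetical, the point of intersection is chosen arbitrarily with $y_0 \neq 0$, and the slopes are approximated by central differences instead of being taken from the differential equations.

```python
import math


def parabola_through(x0, y0):
    """Branch of y**2 = -2*(x + C) passing through (x0, y0), with y0 != 0."""
    c = -y0 ** 2 / 2.0 - x0
    sign = 1.0 if y0 > 0 else -1.0
    return lambda x: sign * math.sqrt(-2.0 * (x + c))


def exponential_through(x0, y0):
    """Curve y = K*exp(x) passing through (x0, y0)."""
    k = y0 * math.exp(-x0)
    return lambda x: k * math.exp(x)


def slope(curve, x, step=1e-6):
    """Central-difference approximation of the slope of curve at x."""
    return (curve(x + step) - curve(x - step)) / (2.0 * step)


def slope_product(x0, y0):
    """Product of the two slopes at the common point (x0, y0)."""
    return slope(parabola_through(x0, y0), x0) * slope(exponential_through(x0, y0), x0)


print(slope_product(0.3, 1.5))
```

For every admissible point the product is $-1$ up to the discretization error, as required for curves that cut at right angles.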


22.2. Elementarily integrable types 


A differential equation is called elementarily integrable if its general solution can be obtained by 
ordinary integrations (quadratures) as a combination of finitely many elementary functions. This 
is possible only for certain types of differential equations that occur frequently in applications. By 
the solution procedures dealt with in the following, the question of the existence of solutions is 
decided positively by actually giving them. 


Special types of elementarily integrable differential equations of the first order 
The general implicit differential equation of the first order $F(x, y, y') = 0$ can be solved for $y'$
in the neighbourhood of a point $(x_0, y_0, y_0')$ by the implicit function theorem, provided that
$\frac{\partial F}{\partial y'} \neq 0$ at that point; one obtains the explicit form $y' = f(x, y)$.


Differential equation of the type $y' = g(x)$. In this type of differential equation the right-hand
side depends only on $x$. If $g(x)$ is integrable in the open interval $(a, b)$, for example, if it is continuous,
then for an arbitrary but fixed $\xi$ in the interval $(a, b)$, the functions

$$y = \int_\xi^x g(t)\,dt + C, \quad a < x < b,$$

satisfy the differential equation $y' = g(x)$ for arbitrary values of the constant $C$. The integral
calculus shows that these are all the functions that satisfy it; thus, $y$ represents the general integral.


Differential equation of the type $y' = h(y)$. If the function $h(y)$ depending on $y$ only is continuous
for $c < y < d$ and is nowhere equal to zero in this open interval, then this differential equation can
be reduced to the type just considered. If $y = y(x)$ is a solution of the differential equation $y' = h(y)$,
then the inverse function $x = \varphi(y)$ satisfies the differential equation $\frac{dx}{dy} = \frac{1}{h(y)}$. Then
$x = \int \frac{dy}{h(y)}$, in an interval for the inverse function corresponding to the interval $(c, d)$, yields
the function $x = \varphi(y)$ that is inverse to the solution $y = y(x)$.

Direction field of the differential equation $y' = 1/y$

Example: The differential equation $y' = 1/y$ has a solution in the interval $0 < c < y < d$,
because all the conditions are satisfied with $h(y) = 1/y$. Its isoclines are lines parallel to the x-axis.
The integral curve of $\frac{dx}{dy} = y$ passing through the point $(\xi, \eta)$ with $\eta > 0$ in the strip
$\{-\infty < x < +\infty;\ c < y < d\}$ is obtained by solving for $y$ from
$x = \xi + \int_\eta^y y\,dy = \xi + \tfrac{1}{2}(y^2 - \eta^2)$ for $y > 0$. This gives
$y = \sqrt{\eta^2 + 2(x - \xi)}$ for $x > \xi - \eta^2/2$. As was to be expected from the
direction field, the integral curves are parabolas (Fig.).


Differential equation with variables separable. In the differential equations $y' = e^x \sin y$,
$y' = y/x^2$, $y' = (y + 1)/(x - 1)$, the right-hand side depends on both variables $x$ and $y$, but in a
particular way; it is the product of two functions, one of which, $g(x)$, depends only on $x$, and the
other, $h(y)$, depends only on $y$. This is not always the case, as is shown, for example, by $y' = \sin(xy)$
or $y' = x + y$. If the right-hand side of the differential equation $y' = f(x, y)$ can be written as a
product $g(x) \cdot h(y)$, the variables are said to be separable. In this case the differential equation
$y' = g(x) h(y)$ can be solved easily, if $g(x)$ and $h(y)$ are continuous functions and $h(y)$ is different
from zero in a whole interval $(c, d)$. From $\frac{dy}{dx} = g(x) h(y)$ one obtains $dy/h(y) = g(x)\,dx$, and after
integrating both sides

$$\int \frac{dy}{h(y)} = \int g(x)\,dx + C;$$

the general solution in $c < y < d$ is obtained by solving for $y$.


Example 1: $y' = -y/x$ for $x > 0$, $y > 0$.
Here $g(x) = -1/x$, $h(y) = y$. One obtains
$\int dy/y = -\int dx/x + C$,
$\ln y + \ln x = C$,
$\ln xy = C$,
$xy = e^C = c$.

The integral curves are hyperbolas.


22.2-2 Decrease of the pressure $p$ in the atmosphere at constant temperature as a function of the distance $h$
in km above the ground


Example 2: The atmospheric pressure $p$ varies with the height $h$ above the ground (Fig.). For
an increase in height $dh$, $p$ changes by $dp = -\rho g\,dh$, where $\rho$ is the density of the atmosphere
and $g$ the acceleration due to gravity. By Boyle's law the ratio $\rho/p = \rho_0/p_0 = a$ is constant, and
hence

$$dp = -p\,a\,g\,dh, \qquad \int \frac{dp}{p} = -\int a g\,dh + C, \qquad \ln p = -agh + C.$$

For $h = 0$ the atmospheric pressure is $p_0$, the pressure at ground level, so that $C = \ln p_0$. One
therefore obtains $\ln(p/p_0) = -agh$, or $p = p_0 e^{-agh} = p_0 e^{-\rho_0 g h/p_0}$. The pressure decreases
exponentially with increasing height; assuming a uniform atmospheric temperature, it is reduced
by half every 5.54 km approximately.
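
The quoted halving height can be reproduced from the formula $p = p_0 e^{-\rho_0 g h/p_0}$. The sketch below is an illustration only; the ground-level values of $p_0$, $\rho_0$ and $g$ are not given in the text and are assumed here to be the usual values for dry air at 0 °C.

```python
import math

# Assumed ground-level values (not stated in the text): standard pressure,
# density of dry air at 0 degrees Celsius, and the acceleration due to gravity.
P0 = 101325.0   # Pa
RHO0 = 1.293    # kg/m^3
G = 9.81        # m/s^2


def pressure(height_m):
    """Pressure at the given height from p = p0 * exp(-rho0 * g * h / p0)."""
    return P0 * math.exp(-RHO0 * G * height_m / P0)


def half_height_m():
    """Height over which the pressure falls to one half: ln(2) * p0 / (rho0 * g)."""
    return math.log(2.0) * P0 / (RHO0 * G)


print(round(half_height_m() / 1000.0, 2), "km")
```

With these assumed constants the halving height comes out close to the 5.54 km stated above; other ground-level values shift the result slightly.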


Homogeneous differential equation. A differential equation $y' = f(x, y)$ is called homogeneous
if $f(x, y)$ is a function $\varphi(y/x)$ of the quotient $y/x$; for example, $y' = \sin(y/x)$, $y' = (y/x - 1)x$ and
$y' = -x^2/y^2$. To solve the equation one introduces a new variable into the equation $y' = \varphi(y/x)$


by the substitution $y/x = t$. Then $y = tx$, $y' = \frac{dy}{dx} = t'x + t$. This leads to the differential equation
$t'x + t = \varphi(t)$, or $\frac{dt}{dx} = \frac{\varphi(t) - t}{x}$, in which the variables are separable. The general integral is

$$\int \frac{dt}{\varphi(t) - t} = \ln x + C.$$

By solving this for $t$ one obtains $t = f(x)$ and from this the required function $y = y(x)$. The method
fails if the denominator $\varphi(t) - t$ of the integrand vanishes, if $\varphi(t) = t$, that is, if the given equation
is $y' = y/x$. In this case, however, it could have been treated in the first place as a differential equation
with variables separable.

Example: In order to find all curves $y(x)$ that cut every radius vector at the same angle $\alpha$, one
selects such a vector making an angle $\varphi$ with the x-axis. At its intersection with the required curve


$y(x)$ the slope of the tangent to the curve is

$$y' = \tan(\varphi + \alpha) = \frac{\tan\varphi + \tan\alpha}{1 - \tan\varphi\tan\alpha} = \frac{(y/x) + \tan\alpha}{1 - (y/x)\tan\alpha}.$$

Writing $\tan\alpha = a$, one obtains the differential equation $y' = \frac{a + (y/x)}{1 - a \cdot (y/x)}$,
which is homogeneous and has the solution $(2/a) \arctan(y/x) + C = \ln(x^2 + y^2)$.
In polar coordinates $r = \sqrt{x^2 + y^2}$ and $\varphi = \arctan(y/x)$ the solution
has the equation $\varphi = a \ln r - (a/2) \cdot C$ or $r = e^{\varphi/a + C/2}$.
The required curves are equiangular spirals (Fig.).


22.2-3 The derivation of the differential equation of an equiangular spiral 


Linear differential equation $y' + p(x) y + q(x) = 0$. In this equation $p(x)$ and $q(x)$ are given
functions of $x$ that are assumed to be continuous. It is called a linear homogeneous differential
equation if $q(x) = 0$, that is, if there is no term not involving $y$ or $y'$. It can be solved by treating
it as a differential equation with variables separable. From $\frac{dy}{dx} + p(x) y = 0$ one obtains
$\ln y = -\int p(x)\,dx + c$, or $y = C e^{-\int p(x)\,dx}$. To obtain from this a solution of the original
inhomogeneous differential equation $y' + p(x) y + q(x) = 0$ one uses the method of variation of
the parameter, due to LAGRANGE. One regards $C$ not as a constant, but as a function of $x$,
$C = C(x)$. From $y = C(x) e^{-\int p(x)\,dx} = C(x) \varphi(x)$ one obtains $y' = C'(x) \varphi(x) + C(x) \varphi'(x)$ and by
substituting in the inhomogeneous equation

$$C'\varphi + q + C[\varphi' + p\varphi] = 0.$$

The expression inside the square bracket is zero, because $\varphi$ satisfies the homogeneous equation.
Thus, one obtains the differential equation $C'(x) \varphi(x) + q(x) = 0$ for the determination of $C(x)$.
From this one obtains $C(x) = C_1 - \int q(x) e^{\int p(x)\,dx}\,dx$.

One should not remember the final formula, but the method: setting up the homogeneous equation,
separation of the variables, variation of the parameter.


Example: $xy' - y = x^2 \cos x$, that is, $y' - y/x - x \cos x = 0$; $x \neq 0$. The homogeneous
equation $y' - (1/x) y = 0$ has the solution $y = Cx$. Variation of the parameter gives
$y' = C'(x) x + C(x)$ and substitution in the given equation gives

$C'(x) x + C(x) - C(x) - x \cos x = 0$,
$C'(x) - \cos x = 0$,
$C(x) = \sin x + C_1$.

Hence the general solution is $y = x \sin x + C_1 x$, as one can verify by substitution.
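
The verification by substitution can also be carried out numerically. The following sketch is an addition to the text: it integrates the equation in the explicit form $y' = y/x + x \cos x$ with a classical Runge-Kutta scheme, which is not the method of the text, and compares the result with $y = x \sin x + C_1 x$. The interval, the step count and the value of $C_1$ are arbitrary assumptions.

```python
import math


def rk4(f, x0, y0, x1, steps=2000):
    """Integrate y' = f(x, y) from x0 to x1 with the classical Runge-Kutta scheme."""
    h = (x1 - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2.0, y + h * k1 / 2.0)
        k3 = f(x + h / 2.0, y + h * k2 / 2.0)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
    return y


def rhs(x, y):
    """Explicit form of x*y' - y = x**2 * cos(x) for x != 0."""
    return y / x + x * math.cos(x)


def closed_form(x, c1):
    """General solution obtained by variation of the parameter."""
    return x * math.sin(x) + c1 * x


C1 = 0.7  # arbitrary constant of integration
numeric = rk4(rhs, 1.0, closed_form(1.0, C1), 3.0)
print(numeric, closed_form(3.0, C1))
```

The two printed values agree to many digits, which supports the closed form on the interval considered.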


The Bernoulli differential equation $y' + p(x) y + q(x) y^n = 0$. This equation is called after
the brothers Jakob and Johann BERNOULLI, who occupied themselves with it in 1695 and 1697 in
competition with LEIBNIZ. For $n = 0$ this differential equation is linear; for $n = 1$ the variables
are separable. One can therefore assume that $n \neq 0$, $n \neq 1$, and also that $y \neq 0$, for example,
$y > 0$. Finally, the functions $p(x)$ and $q(x)$ are assumed to be continuous in an interval
$a < x < b$. The following differential equations are of this kind:

$y' - (x^2 + 1) y - y^2 = 0$ with $n = 2$, $p(x) = -x^2 - 1$, $q(x) = -1$

or $xy' - y^3 \ln x + y = 0$ with $n = 3$, $p(x) = 1/x$, $q(x) = -(\ln x)/x$.

To solve the equation one introduces a new function $z = z(x)$ by means of the substitution
$y = z^{1/(1-n)}$. One obtains

$$y' = \frac{1}{1 - n}\, z^{n/(1-n)}\, z'(x),$$

and by substitution in the given equation, a linear differential equation for $z(x)$

$$z' + (1 - n)\, p(x)\, z + (1 - n)\, q(x) = 0.$$

From the general integral $z(x)$ of this equation the solution $y = y(x)$ of the given equation can be
obtained.

Example: In the equation $y' - 4y/x - x\sqrt{y} = 0$ one has $n = 1/2$, $x \neq 0$, $y > 0$. Substituting
$y = z^{1/(1-1/2)} = z^2$, from which $y' = 2zz'$, one obtains the linear differential equation
$z' - (2z)/x - x/2 = 0$ with the general solution $z = x^2[\tfrac{1}{2} \ln x + C]$. The integral of the given
equation is then $y = z^2 = x^4[\tfrac{1}{2} \ln x + C]^2$.
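
Both the reduced linear equation and the original Bernoulli equation can be checked against the stated solution. The sketch below is illustrative and not part of the original text; the value of the constant, the sample points (chosen so that $z > 0$) and the use of central differences for the derivative are assumptions.

```python
import math

C = 1.0  # arbitrary constant of integration, chosen so that z(x) > 0 on [1, 3]


def z(x):
    """Solution of the linear equation: z = x**2 * (ln(x)/2 + C)."""
    return x ** 2 * (0.5 * math.log(x) + C)


def y(x):
    """Solution of the Bernoulli equation: y = z**2."""
    return z(x) ** 2


def derivative(f, x, step=1e-5):
    """Central-difference approximation of f'(x)."""
    return (f(x + step) - f(x - step)) / (2.0 * step)


def linear_residual(x):
    """Left-hand side of z' - 2*z/x - x/2 = 0."""
    return derivative(z, x) - 2.0 * z(x) / x - x / 2.0


def bernoulli_residual(x):
    """Left-hand side of y' - 4*y/x - x*sqrt(y) = 0."""
    return derivative(y, x) - 4.0 * y(x) / x - x * math.sqrt(y(x))


for x in (1.0, 2.0, 3.0):
    print(x, linear_residual(x), bernoulli_residual(x))
```

The residuals are zero up to the error of the difference quotient, so the substitution $y = z^2$ indeed carries solutions of the linear equation into solutions of the given one.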


Integration of an arbitrary differential equation of the first order 


Every differential equation of the first order can be integrated if the functions contained in it 
satisfy certain conditions, for example, concerning continuity, which are stated more precisely 
in the existence theorems. It will be assumed that all the operations to be performed in the 
following, such as solution of given implicit functions, differentiation, integration, formation of 
the inverse function, etc. are possible. 


Exact differential equation. The explicit differential equation of the first order $y' = f(x, y)$ can
be expressed in the form $y' = -h(x, y)/g(x, y)$ or, in order to avoid fractions, in the form
$y'g(x, y) + h(x, y) = 0$.
If the left-hand side is the perfect derivative of a function $F(x, y)$, that is, if
$y'g(x, y) + h(x, y) = \frac{d}{dx} F(x, y)$, the differential equation is said to be exact. The equation can
then be integrated easily. From $\frac{d}{dx} F(x, y) = 0$ it follows that $F(x, y) = C$, and the general integral
$y = \varphi(x, C)$ is obtained by solving for $y$. If
$h(x, y) + y'g(x, y) = \frac{d}{dx} F(x, y) = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\, y'$ is a
perfect derivative, then one has necessarily

$$\frac{\partial F}{\partial x} = h(x, y), \qquad \frac{\partial F}{\partial y} = g(x, y).$$

From $\frac{\partial^2 F}{\partial x\,\partial y} = \frac{\partial^2 F}{\partial y\,\partial x}$ one obtains the condition of integrability
$\frac{\partial h}{\partial y} = \frac{\partial g}{\partial x}$, a necessary and sufficient
condition for the differential equation $y'g(x, y) + h(x, y) = 0$ to be exact.


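Example: The equation $y'(6xy + x^2 + 3) + 3y^2 + 2xy + 2x = 0$ is given. Here
$g(x, y) = 6xy + x^2 + 3$, $\frac{\partial g}{\partial x} = 6y + 2x$; $h(x, y) = 3y^2 + 2xy + 2x$,
$\frac{\partial h}{\partial y} = 6y + 2x$. Because $\frac{\partial g}{\partial x} = \frac{\partial h}{\partial y}$ the differential equation is exact.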


Method of solution. If a differential equation $y'g(x, y) + h(x, y) = 0$ is given, one first tests with the
help of the condition of integrability whether it is exact. If this is the case, then there exists a corresponding
function $F(x, y)$. By solving $F(x, y) = C$ for $y$ one obtains the general integral of the differential
equation. In the following it will be shown how to find the function $F(x, y)$.


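General case: $y'g(x, y) + h(x, y) = 0$
Example: $y'(6xy + x^2 + 3) + 3y^2 + 2xy + 2x = 0$

General case: $\frac{\partial F}{\partial x} = h(x, y)$, so $F = \int h(x, y)\,dx + \varphi(y)$.
Example: $\frac{\partial F}{\partial x} = 3y^2 + 2xy + 2x$, so $F = 3y^2 x + x^2 y + x^2 + \varphi(y)$.

The result of the integration with respect to $x$ is determined only to within a function $\varphi$ that is
unknown for the time being and depends on $y$ alone.

General case: $\frac{\partial F}{\partial y} = \frac{\partial}{\partial y} \int h(x, y)\,dx + \varphi'(y)$ and $\frac{\partial F}{\partial y} = g(x, y)$, so that
$\varphi'(y) = g(x, y) - \frac{\partial}{\partial y} \int h(x, y)\,dx$. Because the condition of integrability is
satisfied, one can prove that the right-hand side does not depend on $x$; thus,
$\varphi(y) = \int \left[ g - \frac{\partial}{\partial y} \int h\,dx \right] dy$ and $F(x, y) = \int h(x, y)\,dx + \varphi(y)$.
Example: $\frac{\partial F}{\partial y} = 6xy + x^2 + \varphi'(y)$, but $\frac{\partial F}{\partial y} = 6xy + x^2 + 3$ also,
that is, $6xy + x^2 + 3 = 6xy + x^2 + \varphi'(y)$. This is a differential equation for $\varphi(y)$.
Clearly $\varphi'(y) = 3$, $\varphi(y) = 3y + \text{constant}$, and $F(x, y) = 3y^2 x + x^2 y + x^2 + 3y$.

Hence the general integral is

General case: $\int h(x, y)\,dx + \varphi(y) = C$
Example: $3y^2 x + x^2 y + x^2 + 3y = C$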
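
The condition of integrability and the resulting function $F$ of the example can be checked at sample points. The sketch below is an illustration added to the text; the helper names, the sample points and the central-difference step are assumptions, and the partial derivatives are approximated numerically rather than computed symbolically.

```python
def g(x, y):
    """Coefficient of y' in the example."""
    return 6.0 * x * y + x ** 2 + 3.0


def h(x, y):
    """Remaining term of the example."""
    return 3.0 * y ** 2 + 2.0 * x * y + 2.0 * x


def F(x, y):
    """Function found by the method of solution; F(x, y) = C is the general integral."""
    return 3.0 * y ** 2 * x + x ** 2 * y + x ** 2 + 3.0 * y


def d_dx(f, x, y, eps=1e-5):
    """Central-difference approximation of the partial derivative with respect to x."""
    return (f(x + eps, y) - f(x - eps, y)) / (2.0 * eps)


def d_dy(f, x, y, eps=1e-5):
    """Central-difference approximation of the partial derivative with respect to y."""
    return (f(x, y + eps) - f(x, y - eps)) / (2.0 * eps)


def is_exact_at(x, y, tol=1e-6):
    """Condition of integrability dg/dx = dh/dy at one point."""
    return abs(d_dx(g, x, y) - d_dy(h, x, y)) < tol


for point in [(1.0, 2.0), (-0.5, 0.3), (2.0, -1.0)]:
    print(point, is_exact_at(*point),
          d_dx(F, *point) - h(*point), d_dy(F, *point) - g(*point))
```

At each point the condition holds and the two differences vanish up to rounding, in agreement with $\partial F/\partial x = h$ and $\partial F/\partial y = g$.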




Integrating factor method. If a differential equation of the form $y'g(x, y) + h(x, y) = 0$ is not
exact, one follows a method proposed by EULER and multiplies it by a function $\mu(x, y)$, which
is chosen so that the equation becomes exact, in other words, that the left-hand side of

$$y'g(x, y)\,\mu(x, y) + h(x, y)\,\mu(x, y) = 0$$

is a perfect derivative. Such a function $\mu(x, y)$ is called an Euler multiplier or an integrating factor.
Example: The differential equation $y'(xy - x^2) + y^2 - 3xy - 2x^2 = 0$ is not exact, because
$\frac{\partial g}{\partial x} = y - 2x$ and $\frac{\partial h}{\partial y} = 2y - 3x$. However, the simple function $\mu(x, y) = 2x$ is an integrating
factor, because after multiplication by $2x$ one obtains

$$y'(xy - x^2)\,2x + (y^2 - 3xy - 2x^2)\,2x = 0,$$

and now

$$\frac{\partial}{\partial x}\left[(xy - x^2)\,2x\right] = 4xy - 6x^2 \quad\text{and}\quad \frac{\partial}{\partial y}\left[(y^2 - 3xy - 2x^2)\,2x\right] = 4xy - 6x^2.$$

By the above method, integration of this exact differential equation gives the general integral
$y^2 x^2 - 2x^3 y - x^4 = C$.
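
The effect of the multiplier in this example can be made visible by comparing the two sides of the condition of integrability before and after multiplication. The following sketch is illustrative only; the function names, the sample point and the use of central differences are assumptions and not part of the original text.

```python
def g(x, y):
    """Coefficient of y' in the example."""
    return x * y - x ** 2


def h(x, y):
    """Remaining term of the example."""
    return y ** 2 - 3.0 * x * y - 2.0 * x ** 2


def mu(x, y):
    """Integrating factor stated in the example."""
    return 2.0 * x


def one(x, y):
    """Trivial multiplier, used to test the equation as given."""
    return 1.0


def exactness_defect(mult, x, y, eps=1e-5):
    """d(mult*g)/dx - d(mult*h)/dy by central differences; zero for an exact equation."""
    d_gx = (mult(x + eps, y) * g(x + eps, y) - mult(x - eps, y) * g(x - eps, y)) / (2.0 * eps)
    d_hy = (mult(x, y + eps) * h(x, y + eps) - mult(x, y - eps) * h(x, y - eps)) / (2.0 * eps)
    return d_gx - d_hy


def F(x, y):
    """General integral of the multiplied equation: F(x, y) = C."""
    return y ** 2 * x ** 2 - 2.0 * x ** 3 * y - x ** 4


print(exactness_defect(one, 1.0, 2.0), exactness_defect(mu, 1.0, 2.0))
```

Without the multiplier the defect equals $x - y$, which is not identically zero; with $\mu = 2x$ it vanishes up to rounding.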


The condition for $y'g\mu + h\mu$ to be a perfect derivative is clearly
$\frac{\partial(g\mu)}{\partial x} = \frac{\partial(\mu h)}{\partial y}$, or

$$g\,\frac{\partial\mu}{\partial x} - h\,\frac{\partial\mu}{\partial y} = \mu\left(\frac{\partial h}{\partial y} - \frac{\partial g}{\partial x}\right).$$

This is a partial differential equation for the determination of
$\mu(x, y)$. It appears that the problem of integration has only been made harder. However, since one
needs only a single particular integral of this partial differential equation, a real advantage has
nevertheless been achieved. It can even be shown that it always has an integral, that is, that always
at least one integrating factor exists for the equation $y'g + h = 0$.


Linear differential equations of higher order 


Linear differential equations of higher order occur frequently in applications. In the general
linear differential equation of the $n$th order $b_0(x) y + b_1(x) y' + b_2(x) y'' + \cdots + b_n(x) y^{(n)} = g(x)$,
the coefficients $b_i(x)$ and the perturbation function $g(x)$ are taken to be real continuous and bounded
functions of $x$, and $b_n(x)$ is assumed not to vanish in the interval considered. Dividing by $b_n(x)$ one
then obtains the form $a_0(x) y + a_1(x) y' + a_2(x) y'' + \cdots + y^{(n)} = f(x)$, in which the $a_i(x)$ and
$f(x)$ are likewise continuous and bounded. If $f(x)$ is identically equal to zero, the equation is said
to be homogeneous; otherwise it is inhomogeneous. One begins by solving the homogeneous differential
equation. This is achieved most easily when the coefficients $a_i(x)$ are constant numbers.
The linear differential equation of the second order will serve as a pattern for linear differential
equations of arbitrary order.


Linear homogeneous differential equation of the second order $a_0(x) y + a_1(x) y' + y'' = 0$. One
disregards the trivial solution $y = 0$.

Because this differential equation is linear and homogeneous in $y$ and its derivatives, if $y_1(x)$
and $y_2(x)$ are any two particular integrals, then $C_1 y_1(x)$ and $C_2 y_2(x)$ and every linear combination
$C_1 y_1(x) + C_2 y_2(x)$ are also solutions, where $C_1$ and $C_2$ denote arbitrary constants.

For a linear combination $C_1 y_1 + C_2 y_2$ of two particular solutions of the differential equation
to represent the general integral, $y_1$ and $y_2$ must be linearly independent. If they were linearly
dependent, then two constants $\alpha_1$ and $\alpha_2$ could be found, not both zero, for which
$\alpha_1 y_1 + \alpha_2 y_2 = 0$. If $\alpha_1 \neq 0$, then $y_1 = -(\alpha_2/\alpha_1) y_2 = k_1 y_2$ and if
$\alpha_2 \neq 0$, then $y_2 = -(\alpha_1/\alpha_2) y_1 = k_2 y_1$. Thus, the
two functions $y_1$ and $y_2$ would represent only the same particular integral, since one is just a multiple
of the other.


Example: $y_1 = \cos^2 x - \cos 2x$ and $y_2 = \tfrac{1}{2} \sin^2 x$ are linearly dependent, because $y_1 - 2y_2 = 0$
for all $x$. But $y_1 = x$ and $y_2 = x^2$, or $y_1 = \sin x$ and $y_2 = \cos x$ are linearly independent.


If two particular solutions $y_1(x)$ and $y_2(x)$ are linearly independent, they form a fundamental
system for the differential equation. In this case the quotient $y_1/y_2$ is not constant, and consequently
its derivative $\frac{d}{dx}\left(\frac{y_1}{y_2}\right) = \frac{y_1' y_2 - y_2' y_1}{y_2^2}$ is not identically zero. The determinant

$$\begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix} = y_1 y_2' - y_2 y_1'$$

is called the Wronskian determinant. The following theorem holds:




Two particular solutions $y_1$ and $y_2$ form a fundamental system and their linear combination
$y = C_1 y_1 + C_2 y_2$ represents the general integral of the differential equation $a_0(x) y + a_1(x) y' + y'' = 0$
if and only if the Wronskian determinant formed from them is different from zero.


Example: The linear differential equation $xy'' + 2y' + axy = 0$, in which $a$ denotes an arbitrary
number, is transformed by the substitution $u = xy$ into a differential equation with constant
coefficients. By differentiation, from $u = xy$ one obtains in succession $u' = y + xy'$, or
$y' = u'/x - u/x^2$, and hence $y'' = (u''x - u')/x^2 - (u'x - 2u)/x^3$. Substituting the expressions
for $y$, $y'$ and $y''$ into the given equation, one obtains $u'' + au = 0$. As will be shown later, for
$a = -1$ one obtains $u_1 = e^x$ and $u_2 = e^{-x}$ as solutions, and it follows that $y_1 = e^x/x$ and
$y_2 = e^{-x}/x$ represent a fundamental system for the differential equation.


For arbitrary coefficients $a_0(x)$ and $a_1(x)$ there is no general procedure for finding a fundamental
system for the differential equation

$$a_0(x) y + a_1(x) y' + y'' = 0.$$


But there are reference works in which one can look up solutions or suitable methods of solution. 
However, if the coefficients are constant numbers, there is a method that is always successful in 
setting up a fundamental system. 


Linear homogeneous differential equations of the second order with constant coefficients. The
differential equation has the form $y'' + c_1 y' + c_2 y = 0$. By the substitution $y(x) = e^{rx}$, so that
$y' = r e^{rx}$, $y'' = r^2 e^{rx}$, the equation becomes $(r^2 + c_1 r + c_2) e^{rx} = 0$. Since the exponential function
vanishes nowhere, the value of $r$ can be determined from the quadratic equation
$r^2 + c_1 r + c_2 = 0$, the characteristic equation.

If $r_1$ and $r_2$ are its roots, then $y_1 = e^{r_1 x}$ and $y_2 = e^{r_2 x}$ are particular solutions of the differential
equation. The general integral is obtained as in one of the following cases.


1. The roots $r_1$ and $r_2$ are real and distinct. Then $y_1/y_2 = e^{(r_1 - r_2)x}$ is not constant, $y_1$ and $y_2$
are linearly independent, and $y = C_1 e^{r_1 x} + C_2 e^{r_2 x}$, where $C_1$ and $C_2$ are arbitrary constants,
represents the general integral.

2. The characteristic equation has the repeated root $r_1 = r_2 = -c_1/2$; $y_1$ and $y_2$ are then linearly
dependent. By substitution one finds that $y_2 = x e^{r_1 x}$ also satisfies the differential equation
$y'' + c_1 y' + c_2 y = 0$. Because the quotient $y_2/y_1 = x$ is not constant, the particular solutions $y_1$ and $y_2$
form a fundamental system, and $y = C_1 e^{r_1 x} + C_2 x e^{r_1 x}$, where $C_1$ and $C_2$ are arbitrary constants,
is the general integral.

3. The roots $r_1$ and $r_2$ are complex. Because $c_1$ and $c_2$ are real by hypothesis, $r_1$ and $r_2$ are
conjugate complex numbers: $r_1 = \alpha + i\beta$, $r_2 = \alpha - i\beta$. The two particular solutions
$y_1 = e^{(\alpha + i\beta)x} = e^{\alpha x}(\cos \beta x + i \sin \beta x)$ and
$y_2 = e^{(\alpha - i\beta)x} = e^{\alpha x}(\cos \beta x - i \sin \beta x)$ form a fundamental system,
and the general integral is $y^* = y_1 C_1 + y_2 C_2$ or, if one substitutes $C_1^* = C_1 + C_2$ and
$C_2^* = i(C_1 - C_2)$, $y^* = e^{\alpha x}(C_1^* \cos \beta x + C_2^* \sin \beta x)$. Differential equations of this kind occur in
oscillation problems.

Example: The differential equation $y'' - 3y' + 2y = 0$ has the characteristic
equation $r^2 - 3r + 2 = 0$ with the roots $r_1 = 1$ and $r_2 = 2$. It follows
that its general integral is $y = C_1 e^x + C_2 e^{2x}$.
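
The three cases can be collected in a short routine. The sketch below is an illustration added to the text and rests on assumptions: the function names are hypothetical, the repeated-root case is detected by an exact comparison of the discriminant with zero (adequate only for simple illustrative coefficients), and in the complex case the real fundamental system $e^{\alpha x} \cos \beta x$, $e^{\alpha x} \sin \beta x$ is returned.

```python
import math


def fundamental_system(c1, c2):
    """Return (case, y1, y2) for y'' + c1*y' + c2*y = 0 with real constants c1, c2."""
    disc = c1 * c1 - 4.0 * c2  # discriminant of r**2 + c1*r + c2 = 0
    if disc > 0:
        r1 = (-c1 - math.sqrt(disc)) / 2.0
        r2 = (-c1 + math.sqrt(disc)) / 2.0
        return "distinct real", (lambda x: math.exp(r1 * x)), (lambda x: math.exp(r2 * x))
    if disc == 0:
        r = -c1 / 2.0
        return "repeated", (lambda x: math.exp(r * x)), (lambda x: x * math.exp(r * x))
    alpha, beta = -c1 / 2.0, math.sqrt(-disc) / 2.0
    return (
        "complex",
        lambda x: math.exp(alpha * x) * math.cos(beta * x),
        lambda x: math.exp(alpha * x) * math.sin(beta * x),
    )


def residual(y, c1, c2, x, step=1e-4):
    """Approximate y'' + c1*y' + c2*y at x by central differences."""
    d1 = (y(x + step) - y(x - step)) / (2.0 * step)
    d2 = (y(x + step) - 2.0 * y(x) + y(x - step)) / (step * step)
    return d2 + c1 * d1 + c2 * y(x)


case, y1, y2 = fundamental_system(-3.0, 2.0)  # the example y'' - 3y' + 2y = 0
print(case, residual(y1, -3.0, 2.0, 0.5), residual(y2, -3.0, 2.0, 0.5))
```

For the example the routine reports the case of distinct real roots with $y_1 = e^x$ and $y_2 = e^{2x}$, and the residuals are zero up to the discretization error.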

Example: Mathematical pendulum. A pendulum, whose total mass $m$ is
assumed to be concentrated at the point $A$ (Fig.), is suspended from a point
$O$ by a thread of length $l = |OA|$. It executes oscillations under the influence
of gravity, in which friction and further influences are neglected.
If the angle between the pendulum and the vertical at the time $t$ is $\varphi$, then
the force $mg$ acts vertically downwards on the mass $m$, so that the force
$mg \sin\varphi$ acts in the direction of the tangent; $g$ denotes the acceleration due to
gravity. By Newton's second law, this force is equal to the product of the
mass $m$ and the acceleration $l \cdot \frac{d^2\varphi}{dt^2}$; thus, one obtains for the angle of
inclination $\varphi(t)$ the differential equation

$$ml\,\frac{d^2\varphi}{dt^2} = -mg \sin\varphi, \quad\text{or}\quad \frac{d^2\varphi}{dt^2} + \frac{g}{l} \sin\varphi = 0.$$

This is not linear, and by separation of the variables leads to an elliptic
integral, which can be evaluated by series expansion or from tables. By


22.2-4 The mathematical pendulum 




linearization, which is frequently applied, above all in physics, one obtains a linear differential
equation that is far easier to solve. One limits oneself to small deviations $\varphi$ from the vertical, so that
one can take $\sin\varphi \approx \varphi$ and obtains $\frac{d^2\varphi}{dt^2} + \frac{g}{l}\,\varphi = 0$, with the solution

$$\varphi = \alpha \cos(\omega t + \delta),$$

where $\omega = \sqrt{g/l}$ is the angular frequency, hence $\tau = 2\pi/\omega$ is the periodic time, and $\alpha$ and $\delta$
denote the constants of integration. This formula for the solution expresses the fact, which was
already noticed by GALILEI, that the periodic time is independent of the magnitude of the
oscillation. Of course, this holds only approximately; for oscillations of greater magnitude the
periodic time is given by

$$\tau = 2\pi \sqrt{l/g}\,\left[1 + (1/2)^2 \sin^2(\varphi_0/2) + [(1 \cdot 3)/(2 \cdot 4)]^2 \sin^4(\varphi_0/2) + \cdots\right],$$

where $\varphi_0$ denotes the greatest inclination, the amplitude of the oscillation. Compared with this
precise formula, which is obtained by solving the non-linearized differential equation, the error
is only 0.002% for $\varphi_0 = 1°$ and only 0.05% for $\varphi_0 = 5°$.
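
The quoted errors follow directly from the series. The sketch below is an addition to the text; the function names, the number of terms retained and the value of $g$ are assumptions. The general coefficient $[(1 \cdot 3 \cdots (2n-1))/(2 \cdot 4 \cdots 2n)]^2$ is the standard continuation of the two terms written out above.

```python
import math


def small_angle_period(length_m, g=9.81):
    """Periodic time 2*pi*sqrt(l/g) of the linearized pendulum."""
    return 2.0 * math.pi * math.sqrt(length_m / g)


def period_series_factor(phi0_rad, terms=20):
    """Bracket 1 + (1/2)^2 sin^2(phi0/2) + (1*3/(2*4))^2 sin^4(phi0/2) + ... (truncated)."""
    s = math.sin(phi0_rad / 2.0)
    total, coeff = 1.0, 1.0
    for n in range(1, terms + 1):
        coeff *= (2.0 * n - 1.0) / (2.0 * n)  # 1/2, (1*3)/(2*4), ...
        total += coeff ** 2 * s ** (2 * n)
    return total


def relative_error_percent(phi0_deg):
    """Error of the small-angle period relative to the series value, in percent."""
    factor = period_series_factor(math.radians(phi0_deg))
    return 100.0 * (factor - 1.0) / factor


for amplitude in (1.0, 5.0):
    print(amplitude, round(relative_error_percent(amplitude), 4))
```

The computed errors are close to 0.002% for an amplitude of 1° and to 0.05% for 5°, in agreement with the figures stated above.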

Instead of the circle, a curve chosen in such a way that a body oscillating on it always has a
periodic time independent of the magnitude of the oscillations is called a tautochrone. In 1673
HUYGENS found that the cycloid has this property, and in the light of this he was able to construct
a pendulum clock. The thread of the cycloidal pendulum constructed by him wrapped itself
in oscillating around two cycloid-shaped surfaces. The pendulum bob then described a cycloid,
because the evolute of a cycloid is also a cycloid. The pendulum oscillates tautochronously.


Linear inhomogeneous differential equation of the second order $a_0(x) y + a_1(x) y' + y'' = f(x)$.


The general solution of the inhomogeneous linear differential equation of the second order is equal 
to the sum of the general solution of the corresponding homogeneous differential equation and any 
particular solution of the inhomogeneous equation. 


Thus, if $C_1 y_1(x) + C_2 y_2(x)$ is the general solution of the homogeneous differential equation and
$p(x)$ a particular integral of the inhomogeneous equation, then $y = C_1 y_1(x) + C_2 y_2(x) + p(x)$
represents the general integral of the inhomogeneous differential equation. A particular integral
$p(x)$ of the inhomogeneous differential equation can be obtained from the general integral of the
homogeneous equation by the method of variation of parameters due to LAGRANGE. In the expression

$$p(x) = C_1(x) y_1(x) + C_2(x) y_2(x)$$

the coefficients $C_1$ and $C_2$ are regarded as functions of $x$. Because two functions $C_1(x)$ and $C_2(x)$
then have to be determined, they can be made to satisfy an additional condition; one chooses them
so that $C_1' y_1 + C_2' y_2 = 0$. Substituting for $p(x)$, $p'(x)$ and $p''(x)$ in the inhomogeneous differential
equation and taking account of the assumption that $y_1$ and $y_2$ are solutions of the homogeneous
equation, one obtains the equation

$$C_1' y_1' + C_2' y_2' = f(x).$$

This, together with the additional condition, gives the system of equations

$$C_1' y_1 + C_2' y_2 = 0, \qquad C_1' y_1' + C_2' y_2' = f(x)$$

for the determination of $C_1'$ and $C_2'$; this system always has a solution, since the determinant of its
coefficients, the Wronskian determinant, never vanishes because $y_1$ and $y_2$ are linearly independent.
$C_1(x)$ and $C_2(x)$ are obtained by integration, and hence the general integral of the inhomogeneous
differential equation. However, in practice it is usually quite awkward to carry out the procedure of
variation of the parameters, above all because it leads, in general, to integrals that cannot be
evaluated in closed form.


Example: For the differential equation $xy'' + 2y' - xy = e^x$ the functions $y_1 = e^x/x$ and
$y_2 = e^{-x}/x$ form a fundamental system for the homogeneous equation. Variation of the parameters
requires a certain amount of calculation and yields $p(x) = \tfrac{1}{2} e^x$ as a particular integral of the
inhomogeneous equation. Hence the general solution is

$$y(x) = C_1 e^x/x + C_2 e^{-x}/x + \tfrac{1}{2} e^x.$$


Particular solutions of the inhomogeneous linear differential equation of the second order with
constant coefficients for special perturbation functions. If the coefficients $c_1$ and $c_2$ are constants,
a particular integral of the differential equation $y'' + c_1 y' + c_2 y = f(x)$ can be found for certain
types of perturbation function $f(x)$, without using the method of variation of the parameters.

Type 1: If the perturbation function is a polynomial, $f(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$,
$a_n \neq 0$, then one sets $p(x) = b_0 + b_1 x + \cdots + b_{n-1} x^{n-1} + b_n x^n$ if $c_2 \neq 0$; if $c_2 = 0$, however,
one also introduces into the expression the term $b_{n+1} x^{n+1}$. The required coefficients $b_0, b_1, \ldots$ are
obtained by equating coefficients.




Example: $y'' + y = x^2$, in which $a_0 = 0$, $a_1 = 0$, $a_2 = 1$, $c_2 = 1$. One sets
$p(x) = b_0 + b_1 x + b_2 x^2$, so that $p' = b_1 + 2b_2 x$, $p'' = 2b_2$. Substitution in the equation gives
$(b_0 + 2b_2) + b_1 x + b_2 x^2 = x^2$, and equating coefficients, $b_0 = -2$, $b_1 = 0$, $b_2 = 1$. Hence
$p(x) = -2 + x^2$ is a particular integral and $y = C_1 \cos x + C_2 \sin x + x^2 - 2$ is the general
solution of the differential equation.


Type 2: If the perturbation function is an exponential function, $f(x) = a e^{kx}$, one makes the
substitution $p(x) = b e^{kx}$, which contains only those exponential functions that occur in the perturbation
function. The value of $b$ is to be determined.


Example: $y'' + y = 2e^{3x}$, in which $a = 2$, $k = 3$. Set $p(x) = b e^{3x}$. By substitution one obtains
the equation $9b + b = 2$ for the determination of $b$, and hence the particular integral $p(x) = \tfrac{1}{5} e^{3x}$
and the general integral $y = \tfrac{1}{5} e^{3x} + C_1 \cos x + C_2 \sin x$.


Type 3: If the perturbation function is a trigonometric function $f(x) = a \cos mx + b \sin mx$,
one sets $p(x) = a^* \cos mx + b^* \sin mx$, into which both the cosine and the sine function enter,
even when only one of these functions occurs in the perturbation function, that is, if $a$ or $b$ is zero.
Here $a^*$ and $b^*$ are determined by equating coefficients.


Example: $y'' + y = 2 \sin 3x$, in which $a = 0$, $b = 2$, $m = 3$. Set $p(x) = a^* \cos 3x + b^* \sin 3x$.
Substituting and equating coefficients one obtains $a^* = 0$, $b^* = -\tfrac{1}{4}$. It follows that the general
integral is $y = -\tfrac{1}{4} \sin 3x + C_1 \cos x + C_2 \sin x$.


Type 4: If the perturbation function is a linear combination of functions that occur in types 1, 2 
and 3, then the particular solution is a linear combination of the corresponding individual particular 
solutions. In other words, to the general solution of the homogeneous equation must be added in 
succession the particular solutions obtained in turn by ignoring all but one of the terms of the per- 
turbation function. 


Example: $y'' + y = x^2 + 2e^{3x} + 2 \sin 3x$. It follows from the previous examples that the general
integral is $y = C_1 \cos x + C_2 \sin x + x^2 - 2 + \tfrac{1}{5} e^{3x} - \tfrac{1}{4} \sin 3x$.


All these procedures for determining particular solutions of the inhomogeneous differential 
equation fail in the case of resonance, that is, if the perturbation function or one of its terms is at 
the same time an integral of the homogeneous differential equation. Resonance occurs, for example, 
in the differential equations $y'' + y = \cos x$ and $y'' - y' = e^x$.
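
The particular integrals found in the examples for types 1 to 4 can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the function names and the finite-difference step `h` are illustrative assumptions. It substitutes $y = x^2 - 2 + \tfrac{1}{5}e^{3x} - \tfrac{1}{4}\sin 3x$ into $y'' + y$ and compares the result with the perturbation function.

```python
import math

def particular(x):
    # Particular integral assembled from the examples for types 1, 2 and 3.
    return x**2 - 2 + math.exp(3 * x) / 5 - math.sin(3 * x) / 4

def perturbation(x):
    # Right-hand side f(x) = x^2 + 2 e^{3x} + 2 sin 3x.
    return x**2 + 2 * math.exp(3 * x) + 2 * math.sin(3 * x)

def residual(x, h=1e-4):
    # y'' is approximated by a central difference (an assumption of this
    # sketch); y'' + y - f(x) should then be close to zero.
    second = (particular(x + h) - 2 * particular(x) + particular(x - h)) / h**2
    return second + particular(x) - perturbation(x)

# Largest residual on the sample points x = -1.0, -0.9, ..., 1.0.
max_residual = max(abs(residual(k / 10)) for k in range(-10, 11))
```

The residual is not exactly zero because of the difference approximation; it only shows that the superposition of the individual particular integrals is consistent with the equation.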


22.3. Further considerations 


Integration procedures in practice 


As already emphasized, it is exceptional when a differential equation is elementarily integrable. 
But there are also methods for finding integrals in the most difficult cases, at least approximately, 
for example, by approximating to the differential equation by means of a difference equation and 
applying the methods of the calculus of difference equations. In the sequel certain other important 
procedures will be described. 


Integration by means of power series. One expresses the required solution $y = y(x)$ of the
differential equation $y' = f(x, y)$ in the form of a power series $y = a_0 + a_1 x + a_2 x^2 + \cdots$ with
coefficients $a_k$ that are initially undetermined, and substitutes for $y$ and its derivatives in the
differential equation. Under certain conditions the coefficients $a_k$ of the series can then be calculated
by equating coefficients.


Example: $y' = x^2 + y$. Set $y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \cdots$, $y' = a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \cdots$;
it follows that $a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \cdots = a_0 + a_1 x + (a_2 + 1) x^2 + a_3 x^3 + \cdots$.
Equating coefficients yields $a_1 = a_0$, $a_2 = a_0/2$, $a_3 = (a_0 + 2)/6$,
$a_4 = (a_0 + 2)/24$, ... Hence one obtains as the integral $y = a_0 + a_0 x + a_0 x^2/2 + (a_0 + 2) x^3/6 + (a_0 + 2) x^4/24 + \cdots$
and from this by rearrangement $y = (a_0 + 2) + (a_0 + 2) x + (a_0 + 2) x^2/2! + (a_0 + 2) x^3/3! + \cdots - 2 - 2x - x^2$,
or $y = (a_0 + 2) e^x - 2 - 2x - x^2$.


In this particular example the final rearrangement has led to a closed form for the integral. By 
differentiation one can show that this function does indeed satisfy the differential equation. If no
closed form can be found, then the series is terminated at a suitable place according to the accuracy 
required. The convergence of the series obtained is a consequence of the following theorem, stated 
without proof. The theorem is applicable because the right-hand side of the differential equation 
is a polynomial in x and y. 
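
The equating of coefficients in this example can be written as a recursion and compared with the closed form. The sketch below is an illustration only; the function names and the choice $a_0 = 1$ are assumptions. For $y' = x^2 + y$ the coefficient of $x^k$ gives $(k + 1)\,a_{k+1} = a_k$, with an extra 1 on the right-hand side for $k = 2$.

```python
from fractions import Fraction
import math

def series_coefficients(a0, n_terms):
    # Coefficient of x^k in y' = x^2 + y: (k + 1) a_{k+1} = a_k + [k == 2].
    coeffs = [Fraction(a0)]
    for k in range(n_terms - 1):
        rhs = coeffs[k] + (1 if k == 2 else 0)
        coeffs.append(rhs / (k + 1))
    return coeffs

def closed_form(a0, x):
    # Closed form obtained by rearrangement: (a0 + 2) e^x - 2 - 2x - x^2.
    return (a0 + 2) * math.exp(x) - 2 - 2 * x - x**2

a0 = 1
coeffs = series_coefficients(a0, 20)
x = 0.5
# Series terminated after 20 terms, as suggested for the case without closed form.
partial_sum = sum(float(c) * x**k for k, c in enumerate(coeffs))
```

Exact rational arithmetic is used for the coefficients so that they can be compared directly with $(a_0 + 2)/6$ and $(a_0 + 2)/24$.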




The solution of the differential equation $y' = f(x, y)$ can always be represented by a convergent
power series if the right-hand side $f(x, y)$ can be expanded as a power series
$f(x, y) = c_{00} + c_{10} x + c_{01} y + c_{20} x^2 + c_{11} xy + c_{02} y^2 + \cdots = \sum c_{jk} x^j y^k$ that is absolutely convergent in a certain
domain of the $x, y$-plane.

The method of solution by power series can also be applied to differential equations of higher 
order, as shown by the following two important differential equations. 


Gaussian differential equation, hypergeometric series. In 1812 GAUSS studied thoroughly a particular
differential equation, which he called hypergeometric. It contains several parameters to which
arbitrary values can be assigned, and it can therefore readily be made to fit special conditions in
applications. It is

$$x(x - 1)\, y'' + [(\alpha + \beta + 1)\, x - \gamma]\, y' + \alpha\beta\, y = 0,$$

where $\alpha$, $\beta$, $\gamma$ are the parameters. For $\gamma \neq 0, -1, -2, \ldots$ it has a power series solution that can be
obtained by taking $y = \sum_{k=0}^{\infty} a_k x^k$. Let it be stated here without proof that one obtains the recursion
formula $(k + 1)(k + \gamma)\, a_{k+1} = (k + \alpha)(k + \beta)\, a_k$ and hence finally the series

$$y = a_0 \left[ 1 + \frac{\alpha\beta}{1!\,\gamma}\, x + \frac{\alpha(\alpha+1)\,\beta(\beta+1)}{2!\,\gamma(\gamma+1)}\, x^2 + \cdots \right] = a_0\, F(\alpha, \beta, \gamma, x)$$

with radius of convergence 1. The series $F(\alpha, \beta, \gamma, x)$ in the brackets, which depends on the 3 parameters
and the variable $x$, is called a hypergeometric series. It includes many functions well known
in analysis, which arise by special choices of the parameters; for example,

$F(1, \beta, \beta, x) = 1/(1 - x)$, the geometric series;

$F(-n, \beta, \beta, -x) = (1 + x)^n$; $\quad x\,F(1, 1, 2, -x) = \ln (1 + x)$;

$\lim_{\beta \to \infty} F(1, \beta, 1, x/\beta) = e^x$; $\quad \lim_{\alpha, \beta \to \infty} x\,F[\alpha, \beta, 3/2, -x^2/(4\alpha\beta)] = \sin x$.

In the general case, however, $F(\alpha, \beta, \gamma, x)$ cannot be expressed in terms of finitely many elementary
functions.
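
The special cases can be checked with partial sums built directly from the recursion formula. This is a hedged sketch, not a library routine: the function name, the number of terms and the sample value $x = 0.3$ are assumptions, and the large value $10^6$ merely stands in for the limit $\beta \to \infty$.

```python
import math

def hypergeometric(alpha, beta, gamma, x, terms=200):
    # Partial sum of F(alpha, beta, gamma, x). Successive terms follow from
    # (k + 1)(k + gamma) a_{k+1} = (k + alpha)(k + beta) a_k with a_0 = 1.
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (k + alpha) * (k + beta) / ((k + 1) * (k + gamma)) * x
    return total

x = 0.3
geometric = hypergeometric(1, 2.5, 2.5, x)          # compare with 1/(1 - x)
binomial = hypergeometric(-3, 2.5, 2.5, -x)         # compare with (1 + x)^3
logarithm = x * hypergeometric(1, 1, 2, -x)         # compare with ln(1 + x)
exponential = hypergeometric(1, 1e6, 1, x / 1e6)    # large beta approximates e^x
```

The sketch only makes sense inside the radius of convergence, that is, for $|x| < 1$ in the last argument.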


Bessel’s differential equation, cylinder functions. Following the studies of D. BERNOULLI and 
EULER, BESSEL investigated a special differential equation of the second order, which occurs in 
many problems of physics and technology, particularly those concerned with oscillations. It is the 
differential equation 

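$$x\,y'' + (1 + n)\,y' - y = 0, \quad n \text{ constant}.$$

The power series expansion $y = \sum_{k=0}^{\infty} a_k x^k$ yields here $a_{k+1} = \dfrac{a_k}{(k + 1)(k + 1 + n)}$
and hence

$$y = a_0 \left( 1 + \frac{x}{1!\,(1 + n)} + \frac{x^2}{2!\,(1 + n)(2 + n)} + \cdots \right) = a_0\, j_n(x).$$

The series $j_n(x)$ depending on $n$ (appearing in the brackets) are always convergent for $n \neq -1$,
$n \neq -2$, ... They are called Bessel functions (cylinder functions) of the first kind.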
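
A truncated series can be substituted back into the differential equation as a consistency check. The sketch below assumes the equation in the form $x y'' + (1 + n) y' - y = 0$ with the recursion $a_{k+1} = a_k / ((k + 1)(k + 1 + n))$; the function names and the sample values $n = 0.5$, $x = 2$ are illustrative assumptions.

```python
def series_coefficients(n, terms=40):
    # a_{k+1} = a_k / ((k + 1)(k + 1 + n)) with a_0 = 1.
    coeffs = [1.0]
    for k in range(terms - 1):
        coeffs.append(coeffs[k] / ((k + 1) * (k + 1 + n)))
    return coeffs

def residual(n, x, terms=40):
    # Differentiate the truncated series term by term and insert it into
    # x y'' + (1 + n) y' - y.
    a = series_coefficients(n, terms)
    y = sum(c * x**k for k, c in enumerate(a))
    dy = sum(k * c * x**(k - 1) for k, c in enumerate(a) if k >= 1)
    d2y = sum(k * (k - 1) * c * x**(k - 2) for k, c in enumerate(a) if k >= 2)
    return x * d2y + (1 + n) * dy - y

bessel_residual = residual(n=0.5, x=2.0)
```

Only the last retained term fails to cancel, so the residual shrinks rapidly as more terms are kept.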


Graphical methods of integration. For the graphical solution of differential equations numerous
integration procedures have been developed that are suitable for special types of equation and meet
the required degree of accuracy. There is space here for only basic considerations and even then only
for differential equations of the first order.


Method of polygonal arcs. If one wants to draw an integral curve of the differential equation
$y' = f(x, y)$ that passes through a fixed point $P_0(x_0, y_0)$, and consequently satisfies an initial
condition, then $y_0' = f(x_0, y_0)$ gives the direction of the tangent to the required curve at this point.
On the tangent a point $P_1(x_1, y_1)$ is taken at a certain distance from $P_0$ and the procedure is
repeated; the smaller the distance chosen, the more accurate the solution curve will be. So one obtains
an approximation for the integral curve. The figure shows the procedure for the differential equation
$y' = -y/x$ with the initial conditions $x_0 = 2$, $y_0 = 6$. It also shows points marked by red circles
that lie on the true integral curve. The deviation is considerable. Essentially more accurate is the
method of interpolated half steps (Fig.). In this method the direction $y_0' = f(x_0, y_0)$ calculated for
the point $P_0(x_0, y_0)$ is used only to calculate the direction $\eta_1' = f(\xi_1, \eta_1)$ for an intermediate point
$\Pi_1(\xi_1, \eta_1)$ (half step). The first whole step from $P_0$ to the first point $P_1(x_1, y_1)$ of the polygon is
taken in this direction $\eta_1'$. The next half step leads from $P_1$ with $y_1' = f(x_1, y_1)$ to $\Pi_2(\xi_2, \eta_2)$ and
the next whole step from $P_1$ to $P_2(x_2, y_2)$ with $\eta_2' = f(\xi_2, \eta_2)$. As can be seen by comparing the
two polygonal arcs, the approximation to the true integral curve marked by red circles is significantly
better in the second method. It could be further improved by reducing the size of the steps,
above all in those domains in which the curve changes direction rapidly.

22.3-1 Graphical integration of the differential equation $y' = -y/x$ by the method of polygonal arcs
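
Both constructions can be imitated numerically. The following sketch is an illustration under stated assumptions: the step is measured along the $x$-axis (width `h`) rather than along the tangent, the function names are hypothetical, and the true integral curve of $y' = -y/x$ through $(2, 6)$ is taken to be $y = 12/x$.

```python
def f(x, y):
    # Right-hand side of y' = -y/x.
    return -y / x

def polygonal_arcs(x, y, h, steps):
    # Each step follows the tangent direction at the current point.
    for _ in range(steps):
        y += h * f(x, y)
        x += h
    return y

def half_steps(x, y, h, steps):
    # A half step along the tangent gives an intermediate point (xi, eta);
    # the direction found there is used for the whole step.
    for _ in range(steps):
        xi, eta = x + h / 2, y + (h / 2) * f(x, y)
        y += h * f(xi, eta)
        x += h
    return y

exact = 12 / 4.0                      # y = 12/x evaluated at x = 4
error_arcs = abs(polygonal_arcs(2.0, 6.0, 0.5, 4) - exact)
error_half = abs(half_steps(2.0, 6.0, 0.5, 4) - exact)
```

With four steps of width 0.5 from $x = 2$ to $x = 4$ the half-step variant ends much closer to the true curve, in line with the comparison of the two figures.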


22.3-2 Graphical solution of the differential equation $y' = -y/x$ by the method of interpolated half-steps
with the initial condition $x_0 = 2$, $y_0 = 6$. The integral curve is shown in red

22.3-3 The special integral curve selected to take account of the initial condition $y(x_0) = y_0$


Glances at the theory 


Initial value problem, boundary value problem. The picture of the general integral of a differential 
equation of the second order is a two-parameter family of curves. In the case of an application, the 
integral curve of a special solution (that is, of a particular integral) must be extracted from these 
taking care of the special circumstances. For example, if $g$ denotes the acceleration due to gravity,
then $y'' = g$ is the differential equation of free fall. By integration one obtains $y' = gt + C_1$,
$y = \tfrac{1}{2} g t^2 + C_1 t + C_2$. But one wants to know where the falling body will be after a time $t$ if
it was at a height $y_0$ at the beginning of the fall, that is, at time $t_0 = 0$, and if its initial speed was
$v_0$. The constants of integration $C_1$ and $C_2$ can be determined from these initial conditions. From
$y' = gt + C_1$ the equation $v_0 = C_1$ is obtained for $t = t_0 = 0$, and from $y = \tfrac{1}{2} g t^2 + C_1 t + C_2$
the equation $y_0 = C_2$. Hence the particular integral required is $y = \tfrac{1}{2} g t^2 + v_0 t + y_0$.

A differential equation of the first order $y' = f(x, y)$ has the general integral $y = \varphi(x, C)$, whose
picture is a one-parameter family of curves. If the initial condition $y_0 = \varphi(x_0, C)$ is prescribed,
the constant $C$ can be determined from this equation, $C = \psi(x_0, y_0)$. The required particular
integral $y = \varphi[x; \psi(x_0, y_0)] = \Phi(x; x_0, y_0)$ depends on the initial conditions (Fig.). For physical
and technological investigations it is important that $\Phi(x; x_0, y_0)$ shall be a continuous function
of the initial values, which in practice are known only approximately. Consequently, in the theory
of differential equations the question is examined: what requirements must be placed on the
right-hand side of the equation $y' = f(x, y)$ in order that $y = \Phi(x; x_0, y_0)$ is a continuous function of
the initial conditions. Together with the continuity of the function $f(x, y)$, a sufficient condition
for this (and also for the existence and uniqueness theorem) is that a Lipschitz condition is satisfied
(see later).

For a differential equation of the second order the initial conditions $y(x_0) = y_0$ and $y'(x_0) = y_0'$
are prescribed; this requires that the integral curve passes through the point $(x_0, y_0)$ and that the
tangent at that point has the prescribed direction $y_0'$. In general, the initial conditions of a differential
equation of the $n$th order require that the integral curve passes through a point $(x_0, y_0)$ and that at
this point the first $(n - 1)$ derivatives assume prescribed values. For differential equations of higher
order also the continuous dependence of the solutions on the initial conditions must be investigated.




For differential equations of order higher than the first, not only initial conditions, but also
boundary values can be prescribed. Quite new types of problems then occur. Suppose, for example,
that the solution $y(x)$ of the differential equation $y'' + \lambda y = 0$ in the interval $0 \le x \le \pi$ is required.
It deals with an oscillation problem; one can think of a string stretched between the points $0$ and $\pi$.
At these points the string cannot be displaced from the position of rest; one must impose the boundary
conditions (of the first kind) $y(0) = 0$, $y(\pi) = 0$. From the solution $y(x) = C_1 \sin (x \sqrt{\lambda} + C_2)$ one
obtains the system of equations

$$y(0) = C_1 \sin C_2 = 0, \quad y(\pi) = C_1 \sin (\pi \sqrt{\lambda} + C_2) = 0.$$

If $\sqrt{\lambda}$ is not an integer, it follows that $C_1 = 0$, and then the solution is $y \equiv 0$. The string
would not oscillate, but would remain at rest. If, however, $\sqrt{\lambda}$ is an integer, it follows that $\lambda$ is a
perfect square, $\lambda = 1, 4, 9, \ldots$, the system of equations is satisfied with $C_2 = 0$, and one obtains infinitely many
solutions $y(x) = C_1 \sin (x \sqrt{\lambda})$. The string moves between the boundaries in the sinusoidal
oscillations $y = C_1 \sin x$, $y = C_1 \sin 2x$, $y = C_1 \sin 3x$, ...; these are the fundamental oscillation and
the possible higher harmonics for the string, in which it can oscillate without external influences,
once it is disturbed. These oscillations, which can be superimposed and can again be separated by
harmonic analysis, are therefore called characteristic (or eigen-) oscillations and the numbers $\lambda_1 = 1$,
$\lambda_2 = 4$, $\lambda_3 = 9$, ..., $\lambda_n = n^2$, ... are called the eigenvalues of this problem.
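
The selection of the eigenvalues by the second boundary condition can be illustrated with a small scan. The sketch assumes $C_2 = 0$ and $C_1 = 1$ in the solution $y(x) = C_1 \sin(x\sqrt{\lambda} + C_2)$; the candidate grid and the tolerance are arbitrary choices made for illustration.

```python
import math

def boundary_value(lam):
    # Value y(pi) = sin(pi * sqrt(lam)); the condition y(pi) = 0 holds
    # only when sqrt(lam) is an integer.
    return math.sin(math.pi * math.sqrt(lam))

candidates = [k / 2 for k in range(1, 33)]       # lam = 0.5, 1.0, ..., 16.0
eigenvalues = [lam for lam in candidates if abs(boundary_value(lam)) < 1e-9]
```

Only the perfect squares in the grid survive the test; for every other candidate the string could satisfy both boundary conditions only with $C_1 = 0$.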

In boundary value problems of the second kind the values at the boundary points $x_1$ and $x_2$ of the
derivatives $y'(x_1) = a$, $y'(x_2) = b$ are prescribed. Theory and practice require the determination
of eigenfunctions and eigenvalues. It is characteristic of these problems that non-trivial solutions
$y \not\equiv 0$ exist only for certain discrete values of a parameter, namely, for the eigenvalues.


Existence theorem, uniqueness theorem. The geometrical interpretation of a differential equation
of the first order $y' = f(x, y)$ by means of the direction field suggests the validity of the existence
theorem:


If $f(x, y)$ is a continuous function of both variables, then an integral curve passes through every
point $(x_0, y_0)$ of its domain of continuity $D$.


This theorem was, in fact, proved by PEANO. The question of the existence of an integral, in
general, was first raised in the years 1820 to 1830 by CAUCHY and was also proved by him under the
assumptions that $f(x, y)$ is continuous and has a continuous partial derivative $f_y(x, y)$. One
recognizes that the existence theorem of PEANO, in comparison with that of CAUCHY, represents an
essential sharpening, since the existence of the integral can be deduced from weaker assumptions.

It is for theory and practice an equally important question, under what assumptions about the 
right-hand side of the differential equation y’ = f(x, y) can the uniqueness of the solution also be 
deduced. Geometrically expressed, the uniqueness would be violated, for example, if the integral 
curve were to branch at a point $(x_0, y_0)$. Physically the circumstance that, in spite of the same
initial conditions, the process continued in different ways, would correspond to a violation of the 
principle of causality. 

Both the existence and the uniqueness of the solution are guaranteed when the continuous function
$f(x, y)$ satisfies a so-called Lipschitz condition in a domain $D$ of the $x, y$-plane. A function $f(t)$ of
a single variable $t$ is said to satisfy a Lipschitz condition with a constant $L$ in an interval $[a, b]$
when its difference quotients are bounded by $L$: $|f(t_1) - f(t_2)| \le L\,|t_1 - t_2|$ for all $t_1, t_2$ in $[a, b]$.
In the present situation $f(x, y)$ is a function of two variables $x$ and $y$, but $x$ is to be treated as a
constant parameter, so that the condition reads $|f(x, y_1) - f(x, y_2)| \le L\,|y_1 - y_2|$ for all points
$(x, y_1)$, $(x, y_2)$ of $D$.


Geometrically this states that on every ordinate $f(x, y)$ has a bounded difference quotient with respect
to $y$. Consequently the Lipschitz condition is certainly satisfied if the partial derivative $f_y(x, y)$ is
bounded. With this concept the existence and uniqueness theorem has the form:


Suppose that the function $f(x, y)$ is continuous and satisfies a Lipschitz condition in a domain $D$.
Then for every point $(x_0, y_0)$ in $D$ there is exactly one integral curve $y = \varphi(x)$ passing through
$(x_0, y_0)$.


The proof is based on the method of iteration, by which a sequence of successive approximations
$\varphi_n(x)$ arises if one begins with the function $\varphi_0(x) = y_0$ and defines the approximations by

$$\varphi_{n+1}(x) = y_0 + \int_{x_0}^{x} f(t, \varphi_n(t))\, dt, \quad n = 0, 1, 2, \ldots$$

It can be shown, by making essential use of the Lipschitz condition, that the approximations $\varphi_n(x)$
converge uniformly to the solution $y = \varphi(x)$: $\lim_{n \to \infty} \varphi_n(x) = \varphi(x)$.




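This proof is even constructive; it allows one in practice to obtain a solution, for example, that of
the differential equation $y' = yx$ with the initial conditions $x_0 = 0$, $y_0 = 1$, although this is easier
to solve by another method. One obtains in succession:

$$\varphi_0 = 1, \quad \varphi_1(x) = 1 + \int_0^x t\, dt = 1 + \frac{x^2}{2}, \quad \varphi_2(x) = 1 + \int_0^x \left( t + \frac{t^3}{2} \right) dt = 1 + \frac{x^2}{2} + \frac{x^4}{2 \cdot 4},$$

$$\varphi_n(x) = 1 + \frac{x^2}{2} + \frac{x^4}{2 \cdot 4} + \cdots + \frac{x^{2n}}{2 \cdot 4 \cdots 2n} = 1 + \frac{x^2/2}{1!} + \frac{(x^2/2)^2}{2!} + \cdots + \frac{(x^2/2)^n}{n!},$$

and finally,

$$\varphi(x) = \lim_{n \to \infty} \varphi_n(x) = \sum_{k=0}^{\infty} \frac{(x^2/2)^k}{k!} = e^{x^2/2}.$$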
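
For this example the successive approximations are polynomials, so the iteration can be carried out exactly on coefficient lists. The sketch below is an illustration only; the function name and the list representation (entry $k$ holds the coefficient of $x^k$) are assumptions.

```python
from fractions import Fraction

def picard_step(coeffs):
    # For y' = x*y with y(0) = 1 the next approximation is
    # 1 + integral from 0 to x of t * phi_n(t) dt.
    result = [Fraction(1), Fraction(0)]
    for k, c in enumerate(coeffs):
        # t * (c t^k) = c t^(k+1) integrates to c x^(k+2) / (k + 2).
        result.append(c / (k + 2))
    return result

phi = [Fraction(1)]            # phi_0 = y_0 = 1
iterates = [phi]
for _ in range(3):
    phi = picard_step(phi)
    iterates.append(phi)
```

The third iterate already contains the term $x^6/48 = (x^2/2)^3/3!$, matching the pattern of the exponential series.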


The question of conditions that ensure the existence and uniqueness of the integral of a dif- 
ferential equation is more difficult for the implicit differential equation of the first order F(x, y, y’) = 0. 
It is still more difficult to answer generally for differential equations of higher order. 

If the conditions of the existence and uniqueness theorems are not satisfied at a point $(x_0, y_0)$,
then there may be several or even infinitely many integral curves passing through $(x_0, y_0)$, or none
at all. Such a point is called a singular point of the differential equation (Fig.). It is of theoretical
interest that the integral curves in its neighbourhood can behave in quite an unusual way. But a 
singular point is also important for the physicist or technologist. It characterizes mathematically 
a transition point of the physical event and the critical technical data: the fracture of a beam under 
bending, the breaking of a rope under tension, a change of state etc. 


22.3-4 The behaviour of the integral curves in the neighbourhood of a singular point; a) nodes,
b) whirl, c) vortex, d) saddle point


Differential equations of higher order and systems of differential equations. The differential equation
of the $n$th order

$$F(x, y, y', \ldots, y^{(n-1)}, y^{(n)}) = 0$$

can always be written as a system of $n$ differential equations of the first order, by introducing new
functions $y_1 = y'$, $y_2 = y''$, ..., $y_{n-1} = y^{(n-1)}$. One obtains the system $y_1 = y'$, $y_2 = y_1'$, $y_3 = y_2'$, ...,
$F(x, y, y_1, \ldots, y_{n-1}, y_{n-1}') = 0$.

Similarly a system of differential equations of higher order can also always be written as a system 
of differential equations of the first order. Consequently the proof of the existence and uniqueness 
of the integral of differential equations of higher order can also be reduced to that of the integral 
of a system of differential equations of the first order and made considerably easier in this way. 
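
The reduction is also the standard way to prepare an equation of higher order for numerical integration. The sketch below assumes that the equation can be solved for the highest derivative, $y^{(n)} = g(x, y, y', \ldots, y^{(n-1)})$, which the implicit form $F = 0$ does not guarantee; the helper names and the use of the classical Runge-Kutta method are choices made for illustration.

```python
import math

def as_first_order_system(g):
    # state = [y, y_1, ..., y_{n-1}] with y_1 = y', y_2 = y_1', ...;
    # the last component has the derivative g(x, y, y_1, ..., y_{n-1}).
    def rhs(x, state):
        return state[1:] + [g(x, *state)]
    return rhs

def rk4(rhs, x, state, h, steps):
    # Classical fourth-order Runge-Kutta steps applied to the whole system.
    for _ in range(steps):
        k1 = rhs(x, state)
        k2 = rhs(x + h / 2, [s + h / 2 * k for s, k in zip(state, k1)])
        k3 = rhs(x + h / 2, [s + h / 2 * k for s, k in zip(state, k2)])
        k4 = rhs(x + h, [s + h * k for s, k in zip(state, k3)])
        state = [s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4)]
        x += h
    return state

# y'' = -y with y(0) = 0, y'(0) = 1 has the solution y = sin x.
rhs = as_first_order_system(lambda x, y, y1: -y)
y_end, dy_end = rk4(rhs, 0.0, [0.0, 1.0], math.pi / 2000, 1000)
```

Integrating up to $x = \pi/2$ should return values close to $\sin(\pi/2) = 1$ and $\cos(\pi/2) = 0$.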

Applications in mechanics. These ideas are of great significance in mechanics. The fundamental 
concept of acceleration is expressed mathematically by the second derivative with respect to the 
time ¢ of the space coordinates x(t), y(t), z(t) of the point mass. It follows that in order to determine 
the motion of a point of mass m one must integrate the system of equations 

$$m \frac{d^2 x}{dt^2} = P(x, y, z), \quad m \frac{d^2 y}{dt^2} = Q(x, y, z), \quad m \frac{d^2 z}{dt^2} = R(x, y, z),$$

where $P$, $Q$ and $R$, depending on the space coordinates $(x, y, z)$ of the point, denote the components
of the force acting on the point mass. If one reduces this system to a system of differential equations
of the first order, one obtains a system of 6 differential equations

$$u = \dot{x}, \quad v = \dot{y}, \quad w = \dot{z}, \quad m\dot{u} = P(x, y, z), \quad m\dot{v} = Q(x, y, z), \quad m\dot{w} = R(x, y, z),$$

in which the new functions $u$, $v$, $w$ denote the velocities. Systems of differential equations occur most
frequently in which the number of functions required agrees with the number of differential equations.


In general, a system of $n$ differential equations for $n$ functions of the variable $t$, in which the equations
are solved for the derivatives, has the form

$$\frac{dx_i}{dt} = f_i(t, x_1, x_2, \ldots, x_n), \quad i = 1, 2, \ldots, n.$$

A solution or integral of this system is a system $x_1(t), x_2(t), \ldots, x_n(t)$ which when substituted in the
given system makes each individual equation an identity in $t$. The real difficulty of integration lies,
as one can see, in the fact that one cannot integrate one differential equation after another, because
one required function $x_i$ is also contained in the right-hand side of another differential equation.




Physically speaking, the individual motions expressed by the $x_i(t)$ influence one
another; the physicist speaks of coupling. One can think, for example, of two
pendulums oscillating, not independently of one another, but coupled by means
of a spring attached to each of the pendulum rods (Fig.).

To simplify the statement of the problem and to help the intuition one prefers
to use the concepts of multidimensional geometry in the theory of differential
equations. If the integral of a differential equation of the first order is
represented geometrically by an integral curve in the $x, y$-plane, then the integrals
of the system $dx_i/dt = f_i$, $i = 1, 2, \ldots, n$, can be regarded as integral curves
in $n$-dimensional space, in which the $x_i(t)$, $i = 1, 2, \ldots, n$, describe the
coordinates of the motion of a point.

Figure: coupled pendulums (a system of differential equations)

The theory of first integrals. With the help of this nomenclature one can
sketch the theory of first integrals, which is also of special importance for
physics. An equation $F(x_1, \ldots, x_n) = C$, where $C$ is constant, defines in the $n$-dimensional
coordinate space of the $x_i$ an $(n - 1)$-dimensional hypersurface, and for variable $C$ a one-parameter
family of hypersurfaces. If $n - 1$ families of hypersurfaces

$$F_i(x_1, \ldots, x_n) = C_i, \quad i = 1, 2, \ldots, n - 1$$

are given, and from each family one hypersurface is selected, then, in general, these will intersect
in a curve of the $n$-dimensional space. Altogether in this way a family of curves is obtained that
depends on the $n - 1$ parameters $C_i$, $i = 1, 2, \ldots, n - 1$. When does this family represent the integral
curves of the system $dx_i/dt = f_i$, $i = 1, 2, \ldots, n$? For this every integral curve of the family must
lie completely in a particular hypersurface of each family of hypersurfaces; thus, each $F_i(x_1, \ldots, x_n)$
must be constant. Every function $F(x_1, \ldots, x_n)$ that is constant along all the integral curves of the
system is called a first integral of this system. For a function $F(x_1, \ldots, x_n)$ having a total differential
to be a first integral, it is necessary and sufficient that

$$\frac{\partial F}{\partial x_1} f_1 + \frac{\partial F}{\partial x_2} f_2 + \cdots + \frac{\partial F}{\partial x_n} f_n = 0.$$

All the functions $F_i$, $i = 1, 2, \ldots, n - 1$, must satisfy this condition; hence one has to solve the
system of equations

$$\sum_{k=1}^{n} \frac{\partial F_i}{\partial x_k} f_k = 0, \quad i = 1, 2, \ldots, n - 1.$$

The $f_k = dx_k/dt$ are known. Consequently it is possible, in general, to integrate the system with a
knowledge of $n - 1$ first integrals. If not all $n - 1$ first integrals are known, but only $m < n - 1$
of them, then an $(n - m)$-dimensional manifold is determined by the fixed

$$F_i(x_1, x_2, \ldots, x_n) = C_i, \quad i = 1, 2, \ldots, m.$$

The integral curves lying in this manifold are to be determined. One then has a system of $n - 1 - m$
differential equations of the first order; a knowledge of $m$ first integrals reduces the number of
equations of the system by $m$.

This possibility is of great significance in mechanics, for example in celestial mechanics. The 
celebrated three body problem investigates the motion of three mutually attracting masses, for 
example, the sun and two planets. It leads to 18 differential equations with 18 unknown functions, 
namely 9 coordinate functions and 9 velocity components. With the help of 12 known first integrals 
this problem is reduced to the integration of 6 differential equations of the first order. 
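
The condition for a first integral is easy to test at sample points. The sketch below uses a hypothetical example that does not appear in the text: the system $\dot{x}_1 = x_2$, $\dot{x}_2 = -x_1$ with the candidate $F = x_1^2 + x_2^2$, and for contrast the function $F = x_1$; all names are illustrative.

```python
def first_integral_condition(grad_F, f, point):
    # Evaluates the sum over k of (dF/dx_k) * f_k at one point; for a first
    # integral of dx_k/dt = f_k this sum vanishes at every point.
    return sum(g * fk for g, fk in zip(grad_F(point), f(point)))

# Hypothetical example system: x_1' = x_2, x_2' = -x_1.
f = lambda p: (p[1], -p[0])
grad_energy = lambda p: (2 * p[0], 2 * p[1])     # gradient of x_1^2 + x_2^2
grad_other = lambda p: (1.0, 0.0)                # gradient of x_1

points = [(0.3, -1.2), (2.0, 0.5), (-1.0, 4.0)]
energy_values = [first_integral_condition(grad_energy, f, p) for p in points]
other_values = [first_integral_condition(grad_other, f, p) for p in points]
```

A vanishing sum at a few sample points is of course only a necessary check, not a proof that the condition holds identically.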


23. Complex analysis 


23.1. Differentiation and integration of complex-valued functions
23.2. Applications of complex analysis
23.3. The total course of complex-valued functions
23.4. Elliptic integrals


23.1. Differentiation and integration of complex-valued functions 


Complex-valued functions. Two real-valued functions $u$ and $v$ defined in a set $M$ of the $x, y$-plane
assign to every point $(x, y) \in M$ a point $(u, v)$ of the $u, v$-plane. If each of these points $(x, y)$ and
$(u, v)$ is regarded as a complex number $z = x + iy$ and $w = u + iv$, then to every complex number
$z \in M$ there corresponds a complex number $w = f(z) = u(x, y) + iv(x, y)$. This correspondence
represents a complex-valued function $f$ (Fig.). It is called continuous at a point $z_0 \in M$ if for every
sequence $\{z_n\}$, $n = 1, 2, \ldots$, with $z_n \in M$ that converges to $z_0$ the sequence $f(z_n)$ converges to
$f(z_0)$. Here a sequence $\{z_n\}$, $n = 1, 2, \ldots$, of complex numbers is said to converge if the sequences
$\{\operatorname{Re} z_n\}$ of real parts and $\{\operatorname{Im} z_n\}$ of imaginary parts converge. But this means that $f$ is continuous
at $z_0 = x_0 + iy_0$ if and only if $u$ and $v$ are continuous at $(x_0, y_0)$. A function defined in $M$ is called
continuous on $M$ if it is continuous at every point of $M$.


23.1-1 Correspondence between points $z = x + iy$ and points $w = u + iv$ under a complex-valued
function $w = f(z) = u(x, y) + iv(x, y)$

23.1-2 Subdivision of a curve $\gamma$ represented by $z(t) = x(t) + iy(t)$


Complex curvilinear integrals. A continuous curve in the $z$-plane is
a point set $\gamma$ that can be represented in the form $z = z(t) = x(t) + iy(t)$,
where $a \le t \le b$ and $x(t)$, $y(t)$ are continuous real-valued functions of $t$.
A curve is called continuously differentiable if the functions $x(t)$ and $y(t)$
have continuous first derivatives; the curve then has a finite length $l(\gamma)$ (see Chapter 20.3. - Arc length
and surface area). Suppose now that the interval $[a, b]$ is divided into $k$ subintervals $[a, t_1]$, $[t_1, t_2]$, ...,
$[t_{k-1}, b]$ by points $t_j$, $j = 0, 1, \ldots, k$, $t_0 = a$, $t_k = b$, to which on the curve there correspond the
points $z_j = z(t_j)$ (Fig.). If the set $\gamma$ is contained in the domain of definition $M$ of a complex-valued
function $f$ that is continuous on $M$, then for distinguished subdivisions the sums $\sum_{j=1}^{k} f(z_j)\,(z_j - z_{j-1})$
converge to a complex number, which is called the complex curvilinear integral $\int_\gamma f(z)\, dz$ of $f$ along
$\gamma$. Here a sequence of subdivisions is called distinguished if the lengths of the longest subintervals
form a null sequence for $k \to \infty$. The limit is independent of the choice of the distinguished sequence
of subdivisions. By $f = u + iv$ and $z_j = x(t_j) + iy(t_j)$ one obtains for the products in the sum

$$f(z_j)\,(z_j - z_{j-1}) = u(x_j, y_j)\,(x_j - x_{j-1}) - v(x_j, y_j)\,(y_j - y_{j-1}) + i\,[v(x_j, y_j)\,(x_j - x_{j-1}) + u(x_j, y_j)\,(y_j - y_{j-1})]$$

and according to the definition of a real curvilinear integral of the second kind (see Chapter 20.3. -
Line and surface integrals)

$$\int_\gamma f(z)\, dz = \int_\gamma (u\, dx - v\, dy) + i \int_\gamma (v\, dx + u\, dy)
= \int_{t=a}^{b} \left( u \frac{dx}{dt} - v \frac{dy}{dt} \right) dt + i \int_{t=a}^{b} \left( v \frac{dx}{dt} + u \frac{dy}{dt} \right) dt,$$

because, for example, $\int_\gamma u(x, y)\, dx = \int_{t=a}^{b} u(x(t), y(t))\, \frac{dx}{dt}\, dt$. By the definition
$\frac{dz}{dt} = \frac{dx}{dt} + i \frac{dy}{dt}$ one finally obtains $\int_\gamma f(z)\, dz = \int_{t=a}^{b} f(z(t))\, \frac{dz(t)}{dt}\, dt$.
Let $\bar{z} = \alpha - i\beta$ be the complex number
conjugate to $z = \alpha + i\beta$; this means that $\alpha = (z + \bar{z})/2$ and $\beta = (z - \bar{z})/(2i)$. Correspondingly
one obtains from sums $\sum_{j=1}^{k} f(z_j)\,(\bar{z}_j - \bar{z}_{j-1})$ the curvilinear integral
$\int_\gamma f(z)\, d\bar{z} = \int_{t=a}^{b} f(z(t))\, \frac{d\bar{z}(t)}{dt}\, dt$.


Example: The circle $\gamma$ with centre at $z = 0$ and radius $r = 1$ (Fig.) traversed in the positive
sense can be represented by $z(t) = \cos t + i \sin t$ with $0 \le t \le 2\pi$. For this representation one
obtains

$$\int_\gamma \frac{dz}{z} = \int_{t=0}^{2\pi} \frac{1}{z(t)}\, \frac{dz(t)}{dt}\, dt = \int_{t=0}^{2\pi} \frac{-\sin t + i \cos t}{\cos t + i \sin t}\, dt = i \int_{t=0}^{2\pi} dt = 2\pi i.$$

23.1-3 The integral $\int_\gamma dz/z = 2\pi i$, where the unit circle $\gamma$ is traversed in the positive
sense
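
The defining sums $\sum f(z_j)\,(z_j - z_{j-1})$ can be evaluated directly for this example. The following sketch uses an equidistant subdivision of the parameter interval, which is one possible distinguished sequence; the function name and the number of subintervals are assumptions.

```python
import cmath
import math

def curvilinear_integral(f, z, a, b, k=20000):
    # Sum of f(z_j) * (z_j - z_{j-1}) over an equidistant subdivision of [a, b].
    total = 0j
    previous = z(a)
    for j in range(1, k + 1):
        current = z(a + (b - a) * j / k)
        total += f(current) * (current - previous)
        previous = current
    return total

unit_circle = lambda t: cmath.exp(1j * t)       # z(t) = cos t + i sin t
integral = curvilinear_integral(lambda z: 1 / z, unit_circle, 0.0, 2 * math.pi)
```

The sum approaches $2\pi i$ only slowly, roughly like $1/k$, so a fairly fine subdivision is used here.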




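Complex partial differentiations. Two real-valued functions $u$ and $v$ defined in an open set $M$
of the $z$-plane and having continuous partial derivatives of the first order can be linearized at
$(x_0, y_0) \in M$, that is, approximated by polynomials of the first degree:

$$u(x, y) \approx u(x_0, y_0) + \frac{\partial u(x_0, y_0)}{\partial x}\,(x - x_0) + \frac{\partial u(x_0, y_0)}{\partial y}\,(y - y_0)$$

and

$$v(x, y) \approx v(x_0, y_0) + \frac{\partial v(x_0, y_0)}{\partial x}\,(x - x_0) + \frac{\partial v(x_0, y_0)}{\partial y}\,(y - y_0).$$

Consequently $f = u + iv$ can be linearized at $z_0 = x_0 + iy_0$ by

$$f(z) \approx f(z_0) + \left[ \frac{\partial u(x_0, y_0)}{\partial x} + i \frac{\partial v(x_0, y_0)}{\partial x} \right] (x - x_0) + \left[ \frac{\partial u(x_0, y_0)}{\partial y} + i \frac{\partial v(x_0, y_0)}{\partial y} \right] (y - y_0).$$

After substituting $x - x_0 = [(z - z_0) + \overline{(z - z_0)}]/2$ and $y - y_0 = [(z - z_0) - \overline{(z - z_0)}]/(2i)$
and observing the relations $\frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x} = \frac{\partial f}{\partial x}$ and $\frac{\partial u}{\partial y} + i \frac{\partial v}{\partial y} = \frac{\partial f}{\partial y}$ this linearization becomes

$$f(z) \approx f(z_0) + \frac{1}{2} \left[ \frac{\partial f(z_0)}{\partial x} - i \frac{\partial f(z_0)}{\partial y} \right] (z - z_0) + \frac{1}{2} \left[ \frac{\partial f(z_0)}{\partial x} + i \frac{\partial f(z_0)}{\partial y} \right] \overline{(z - z_0)}.$$

Starting out from this one defines the complex partial derivatives of the first order of $f$ relative to
$z$ and $\bar{z}$ at the point $z_0$:

$$\frac{\partial f(z_0)}{\partial z} = \frac{1}{2} \left[ \frac{\partial f(z_0)}{\partial x} - i \frac{\partial f(z_0)}{\partial y} \right] \quad \text{and} \quad \frac{\partial f(z_0)}{\partial \bar{z}} = \frac{1}{2} \left[ \frac{\partial f(z_0)}{\partial x} + i \frac{\partial f(z_0)}{\partial y} \right].$$

For these differentiations the standard rules for real-valued functions are valid, for example,

$$\frac{\partial [f(z) + g(z)]}{\partial z} = \frac{\partial f(z)}{\partial z} + \frac{\partial g(z)}{\partial z} \quad \text{and} \quad \frac{\partial [f(z) \cdot g(z)]}{\partial z} = f(z)\, \frac{\partial g(z)}{\partial z} + g(z)\, \frac{\partial f(z)}{\partial z}.$$

Examples: 1. For $f(z) = \text{const}$ it follows that $\frac{\partial f}{\partial x} = 0$ and $\frac{\partial f}{\partial y} = 0$, hence $\frac{\partial f}{\partial z} = 0$
and $\frac{\partial f}{\partial \bar{z}} = 0$.
2. For $f(z) = z = x + iy$ it follows that $\frac{\partial f(z)}{\partial x} = 1$ and $\frac{\partial f(z)}{\partial y} = i$, hence $\frac{\partial f(z)}{\partial z} = 1$ and
$\frac{\partial f(z)}{\partial \bar{z}} = 0$.
3. For $f(z) = \bar{z} = x - iy$ it follows that $\frac{\partial f(z)}{\partial x} = 1$ and $\frac{\partial f(z)}{\partial y} = -i$, hence $\frac{\partial f(z)}{\partial z} = 0$ and
$\frac{\partial f(z)}{\partial \bar{z}} = 1$.
4. The rule for the differentiation of a product applied to $z^2 = z \cdot z$, $z^3 = z^2 \cdot z$, ... gives by induction
$\frac{\partial z^2}{\partial z} = 2z$, $\frac{\partial z^3}{\partial z} = 3z^2$, ..., $\frac{\partial z^n}{\partial z} = n z^{n-1}$, as well as $\frac{\partial z^2}{\partial \bar{z}} = 0$, $\frac{\partial z^3}{\partial \bar{z}} = 0$, ..., $\frac{\partial z^n}{\partial \bar{z}} = 0$.
5. For a polynomial $f(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n$ with constant coefficients $a_j$ it follows
that $\frac{\partial f(z)}{\partial z} = 0 + a_1 + 2a_2 z + \cdots + n a_n z^{n-1}$ and $\frac{\partial f(z)}{\partial \bar{z}} = 0$.

The sequence $\{s_n(z)\}$ of partial sums $s_n(z) = \sum_{j=0}^{n} a_j (z - z_0)^j$ of the power series $\sum_{j=0}^{\infty} a_j (z - z_0)^j$
converges, as $n \to \infty$, either at $z = z_0$ only, or in a circle of convergence $|z - z_0| < R$, or in the whole plane.
The limit function $f$ satisfies $\frac{\partial f(z)}{\partial z} = \sum_{j=1}^{\infty} j a_j (z - z_0)^{j-1}$ and $\frac{\partial f(z)}{\partial \bar{z}} = 0$ (see Chapter 21.). A special
case is the definition of the complex exponential function $\exp z = \sum_{j=0}^{\infty} \frac{z^j}{j!}$; its circle of
convergence is the whole $z$-plane, and its partial derivative is $\frac{\partial \exp z}{\partial z} = \exp z$.


The exponential function satisfies Euler's formula $\exp (i\varphi) = \cos \varphi + i \sin \varphi$. The functional
equation $\exp (z_1 + z_2) = \exp z_1 \cdot \exp z_2$ or $\exp z_0 = \exp (z_0 - z) \cdot \exp z$ results from

$$\frac{\partial}{\partial z} [\exp (z_0 - z) \cdot \exp z] = -\exp (z_0 - z) \exp z + \exp (z_0 - z) \exp z = 0,$$

because this shows that $\exp (z_0 - z) \exp z = \text{const} = \exp z_0$.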
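
The two complex partial derivatives can be approximated from their definitions by difference quotients in $x$ and $y$. This is a minimal sketch; the function names, the step `h` and the sample point are assumptions chosen for illustration.

```python
def partial_derivatives(f, z, h=1e-6):
    # Central differences for df/dx and df/dy at z = x + iy.
    fx = (f(z + h) - f(z - h)) / (2 * h)
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)
    return fx, fy

def d_dz(f, z):
    # (1/2) (df/dx - i df/dy)
    fx, fy = partial_derivatives(f, z)
    return (fx - 1j * fy) / 2

def d_dzbar(f, z):
    # (1/2) (df/dx + i df/dy)
    fx, fy = partial_derivatives(f, z)
    return (fx + 1j * fy) / 2

z0 = 1.5 - 0.5j
cube_dz = d_dz(lambda z: z**3, z0)                    # compare with 3 z0^2
cube_dzbar = d_dzbar(lambda z: z**3, z0)              # compare with 0
conj_dz = d_dz(lambda z: z.conjugate(), z0)           # compare with 0
conj_dzbar = d_dzbar(lambda z: z.conjugate(), z0)     # compare with 1
```

The results reproduce examples 3 and 4 up to the error of the difference quotients.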
Holomorphic functions. A function $f$ defined in an open set $M$ is called holomorphic if
$\frac{\partial f(z)}{\partial \bar{z}} = 0$ for every point $z \in M$. For holomorphic functions one writes $\frac{df}{dz}$ or $f'$ instead of $\frac{\partial f}{\partial z}$. As already
mentioned, the limit function of a power series is holomorphic. A
domain $D$ is an open point set in which any two points can be connected
by a curve lying entirely in $D$. In a simply-connected domain
$D$ two such curves with the same initial point and the same end-point can
always be deformed into each other continuously so that the domain $D$
is never left (Fig.).


23.1-4 a) Simply-connected, b) multiply-connected domain $D$


For every holomorphic function f defined in a simply-connected domain D there exists one and, 
dF(z) 


dz 


apart from an additive constant, only one holomorphic primitive function F for which f(z) = 
Along any curve y leading from z, to z, in D one has [ f(z) dz = F(z2) — F(z;). 
ry 


Example: For $f(z) = z^2$ a primitive $F$ is $F(z) = (1/3)\,z^3$. If $\gamma$ joins the point $z_1 = 1$ to $z_2 = 2 + i$, then $\int_\gamma z^2\,dz = (1/3)(2 + i)^3 - (1/3)\,1^3 = 1/3 + (11/3)\,i$.
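The example can be checked numerically. The following is a minimal sketch, assuming a straight segment as the curve (any other curve in the plane would give the same value) and a simple midpoint rule; the helper names `segment_integral` and `F` are illustrative only.

```python
def segment_integral(f, z_start, z_end, n=2000):
    """Midpoint-rule approximation of the integral of f along the
    straight segment from z_start to z_end (an assumed choice of curve)."""
    dz = (z_end - z_start) / n
    return sum(f(z_start + (k + 0.5) * dz) for k in range(n)) * dz


def F(z):
    """Primitive of f(z) = z**2."""
    return z ** 3 / 3


z1, z2 = 1 + 0j, 2 + 1j
numeric = segment_integral(lambda z: z * z, z1, z2)
exact = F(z2) - F(z1)   # the value 1/3 + (11/3) i of the example
print(numeric, exact)
```

The two printed values should agree to several decimal places, which illustrates that the integral depends only on the end-points.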
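Let $\gamma$ be the boundary of a part $B$ of $D$. Then the theorem of Ostrogradskii-Gauss holds:

$\int_\gamma (a\,dx + b\,dy) = \iint_B \left(\frac{\partial b}{\partial x} - \frac{\partial a}{\partial y}\right) dx\,dy.$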


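From this theorem it follows that

$\int_\gamma f(z)\,dz = \int_\gamma (u\,dx - v\,dy) + i\int_\gamma (u\,dy + v\,dx)$
$= \iint_B \left(-\frac{\partial v}{\partial x} - \frac{\partial u}{\partial y}\right) dx\,dy + i\iint_B \left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) dx\,dy$
$= i\iint_B \left[\left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) + i\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right)\right] dx\,dy = i\iint_B \left[\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right] dx\,dy = 2i\iint_B \frac{\partial f(z)}{\partial \bar z}\,dx\,dy.$

Since for holomorphic functions $\frac{\partial f(z)}{\partial \bar z} = 0$, it follows that $\int_\gamma f(z)\,dz = 0$. This is a way of proving the Cauchy integral theorem.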


Cauchy's integral theorem: If $\gamma_0$ is a closed curve in a simply-connected domain $D$, then $\int_{\gamma_0} f(z)\,dz = 0$ for every function $f$ holomorphic in $D$.


Cauchy's integral theorem also follows from the theorem of the existence of a primitive function $F(z)$ for $f$ in a simply-connected domain. For if $\gamma$ is closed, then the initial point $z_1$ is the same as the end-point $z_2$, and therefore $\int_\gamma f(z)\,dz = F(z_2) - F(z_1) = 0$.


The values of a holomorphic function $f$ in the interior of a circular disc $|z - z_0| < R$ entirely contained in $M$ are determined, according to Cauchy's integral formula, by the values $f(\zeta)$ which $f$ assumes on the boundary $\gamma$ of this disc, traversed in the positive direction (Fig.). For by Cauchy's integral theorem

$\int_{\gamma_{\varrho 1}} \frac{f(\zeta)}{\zeta - z}\,d\zeta = 0, \qquad \int_{\gamma_{\varrho 2}} \frac{f(\zeta)}{\zeta - z}\,d\zeta = 0$

along the closed curves $\gamma_{\varrho 1}$ and $\gamma_{\varrho 2}$. If one adds the integrals and lets the radius $\varrho$ tend to zero, one obtains Cauchy's integral formula $f(z) = \frac{1}{2\pi i}\int_\gamma \frac{f(\zeta)}{\zeta - z}\,d\zeta$, using the example on complex curvilinear integrals (on page 518).

A function $f$ holomorphic in the disc $|z - z_0| < R$ can be represented as limit function by a uniquely determined power series $f(z) = \sum_{j=0}^{\infty} a_j (z - z_0)^j$ that converges in this disc, and has the derivative $\frac{df(z)}{dz} = \sum_{j=1}^{\infty} j a_j (z - z_0)^{j-1}$, which is also holomorphic. By repeated differentiation one obtains holomorphic derivatives of every order $k$, which can also be obtained from Cauchy's integral formula by differentiation. Setting $z = z_0$ one can determine by comparison of the two results the coefficients $a_j$ of the series for $f(z)$, beginning with $a_0 = f(z_0)$.




A function $f$ holomorphic in $|z - z_0| < R$ and uniquely represented by the power series $f(z) = \sum_{j=0}^{\infty} a_j (z - z_0)^j$ has holomorphic derivatives $\frac{d^k f(z)}{dz^k} = \frac{k!}{2\pi i}\int_{\gamma_0} \frac{f(\zeta)}{(\zeta - z)^{k+1}}\,d\zeta$ of every order $k$, and the coefficients of the power series are determined by $a_j = \frac{1}{2\pi i}\int_{\gamma_0} \frac{f(\zeta)}{(\zeta - z_0)^{j+1}}\,d\zeta$, where $\gamma_0$ is a curve in the interior of the disc of convergence that goes around $z$ or $z_0$ once in the positive direction.


Isolated singularities of holomorphic functions. Let $M$ be an open set containing the punctured disc $0 < |z - z_0| < R$ (Fig.), but not necessarily the centre $z_0$ of the disc. Let $f$ be a function holomorphic in $M$. Then $f$ can be represented in the punctured disc by a uniquely determined Laurent expansion $f(z) = \sum_{j=-\infty}^{\infty} a_j (z - z_0)^j$, whose principal part consists of the terms with negative powers of $z - z_0$. The coefficient $a_{-1} = \operatorname{Res}_{z_0} f$ of its first term is called the residue of $f$ at $z_0$. The point $z_0$ is called an isolated singularity of $f$.


23.1-6 Laurent expansion in a punctured open disc


If $a_j = 0$ for all negative $j$, that is, if the principal part is zero, then the Laurent series reduces to a power series, and $f$ is holomorphic in the entire disc $|z - z_0| < R$ if one sets $f(z_0) = a_0$. In this case $z_0$ is called a removable singularity of $f$. It is called a pole of order $n$ of $f$ if $a_{-n} \neq 0$, but $a_{-n-1} = 0, a_{-n-2} = 0, \ldots$, that is, if only finitely many $a_j$ with negative $j$ differ from zero. Then $|f(z)|$ becomes arbitrarily large for places $z$ sufficiently near to $z_0$. An essential singularity $z_0$ of $f$ occurs if $a_j \neq 0$ for infinitely many negative $j$. By a theorem of Casorati-Weierstrass $f$ then comes arbitrarily near to every complex value in every neighbourhood of $z_0$.

A function $f$ is called meromorphic in an open set $M$ if it is holomorphic in $M$ apart from removable singularities or poles.


Example: The function f represented by f(z) = —- Ta = — - wat 


z : ee | 
ae ey is meromorphic in the whole plane; it has a pole of the first order at each of the 
places z; = —1, zz =i, and z3 = —i. 
If one multiplies a meromorphic function $f$ having a pole of order at most $n$ at $z_0$ by $(z - z_0)^n$, then a power series arises in which $a_{-1}$ is the coefficient of the term $a_{-1}(z - z_0)^{n-1}$. By repeated differentiation one obtains from this the residue $a_{-1}$ of the function $f$:

$\operatorname{Res}_{z_0} f = a_{-1} = \frac{1}{(n-1)!}\lim_{z \to z_0} \frac{d^{n-1}}{dz^{n-1}}\left[(z - z_0)^n f(z)\right].$
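As an illustration of the residue as a Laurent coefficient, the following sketch compares a numerically evaluated contour integral with the derivative formula for the assumed test function $\exp(z)/z^3$, which has a pole of order 3 at $z_0 = 0$. The function and the helper `circle_integral` are illustrative choices, not taken from the text.

```python
import cmath
import math


def circle_integral(f, centre, radius, n=2000):
    """Integral of f over the positively oriented circle
    |z - centre| = radius, using n equally spaced sample points."""
    total = 0j
    step = 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * k * step)
        total += f(centre + radius * e) * 1j * radius * e * step
    return total


def f(z):
    return cmath.exp(z) / z ** 3   # pole of order 3 at z0 = 0


# Residue as contour integral divided by 2*pi*i.
residue_contour = circle_integral(f, 0, 1.0) / (2j * math.pi)

# Residue from the derivative formula with n = 3:
# z**3 * f(z) = exp(z), and its second derivative at 0 equals 1.
residue_formula = 1.0 / math.factorial(2)
print(residue_contour, residue_formula)
```

Both approaches should give the value 1/2, the coefficient of $1/z$ in the Laurent expansion of $\exp(z)/z^3$.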


If $\gamma_0$ is a closed curve within the domain of definition $M$ of a meromorphic function $f$ and if $z_0$ is an isolated singularity of $f$ that does not lie on $\gamma_0$, then for every point $\zeta$ on $\gamma_0$ one can calculate by $\zeta - z_0 = |\zeta - z_0| \cdot (\cos\varphi + i\sin\varphi)$ (Fig.) the angle $\varphi$ which the direction from $z_0$ to $\zeta$ makes with the positive real axis. This angle is determined only up to integral multiples $2k\pi$ of $2\pi$, but $k$ can be chosen so that $\varphi$ changes continuously when $\zeta$ runs continuously through $\gamma_0$. When $\zeta$ after traversing $\gamma_0$ returns to the initial point, $\varphi$ has changed by $2n\pi$. The winding number $n = n(\gamma_0, z_0)$ is an integer and depends on the curve $\gamma_0$, the sense of direction, and the position of $z_0$ relative to it (Fig.). If $\gamma_0$ surrounds $z_0$, has no double point, and is traversed in the positive direction, then by the Laurent expansion $\int_{\gamma_0} f(\zeta)\,d\zeta = 2\pi i\,a_{-1}$. More generally, the following theorem can be derived.

23.1-7 Definition of the winding number $n(\gamma_0, z_0)$

23.1-8 Examples of winding numbers: $n(\gamma, z_0) = 1$, $n(\gamma, z_0) = -2$, $n(\gamma, z_0) = 0$, $n(\gamma, z_0) = 2$


Residue theorem. $\int_{\gamma_0} f(\zeta)\,d\zeta = 2\pi i \sum_{z_0} n(\gamma_0, z_0) \cdot \operatorname{Res}_{z_0} f$, provided that $\gamma_0$ is a closed curve in the simply-connected domain $M$ and $f$ is holomorphic in $M$ apart from isolated singularities $z_0$. The sum is taken over the $z_0$.


Example: The meromorphic function $f$ represented by $f(z) = 1/(z + 1) - 1/(z - 1)$ has a pole each at $z_1 = -1$ and $z_2 = +1$. In a neighbourhood of $z_1$, $-1/(z - 1)$ is a holomorphic function and can be expanded in a power series $P_1$, similarly $1/(z + 1)$ in a neighbourhood of $z_2$ in a power series $P_2$. From $f(z) = 1/(z + 1) + P_1 = -1/(z - 1) + P_2$ one obtains $\operatorname{Res}_{z=-1} f = +1$ and $\operatorname{Res}_{z=+1} f = -1$. If $\gamma_0$ is the circle with centre at $z = 2$ and radius 2 traversed in the negative direction (Fig.), then $n(\gamma_0, -1) = 0$ and $n(\gamma_0, +1) = -1$; hence the residue theorem yields

$\int_{\gamma_0} \left[\frac{1}{z + 1} - \frac{1}{z - 1}\right] dz = 2\pi i\,[0 \cdot 1 + (-1) \cdot (-1)] = 2\pi i.$

23.1-9 Application of the residue theorem
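The value obtained in this example can be confirmed by direct numerical integration over the clockwise circle. This is only a sketch under the assumption of a uniform sampling of the circle; `circle_integral` and its `direction` parameter are hypothetical helper names.

```python
import cmath
import math


def circle_integral(f, centre, radius, direction=+1, n=4000):
    """Integral of f over the circle |z - centre| = radius.
    direction = +1 is counter-clockwise (positive), -1 clockwise."""
    total = 0j
    step = direction * 2 * math.pi / n
    for k in range(n):
        e = cmath.exp(1j * k * step)
        total += f(centre + radius * e) * 1j * radius * e * step
    return total


def f(z):
    return 1 / (z + 1) - 1 / (z - 1)


integral = circle_integral(f, centre=2, radius=2, direction=-1)
# Sum of winding number times residue over both poles, times 2*pi*i.
expected = 2j * math.pi * (0 * 1 + (-1) * (-1))
print(integral, expected)
```

Only the pole at $+1$ lies inside the circle, and the negative orientation cancels the negative sign of its residue.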
Holomorphic functions of several complex variables. A function $f$ defined in an open set $D$ in the set $\mathbb{C}^n$ of all ordered $n$-tuples $(z_1, \ldots, z_n)$ of complex numbers is called holomorphic if it is holomorphic as a function of each single variable $z_j$ when the other $n - 1$ variables are held fixed. Such functions therefore satisfy the differential equations $\frac{\partial f}{\partial \bar z_1} = 0, \ldots, \frac{\partial f}{\partial \bar z_n} = 0$. The point set $\{(z_1, \ldots, z_n): |z_j - z_j^0| \le R_j,\ j = 1, \ldots, n\}$ is called a closed polycylinder, where $(z_1^0, \ldots, z_n^0)$ is a fixed chosen point in $\mathbb{C}^n$. If it lies entirely in $D$, then at all points in its interior $\{(z_1, \ldots, z_n): |z_j - z_j^0| < R_j,\ j = 1, \ldots, n\}$ the function $f$ can be represented by the generalized Cauchy integral formula, in which the 'determining surface' $S: \{(z_1, \ldots, z_n): |z_j - z_j^0| = R_j,\ j = 1, \ldots, n\}$ is a subset of the boundary of the polycylinder.


If D is a domain, that is, a connected open point set, then two functions holomorphic in D are 
equal at every point of D if their values coincide on the determining surface of a polycylinder situated 
in D. 

Locally the holomorphic functions can be represented as the limit functions of a power series

$\sum_{\nu_1, \ldots, \nu_n} a_{\nu_1 \ldots \nu_n} (z_1 - z_1^0)^{\nu_1} \cdots (z_n - z_n^0)^{\nu_n}.$


Further generalizations of holomorphic functions arise if instead of holomorphic functions of one or several complex variables one considers also complex-valued functions that are solutions of general complex partial differential equations, for example, Vekua's differential equation $\frac{\partial w}{\partial \bar z} = A(z)\,w + B(z)\,\bar w$. Here differentiations can also be interpreted in the sense of the theory of distributions.


23.2. Applications of complex analysis 
Evaluation of real integrals. Certain definite integrals $\int_{-\infty}^{+\infty} f(x)\,dx$ can be evaluated by 'contour integration'. First of all, the Cauchy principal value of such an integral is defined as $\lim_{R \to \infty} \int_{-R}^{+R} f(x)\,dx$, provided that the limit exists. One now imagines that the part of the real axis between $-R$ and $+R$ forms part of a closed curve $\gamma_R$ within the domain of definition of a meromorphic function $f(z)$ whose values, when $z$ is real, coincide with those of the given function $f(x)$, and that the curvilinear integral along the remaining part of $\gamma_R$ has the limit zero as $R \to \infty$. If $f$ has poles on the real axis, one of which at $x_0$, say, then the point $x_0$ can be excluded from, or included in, the interior of $\gamma_R$ by a semicircle of radius $\varrho$ (small) and the integral along the contour $\gamma_R$ must be evaluated for $\varrho \to 0$ (Fig.). For such poles with $\operatorname{Im} z_0 = 0$ the factor $\pi i$ therefore occurs in place of $2\pi i$, which holds for poles $z_0$ with positive imaginary part $\operatorname{Im} z_0 > 0$.

23.2-1 Path of contour integration


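Example 1: $\int_{-\infty}^{+\infty} \frac{1}{1 + x^2}\,dx = \pi$. If one sets in the formula above $p_1(z) = 1$, $p_2(z) = 1 + z^2 = (z + i)(z - i)$, one finds $\operatorname{Res}_{z=i} \frac{1}{1 + z^2} = \lim_{z \to i} \frac{z - i}{1 + z^2} = \lim_{z \to i} \frac{1}{z + i} = \frac{1}{2i}$, so that $2\pi i \cdot \frac{1}{2i} = \pi$, and so the result.

Example 2: $\int_{-\infty}^{+\infty} \frac{\sin x}{x}\,dx = \pi$. The residue theorem is applied to the meromorphic function represented by $(1/z)\exp iz$, which only at $z_0 = 0$ has a pole of the first order. Its residue at $z_0 = 0$ is $\operatorname{Res}_{z=0} \frac{\exp iz}{z} = \lim_{z \to 0} z\,\frac{\exp iz}{z} = \lim_{z \to 0} \exp iz = 1$. From $\int_{-\infty}^{+\infty} \frac{\cos x}{x}\,dx + i\int_{-\infty}^{+\infty} \frac{\sin x}{x}\,dx = \pi i \cdot 1$ one derives $\int_{-\infty}^{+\infty} \frac{\cos x}{x}\,dx = 0$ and the assertion made above.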
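The contour argument for Example 1 can be sketched numerically: for a finite radius the integral over the real segment plus the integral over the upper semicircle equals $2\pi i$ times the residue at $z = i$, and the arc contribution is already small. The radius $R = 50$ and the midpoint rules below are assumptions made for illustration.

```python
import cmath
import math


def f(z):
    return 1 / (1 + z * z)


def real_segment(R, n=100000):
    """Midpoint rule for the integral of f over [-R, R]."""
    h = 2 * R / n
    return sum(f(-R + (k + 0.5) * h) for k in range(n)) * h


def upper_arc(R, n=10000):
    """Midpoint rule for the integral of f over z = R*exp(i*t), 0 <= t <= pi."""
    h = math.pi / n
    total = 0j
    for k in range(n):
        e = cmath.exp(1j * (k + 0.5) * h)
        total += f(R * e) * 1j * R * e * h
    return total


R = 50.0
segment = real_segment(R)
arc = upper_arc(R)
residue_at_i = 1 / (2j)           # limit of (z - i) f(z) as z -> i
closed = segment + arc            # integral over the closed contour
print(segment, arc, 2j * math.pi * residue_at_i)
```

As $R$ grows, the arc term tends to zero and the segment integral tends to $\pi$.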
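Connections between complex analysis and partial differential equations. For a holomorphic function $f = u + iv$ one has, by definition, $\frac{\partial f(z)}{\partial \bar z} = \frac{1}{2}\left[\frac{\partial f(z)}{\partial x} + i\frac{\partial f(z)}{\partial y}\right] = 0$, that is, $\frac{\partial f(z)}{\partial x} = -i\frac{\partial f(z)}{\partial y}$, respectively $\frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} = -i\frac{\partial u}{\partial y} + \frac{\partial v}{\partial y}$. This leads to the Cauchy-Riemann differential equations $\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}$, $\frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$.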


Example: The function defined by $f(z) = z^2 = (x^2 - y^2) + 2ixy$ is holomorphic, therefore $u(x, y) = x^2 - y^2$ and $v(x, y) = 2xy$ is a solution of the Cauchy-Riemann differential equations.


Converse theorem. If the partial derivatives of the first order of the real-valued functions $u$ and $v$ exist, are continuous, and satisfy the Cauchy-Riemann differential equations, then $f = u + iv$ is holomorphic.


If one differentiates the first Cauchy-Riemann differential equation with respect to $x$ and the second with respect to $y$, one obtains the Laplace differential equation $\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$. Since a holomorphic function $f$ has derivatives of every order, the partial derivatives of arbitrarily high order of $u$ and of $v$ also exist.


The real part $u$ and the imaginary part $v$ of a holomorphic function $u + iv$ satisfy the Laplace differential equation $\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$, respectively, $\Delta v = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2} = 0$. Conversely, in a simply-connected domain $D$ for a function $u$ that is a solution in $D$ of the Laplace differential equation there is a function $v$, uniquely determined apart from an additive constant, which together with $u$ determines in $D$ the holomorphic function $f = u + iv$.


Example: The real part $u(x, y) = x^2 - y^2$ of the holomorphic function $f$ defined by $f(z) = z^2$ is a solution of Laplace's differential equation.
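For $f(z) = z^2$ both the Cauchy-Riemann equations (in the usual form $u_x = v_y$, $u_y = -v_x$) and the Laplace equation can be checked with finite differences. The sketch below assumes central differences with step $h = 10^{-3}$ and an arbitrarily chosen test point; all helper names are illustrative.

```python
def u(x, y):
    return x * x - y * y      # real part of z**2


def v(x, y):
    return 2 * x * y          # imaginary part of z**2


def d_dx(g, x, y, h=1e-3):
    """Central difference approximation of the partial derivative in x."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)


def d_dy(g, x, y, h=1e-3):
    """Central difference approximation of the partial derivative in y."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)


def laplacian(g, x, y, h=1e-3):
    """Five-point approximation of g_xx + g_yy."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h)
            - 4 * g(x, y)) / (h * h)


x0, y0 = 0.7, -1.3                               # arbitrary test point
cr_first = d_dx(u, x0, y0) - d_dy(v, x0, y0)     # u_x - v_y
cr_second = d_dy(u, x0, y0) + d_dx(v, x0, y0)    # u_y + v_x
lap_u = laplacian(u, x0, y0)
lap_v = laplacian(v, x0, y0)
print(cr_first, cr_second, lap_u, lap_v)
```

All four printed quantities should vanish up to rounding error.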


Conformal mappings. A holomorphic function $f$ defined in a domain $D$ by $w = f(z)$ assigns to every point $z$ of $D$ a point $w$ of the $w$-plane. If $\gamma$ is represented by $z = z(t)$, $a \le t \le b$, and if $\frac{dz(t)}{dt} = \varrho(t)\exp(i\beta(t))$, then the tangent to the curve at the point $z(a)$ makes the angle $\beta(a)$ with the positive real axis. If $f'(z(a)) = \delta\exp(i\alpha)$, then since

$\left.\frac{d f(z(t))}{dt}\right|_{t=a} = \left.\frac{df(z)}{dz}\right|_{z=z(a)} \cdot \left.\frac{dz(t)}{dt}\right|_{t=a} = \delta\varrho(a)\exp[i(\beta(a) + \alpha)],$

the tangent to the image curve at the point $f(z(a))$ makes the angle $\beta(a) + \alpha$ with the positive real axis, that is, all angles are rotated by $\alpha$. Consequently the angle between two curves remains unchanged (Fig.). Hence the mapping induced by $f$ is called angle-preserving or conformal, more accurately, directly conformal, because the sense of rotation is also preserved. If $r(t)$ is the distance between the points $z(t)$ and $z_0$, and $\tilde r(t)$ that between $f(z(t))$ and $f(z_0)$, then $\lim_{t \to 0} \frac{\tilde r(t)}{r(t)} = |f'(z_0)|$ if $f'(z_0) \neq 0$. This means that in the limit $t \to 0$ distances are multiplied by the factor $|f'(z_0)|$ (Fig.).


23.2-2 The mapping by a holomorphic function $w = f(z)$ is conformal

23.2-3 Distance ratios for conformal mappings

23.2-4 Similarity transformation of a triangle by means of $w = (1 + i)z + (1 - i)$

23.2-5 Mapping of the first quadrant onto the upper half-plane by $w = z^2$


Example 1: The integral linear function $f$ defined by $w = f(z) = az + b$ with complex constants $a \neq 0$ and $b$ maps every figure of the $z$-plane, for example, a triangle, to a similar figure in the $w$-plane (Fig.).

Example 2: Since $z = r\exp(i\varphi)$, one has $w = z^2 = r^2\exp(2i\varphi)$; consequently the function $f$ defined by $w = z^2$ maps the first quadrant ($\operatorname{Re} z > 0$, $\operatorname{Im} z > 0$) of the $z$-plane to the upper half-plane ($\operatorname{Im} w > 0$) of the $w$-plane (Fig.).


Example 3: If $w = f(z) = \exp(ic)\,\frac{z - z_0}{1 - \bar z_0 z}$, where $c$ is a real constant and $z_0$ is fixed with $|z_0| < 1$, then

$|w|^2 = |\exp(ic)|^2\,\frac{(z - z_0)(\bar z - \bar z_0)}{(1 - \bar z_0 z)(1 - z_0 \bar z)} = 1 \cdot \frac{z\bar z + z_0\bar z_0 - \bar z_0 z - z_0\bar z}{1 + z_0\bar z_0 z\bar z - \bar z_0 z - z_0\bar z},$

because the absolute value $|\zeta|$ of a complex number $\zeta$ satisfies $|\zeta|^2 = \zeta\bar\zeta$. Hence $|w| = 1$ if and only if $z\bar z + z_0\bar z_0 - \bar z_0 z - z_0\bar z = 1 + z_0\bar z_0 z\bar z - \bar z_0 z - z_0\bar z$, that is, $|z|^2(1 - |z_0|^2) = 1 - |z_0|^2$, or $|z| = 1$. Since $|f(z_0)| = 0$, continuity arguments show that $|f(z)| < 1$ for all $z$ with $|z| < 1$. Conversely, since $z = \exp(-ic)\,\frac{w + \exp(ic)\,z_0}{1 + \exp(-ic)\,\bar z_0 w}$, every $w$ with $|w| < 1$ is the image of precisely one $z$ with $|z| < 1$. Altogether this means that $f$ maps the open unit disc $|z| < 1$ one-to-one onto itself. On account of $\frac{df(z)}{dz} \neq 0$ the mapping is conformal. If $z_0 = 0$, then $f(z) = \exp(ic)\,z$ is a rotation around $z = 0$ by the angle $c$ measured in radians (Fig.).
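The properties derived in Example 3 can be spot-checked numerically. In the following sketch the parameters $z_0 = 0.3 + 0.4i$ and $c = 0.8$ and the sample points are arbitrary assumptions; `disc_map` and `disc_map_inverse` are illustrative names for the mapping and the inverse formula given above.

```python
import cmath
import math


def disc_map(z, z0, c):
    """w = exp(ic) (z - z0) / (1 - conj(z0) z)."""
    return cmath.exp(1j * c) * (z - z0) / (1 - z0.conjugate() * z)


def disc_map_inverse(w, z0, c):
    """z = exp(-ic) (w + exp(ic) z0) / (1 + exp(-ic) conj(z0) w)."""
    return (cmath.exp(-1j * c) * (w + cmath.exp(1j * c) * z0)
            / (1 + cmath.exp(-1j * c) * z0.conjugate() * w))


z0, c = 0.3 + 0.4j, 0.8
boundary = [cmath.exp(2j * math.pi * k / 12) for k in range(12)]
interior = [0.9 * cmath.exp(2j * math.pi * k / 12) for k in range(12)] + [0j, z0]

boundary_moduli = [abs(disc_map(z, z0, c)) for z in boundary]
interior_moduli = [abs(disc_map(z, z0, c)) for z in interior]
round_trip = [abs(disc_map_inverse(disc_map(z, z0, c), z0, c) - z)
              for z in interior]
print(max(boundary_moduli), max(interior_moduli), max(round_trip))
```

Boundary points keep modulus 1, interior points stay inside the disc, and the inverse formula returns the original points.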




23.2-6 Rotation about $z = 0$

23.2-7 Mapping of the upper half-plane onto the unit circle by $w = (z - i)/(z + i)$


Example 4: The holomorphic function defined by $w = f(z) = \frac{z - i}{z + i}$ in the upper half-plane $\operatorname{Im} z > 0$ maps the latter conformally to the open disc $|w| < 1$. Since $|z - i| < |z + i|$ for all $z$ with $\operatorname{Im} z > 0$, it is true that $|w| < 1$. The images of real $z$ are points on the unit circle because $|z - i| = |z + i|$. The images of the points $0, 1, \infty, -1$ of the $z$-plane are the points $-1, -i, 1, i$ of the $w$-plane (Fig.).
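A short numerical check of Example 4 follows; the sample points in the upper half-plane and on the real axis are arbitrary assumptions, and the point at infinity is only approximated by a large real number.

```python
def cayley(z):
    """w = (z - i) / (z + i)."""
    return (z - 1j) / (z + 1j)


# Images of the points 0, 1, -1 named in the example.
images = {z: cayley(z) for z in (0, 1, -1)}

# Sample points with Im z > 0 and sample points on the real axis.
upper = [complex(x, y) for x in (-3, -1, 0, 2) for y in (0.1, 1, 5)]
real_axis = [-10.0, -1.0, 0.0, 0.5, 7.0]

upper_moduli = [abs(cayley(z)) for z in upper]
real_moduli = [abs(cayley(x)) for x in real_axis]
far_point = cayley(1e9)      # stands in for the point at infinity
print(images, far_point)
```

The images of $0$, $1$, $-1$ come out as $-1$, $-i$, $i$, and very large $z$ are mapped close to $1$.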


Riemann mapping theorem. Let $D$ be a simply-connected proper part of the complex $z$-plane. Given any point $z_0$ of $D$ and any direction at $z_0$, there is one and only one conformal mapping of $D$ onto the unit disc $|w| < 1$ by a holomorphic function $w = f(z)$ with non-vanishing derivative such that $z_0$ goes over into the centre $w = 0$ and the given direction at $z_0$ into that of the positive real axis (Fig.).


23.2-8 Riemann’s mapping theorem

23.2-9 Stream lines $U(x, y) = \text{const}$

23.2-10 Flow in a rectangular bend


Problems of stream flow. A stationary flow, that is, one independent of the time, in a domain 
of the x, y-plane can be characterized by the velocity vector [u(x, y), v(x, y)] of a particle following 


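the flow. In a flow free of sources and vortices one has $\frac{\partial u}{\partial y} = \frac{\partial v}{\partial x}$ and $\frac{\partial u}{\partial x} = -\frac{\partial v}{\partial y}$, so that the components $u$ and $v$ form a holomorphic function $f = u - iv$. In a simply-connected domain there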
always exists a primitive F = U + iV, which is then holomorphic. With U(x, y) = const it describes 
the streamlines along which the particles move (Fig.). The velocity vectors are tangents to the 
streamlines. Since conformal mappings carry holomorphic functions into holomorphic functions, 
they are a suitable means of describing the course of streamlines.


Example 1: In the upper $w$-half-plane with $\operatorname{Im} w > 0$ the streamlines are $\operatorname{Im} w = \text{const}$. If $w = z^2 = x^2 - y^2 + 2ixy$, this half-plane is the biunique image of the first quadrant in the $z$-plane (see Example 2 on conformal mappings). This means that the hyperbolae $\operatorname{Im} w = 2xy = \text{const}$ are also streamlines, namely those of the flow in a rectangular bend (Fig.).

Example 2: The function $w = (1/2)(z + 1/z)$ maps the circle $|z| = 1$ to the segment from $+1$ to $-1$ in the $w$-plane, traversed twice, so that the $z$-points $+1, i, -1, -i$ have the images $+1, 0, -1, 0$. With the exception of this segment the $w$-plane is the image of the exterior $|z| > 1$ of the unit circle. In it the parallels $\operatorname{Im} w = \text{const}$ are streamlines. Their originals give the stream around the unit circle. Since $z = r(\cos\varphi + i\sin\varphi)$, $1/z = (1/r)(\cos\varphi - i\sin\varphi)$, and $w = (1/2)(r + 1/r)\cos\varphi + (i/2)(r - 1/r)\sin\varphi$, it can be seen that $(r - 1/r)\sin\varphi = c = \text{const}$ are the equations of these streamlines (Fig.).
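The streamline equation can be illustrated by computing a few points of one streamline and mapping them to the $w$-plane. The sketch assumes the value $c = 3.0$ and a handful of arbitrary angles; `streamline_radius` is a hypothetical helper that solves $(r - 1/r)\sin\varphi = c$ for $r$.

```python
import cmath
import math


def joukowski(z):
    """w = (1/2)(z + 1/z)."""
    return 0.5 * (z + 1 / z)


def streamline_radius(c, phi):
    """Radius r > 0 solving (r - 1/r) * sin(phi) = c, i.e. the positive
    root of r**2 - k*r - 1 = 0 with k = c / sin(phi)."""
    k = c / math.sin(phi)
    return (k + math.sqrt(k * k + 4)) / 2


# Points of the unit circle are mapped into the real segment [-1, 1].
circle_images = [joukowski(cmath.exp(2j * math.pi * k / 16)) for k in range(16)]

# Points of one streamline (c = 3.0) all have Im w = c / 2.
c = 3.0
angles = [0.3, 0.8, 1.5, 2.2, 2.9]
streamline_images = [joukowski(cmath.rect(streamline_radius(c, phi), phi))
                     for phi in angles]
for w in streamline_images:
    print(round(w.real, 4), round(w.imag, 4))
```

Each printed point has the same imaginary part $c/2$, so the curve in the $z$-plane is carried to a parallel of the real axis.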


The flow around other contours is obtained by mapping the exterior of the contour conformally to the exterior of a circle.

23.2-11 Conformal mapping of the streamlines $(r - 1/r)\sin\varphi = \text{const}$ in the $z$-plane onto the streamlines of the parallel flow $\operatorname{Im} w = \text{const}$ of the $w$-plane (curves labelled $c_1 = 3.0$, that is, $\operatorname{Im} w = c_1/2$, and $c_2 = -1.0$)




Example 3: If $h > 0$ (in the figure $h = 2.75$), then $w = (1/2)(z + h^2/z)$ maps the circle $K_h$ of the $z$-plane with the radius $r = h$ and the centre $z = 0$ to the segment lying between $-h$ and $+h$ on the real axis of the $w$-plane (Fig.); for from $z = h(\cos\varphi + i\sin\varphi)$ it follows that $w = h\cos\varphi$. If $\zeta = h^2/z$, then $w(\zeta) = (1/2)(h^2/z + h^2 \cdot z/h^2) = w(z)$, that is, $z$ and $\zeta$ have the same images in the $w$-plane. The circle $K_\varrho$ with the centre $M(-1, 1)$ passing through $z = h$ also passes through $z = -hi$ and has the radius $\varrho = \sqrt{(h + 1)^2 + 1}$. By $\zeta = h^2/z$ the circle $K_\varrho$ goes over into another circle $K_\eta$ with the centre $H(h/(h + 2), h/(h + 2))$ and the radius $\eta = h\varrho/(h + 2)$. The circles $K_\varrho$ and $K_\eta$ are mapped to the so-called Zhukovskii profile. Here the point set lying between $K_\varrho$ and $K_\eta$ goes over into the point set bounded by the Zhukovskii profile; in general, pairs of points lying between $K_\varrho$ and $K_\eta$ go over into one and the same point. Incidentally, the sickle-shaped point set lying in the first quadrant between $K_h$ and $K_\eta$, and also that lying in the fourth quadrant between $K_h$ and $K_\varrho$, goes over into the point set between the real axis of the $w$-plane and the part $G' - J' - A'$ of the Zhukovskii profile. The images of circles around $(-1, +1)$ with increasing radii $\varrho = 5, 6, \ldots$ become more and more circular for sufficiently large $|z|$, because $|h^2/z| < \varepsilon$.
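The geometric statements of Example 3 can be verified numerically for the value $h = 2.75$ used in the figure. The sketch below samples 24 points of $K_\varrho$ (an arbitrary choice) and checks that their images under $\zeta = h^2/z$ lie on the circle with the stated centre and radius, and that $z$ and $h^2/z$ have the same image under $w$.

```python
import cmath
import math

h = 2.75
m = complex(-1, 1)                       # centre M of the circle K_rho
rho = math.sqrt((h + 1) ** 2 + 1)        # radius of K_rho
centre_eta = complex(h / (h + 2), h / (h + 2))   # stated centre H of K_eta
eta = h * rho / (h + 2)                  # stated radius of K_eta


def w(z):
    """w = (1/2)(z + h^2 / z)."""
    return 0.5 * (z + h * h / z)


k_rho = [m + rho * cmath.exp(2j * math.pi * k / 24) for k in range(24)]

# Distance of each inverted point h^2/z from the stated centre H.
inverted_distances = [abs(h * h / z - centre_eta) for z in k_rho]

# z and h^2/z have the same image under w.
image_gaps = [abs(w(z) - w(h * h / z)) for z in k_rho]

print(eta, min(inverted_distances), max(inverted_distances), max(image_gaps))
```

All inverted points have the same distance $\eta$ from $H$, which confirms the centre and radius of $K_\eta$ for this value of $h$.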


In general, such mappings are defined by $a_0/z + a_1 z + a_2 z^2 + \cdots$.


23.2-12 Conformal mapping of the circles $K_\varrho$ and $K_\eta$ to the Zhukovskii profile by $w = (1/2)(z + h^2/z)$


23.3. The total course of complex-valued functions 


The Riemann number sphere. The holomorphic function defined for $z \neq 0$ by $\zeta = 1/z$ maps the exterior $|z| > R$ of the circle conformally to the punctured disc $0 < |\zeta| < 1/R$, which does not contain the point $\zeta = 0$ (Fig.). But the images $\zeta = (1/r)\exp(-i\varphi)$ of the points $z = r\exp(i\varphi)$ approach it arbitrarily closely as $r \to \infty$. One therefore interprets $\zeta = 0$ as the image of the point $z = \infty$ of the $z$-plane. An intuitive idea of the point at infinity of the $z$-plane can be given by means of Riemann's number sphere. The sphere touches at its point $S$ the $z$-plane at $z = 0$. The line joining the point $N$ diametrically opposite to $S$ to a point $z$ pierces the surface of the sphere at the image point $P$ of $z$. The mapping $z \to P$ is one-to-one, and the point $N$ is regarded as the image of the point at infinity in the $z$-plane (Fig.).


23.3-1 Mapping of the exterior of the circle with radius $R$ onto the interior of the circle with radius $1/R$ by $\zeta = 1/z$


23.3-2 The Riemann number sphere; $r = 2R\sin\vartheta/(1 - \cos\vartheta)$




The rays on which a point $z = r\exp(i\varphi)$ and its image $\zeta = (1/r)\exp(-i\varphi)$ lie go over into one another by the reflection $z \to \bar z$. This is an indirect conformal mapping under which the sense of rotation of the argument $\varphi$ is reversed. The transformation by reciprocal radii $\bar\zeta = 1/\bar z = (1/r)\exp(i\varphi)$ is therefore indirectly conformal. Original and image point lie on the same ray. The quantity $1/r$ can easily be constructed elementarily (Fig.).


23.3-3 Mapping $\zeta = 1/z$ by reciprocal radii; $|z| = r$ implies that $|\zeta| = 1/r$ and $|z| \cdot |\zeta| = 1$

23.3-4 Analytic continuation


Riemann surfaces. If the power series $P_{z_0} = \sum_{j=0}^{\infty} a_j (z - z_0)^j$ converges in the disc $K_{z_0}: |z - z_0| < R_0$, then it defines in it a holomorphic function $f$. If $z_1$ lies in $K_{z_0}$, then in accordance with $(z - z_0) = [(z - z_1) + (z_1 - z_0)]$ the power series $P_{z_0}$ can be rearranged into a power series $P_{z_1}$, which also converges and represents the same function in the intersection $K_{z_0} \cap K_{z_1}$. If $P_{z_1}$ converges not only in $K_{z_0}$, but also at points $z$ in a disc $K_{z_1}$ lying partly outside $K_{z_0}$, then $f$ has been continued analytically by $P_{z_1}$ (Fig.). By analytic continuation in every possible way one obtains the complete analytic function generated by the power series $P_{z_0}$. It can happen that this assigns to every point of the $z$-plane exactly one function value $f(z)$; but there can also be points $z$ for which one obtains different function elements according to the path on which they are approached. To avoid this ambiguity one imagines that every function element is defined in an individual copy of the plane, a sheet, so that over such places $z$ the complete analytic function is defined uniquely in a corresponding number of sheets. This covering surface of the plane or sphere is called a Riemann surface $R$.

For example, the function defined by $w = \sqrt{z} = \sqrt{r}\exp(i\varphi/2)$ can be continued analytically from the positive real axis in the positive sense of increasing $\varphi$ or the negative sense of decreasing $\varphi$. At the negative real axis one obtains, according to the sense of rotation, the values $w^+(\pi) = \sqrt{r}\exp(i\pi/2)$ and $w^-(-\pi) = \sqrt{r}\exp(-i\pi/2)$, respectively (Fig.). After a further complete circuit these values interchange, because $w^+(3\pi) = \sqrt{r}\exp(3i\pi/2) = \sqrt{r}\exp(-i\pi/2)$ and $w^-(-3\pi) = \sqrt{r}\exp(-3i\pi/2) = \sqrt{r}\exp(i\pi/2)$, respectively. The Riemann surface of this function therefore has two sheets (Fig.) which are each cut along the negative real axis and are then pasted together crosswise so that the upper boundary of the cut in each sheet is connected with the lower boundary of the other sheet. Then the function values on the Riemann surface go continuously into one another. In the branch points the two sheets hang together; from the $z$-sphere one sees that for $w = \sqrt{z}$ both $z = 0$ and $z = \infty$ are branch points.
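The interchange of the two values can be made visible by following $\sqrt{z}$ continuously along a circle around the branch point $z = 0$. The sketch below is one possible way to do this, assuming small angular steps and choosing at each step the root closest to the previous value; `continue_sqrt` is an illustrative name.

```python
import cmath
import math


def continue_sqrt(turns, r=1.0, steps_per_turn=360):
    """Follow w = sqrt(z) continuously along the circle |z| = r, starting
    at z = r with w = +sqrt(r) and running through the angle 2*pi*turns
    (negative turns mean the negative sense of rotation).
    At each step the root closest to the previous value is chosen."""
    w = math.sqrt(r) + 0j
    n = int(round(abs(turns) * steps_per_turn))
    for k in range(1, n + 1):
        phi = 2 * math.pi * turns * k / n
        candidate = cmath.sqrt(cmath.rect(r, phi))   # principal root
        if abs(candidate - w) <= abs(candidate + w):
            w = candidate
        else:
            w = -candidate
    return w


w_plus = continue_sqrt(0.5)           # reaches the negative real axis with phi = +pi
w_minus = continue_sqrt(-0.5)         # reaches it with phi = -pi
w_plus_later = continue_sqrt(1.5)     # one further complete circuit, phi = 3*pi
after_one_turn = continue_sqrt(1.0)   # back at z = r on the other sheet
after_two_turns = continue_sqrt(2.0)  # back on the starting sheet
print(w_plus, w_minus, w_plus_later, after_one_turn, after_two_turns)
```

After one full circuit the value has changed sign, and only after two circuits does it return to the starting value, which corresponds to the two sheets of the surface.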


23.3-5 The values $w^+(\pi)$ and $w^-(-\pi)$ of the function $w = \sqrt{z}$ at the negative real axis

23.3-6 Riemann surface of $w = \sqrt{z}$


Uniformization. A Riemann surface $R$ is a covering surface of the $z$-plane or the $z$-sphere. For $R$ a further covering can be constructed, the universal covering surface. If one starts out from a definite point $P_0$ on $R$, all the points $P$ of the universal covering surface are to be the end-points of all possible curves beginning at $P_0$; but two curves $\gamma_1$ and $\gamma_2$ are to lead to the same point $P$ only when they can be carried continuously into one another within $R$; in other words, if there is a curve going from $P_0$ to $P$ that cannot be carried continuously into $\gamma_1$ or $\gamma_2$, then it defines another point of the universal covering surface. The example of the annulus shows that such curves $\gamma$ can occur (Fig.).

23.3-7 Two curves $\gamma_1$ and $\gamma_2$ define the same point of the universal covering surface if and only if they can be carried continuously into one another on the Riemann surface $R$




It can then be shown that the universal covering surface is simply-connected and that the following 
generalization of Riemann’s mapping theorem holds: 


Generalized Riemann mapping theorem. The universal covering surface of every Riemann surface 
can be mapped one-to-one and conformally onto the interior of the unit disc, respectively, onto the 
whole complex plane or the Riemann sphere. 

Since a Riemann surface is part of its universal covering surface, every Riemann surface can be 
put into one-to-one correspondence with a subset of the Riemann number sphere. This is called 
the possibility of uniformizing a Riemann surface; an example is the uniformization of the Riemann 
surface of the integrand of an elliptic integral (see Elliptic integrals). 


Distribution of values. For a holomorphic function $f$ statements can be made on the frequency with which certain values are assumed. A point $z_0$ is called a $k$-fold $w_0$-place of $f$ if $f(z_0) = w_0$ and $f'(z_0) = 0, \ldots, f^{(k-1)}(z_0) = 0$, but $f^{(k)}(z_0) \neq 0$. The following theorem holds:


When multiplicity is taken into account, a polynomial $p(z) = a_0 + a_1 z + \cdots + a_n z^n$ with $a_n \neq 0$ assumes every complex value $w_0$ at exactly $n$ places (Fundamental theorem of algebra).


For example, the polynomial $p(z) = z^2$ assumes the value $w_0 = +1$ at $z = 1$ and $z = -1$; the value $w_0 = 0$ is assumed at $z = 0$ only, but with multiplicity 2, because $p'(z) = 2z$ and $p''(z) = 2$, so that $p'(0) = 0$ and $p''(0) \neq 0$. A stronger statement than that of the theorem of Casorati-Weierstrass is the theorem of Picard.


Theorem of Picard: A complex-valued function having an essential singularity at zo assumes in 
every neighbourhood of zo every complex value with at most one exception. 


23.4. Elliptic integrals 


The Weierstrass $\wp$-function. A function $f$ defined for all $z$ is said to be periodic of period $\omega$ if $f(z + \omega) = f(z)$ for all $z$. A function $f$ is said to be doubly-periodic if there are two complex numbers $\omega_1, \omega_2$, for which $\omega_2/\omega_1$ is not real, such that all numbers $\omega = k_1\omega_1 + k_2\omega_2$ with arbitrary integers $k_1, k_2$ determine the set of all periods, that is, the lattice of periods with $f(z + \omega) = f(z)$ for all $z$.

Let $a$ be any complex number. Then the numbers $a$, $a + \omega_1$, $a + \omega_1 + \omega_2$, $a + \omega_2$ are the vertices of a so-called period parallelogram. A doubly-periodic function $f$ assumes all its values within any period parallelogram. An elliptic function is a doubly-periodic meromorphic function. An elliptic function that is not a constant must have poles. But the sum of the residues inside any period parallelogram is always zero. For if the integral is taken around the boundary of such a parallelogram, then

$\oint f(z)\,dz = \int_a^{a+\omega_1} f(z)\,dz + \int_{a+\omega_1}^{a+\omega_1+\omega_2} f(z)\,dz + \int_{a+\omega_1+\omega_2}^{a+\omega_2} f(z)\,dz + \int_{a+\omega_2}^{a} f(z)\,dz = 0;$

for the substitution of $z + \omega_2$ for $z$ in the third integral shows that its value is the negative of that of the first, and similarly for the other pair. Here it is assumed that there are no poles on the boundary, but this can always be achieved by a suitable choice of $a$ (Fig.). Therefore there are no elliptic functions with only one pole of the first order in the period parallelogram. The simplest elliptic function is the Weierstrass $\wp$-function

$\wp(z) = \frac{1}{z^2} + {\sum_{k_1, k_2}}' \left[\frac{1}{(z - k_1\omega_1 - k_2\omega_2)^2} - \frac{1}{(k_1\omega_1 + k_2\omega_2)^2}\right];$


23.4-1 Period lattice and period parallelogram with the vertex a 


here the dash at the summation symbol indicates that the term with $k_1 = 0$, $k_2 = 0$ is to be omitted from the summation. It has a pole of order 2 with the residue 0 at every lattice point. Its derivative $\wp'(z)$ is also elliptic and has zeros of order 1 at $\omega_1/2$, $\omega_2/2$, and $(\omega_1 + \omega_2)/2$. In a neighbourhood of $z = 0$ the Laurent expansions of the $\wp$-function and its derivatives can be given in terms of the principal part and a holomorphic function $h$:

$\wp(z) = 1/z^2 + h(z), \quad \wp'(z) = -2/z^3 + h'(z), \quad \wp''(z) = 6/z^4 + h''(z).$


The expansion in a series leads to the differential equation $\wp'^2 = 4\wp^3 - g_2\wp - g_3$, in which the following abbreviations are used:

$g_2 = 60 {\sum_{k_1, k_2}}' \frac{1}{(k_1\omega_1 + k_2\omega_2)^4}$ and $g_3 = 140 {\sum_{k_1, k_2}}' \frac{1}{(k_1\omega_1 + k_2\omega_2)^6}.$




The inversion problem for the $\wp$-function requires, for given values $g_2$ and $g_3$ with $g_2^3 - 27g_3^2 \neq 0$, to find a period lattice such that for the associated $\wp$-function the quantities $g_2$ and $g_3$ assume the prescribed values.


If at $N$ places $z_j$, $j = 1, \ldots, N$, inside a period parallelogram the principal parts of the Laurent expansions are prescribed so that the residues satisfy $\sum_{j=1}^{N} a_{-1}^{(j)} = 0$, then apart from an additive constant there exists one and only one elliptic function $f$ having these principal parts. Apart from this constant, it is a linear combination of the $\wp$-function, its derivatives, and the primitive $\zeta$ of $-\wp$. But this function $\zeta$ is not elliptic, because it has poles of order 1 at the lattice points and is otherwise holomorphic.


The Weierstrass normal form of an elliptic integral. In an elliptic integral the integrand $\operatorname{rat}(z, w)$ is a rational function of $z$ and $w$; here $w$ is the square root of a polynomial $p_4(z)$ of degree 4 or $p_3(z)$ of degree 3 in $z$, and the 4 or 3 zeros of these polynomials are simple, that is, distinct. In the $z$-plane the integrand is determined to within a factor $\pm 1$, and only on the two-sheeted Riemann surface of $w = \sqrt{p(z)}$ is it uniquely determined. Its four branch points are the four zeros $e_1, e_2, e_3, e_4$ of $p_4(z)$, or the three zeros of $p_3(z)$ together with the point $z = \infty$. The path of integration $\gamma$ is situated on this Riemann surface. By the substitution $z' = 1/(z - e_4)$ one reduces $p_4(z)$ to $p_3(z)$. By a translation one can achieve that the centre of gravity of the triangle formed from the remaining 3 zeros $e_1, e_2, e_3$ lies at $z = 0$ (Fig.). Then $e_1 + e_2 + e_3 = 0$, and by Vieta's root theorem, apart from a constant factor, $p(z) = 4z^3 + c_1 z + c_2$. This leads to the Weierstrass normal form of the elliptic integral. By solving the inversion problem of the $\wp$-function for the values $-c_1$ and $-c_2$, two periods $\omega_1$ and $\omega_2$ can be found in a $Z$-plane such that the relevant period lattice determines a $\wp(Z)$-function for which $g_2 = -c_1$ and $g_3 = -c_2$. By $z = \wp(Z)$ every period parallelogram of the $Z$-plane is mapped one to one onto a sheet, and the whole $Z$-plane onto the universal covering surface of $w = \sqrt{p(z)}$. According to the differential equation for the $\wp$-function and since $z = \wp(Z)$ it follows that $\wp'^2 = 4\wp^3 - g_2\wp - g_3 = w^2$, that is, $\wp'(Z) = w$. If $\tilde\gamma$ is the inverse image in the $Z$-plane of the curve $\gamma$ in the $z$-plane, then the elliptic integral in the $Z$-plane is:

$\int_\gamma \operatorname{rat}(z, w)\,dz = \int_{\tilde\gamma} \operatorname{rat}(\wp, \wp')\,\wp'(Z)\,dZ.$

23.4-2 Zeros $e_1, e_2, e_3$ with $e_1 + e_2 + e_3 = 0$ of $p(z) = 4z^3 + c_1 z + c_2$


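The integral is said to be of the a) first, b) second, or c) third kind according as the integrand in the $Z$-plane is an elliptic function a) without poles, b) having poles but such that their residues are all zero, c) otherwise. In case a) the integrand is therefore a constant in the $Z$-plane, because an elliptic function that is not a constant must have poles, so that $\operatorname{rat}(\wp, \wp') = \text{const}/\wp'$ and $\operatorname{rat}(z, w) = \text{const}/w$. Since $w^2 = 4z^3 - g_2 z - g_3$, one obtains for the elliptic integrals of the first kind the form $\text{const} \cdot \int_\gamma \frac{dz}{\sqrt{4z^3 - g_2 z - g_3}}$.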


The Legendre form of an elliptic integral arises when $w^2 = (1 - z^2)(1 - k^2 z^2)$ instead of $w^2 = 4z^3 - g_2 z - g_3$; $k$ is called the modulus.


24. Analytic geometry of space 


24.1. Coordinate systems
  Rectangular coordinates
  Oblique coordinates
  Homogeneous coordinates
  Spherical coordinates
  Cylindrical coordinates
  Transformations of coordinates
24.2. Linear spaces
24.3. Quadrics
  Principal axes
  Proper quadrics


The essence of analytic geometry of space consists in setting up a correspondence between the 
points of the space and real numbers. Curves (1-dimensional manifolds) and surfaces (2-dimensional 
manifolds) then correspond to solution sets of equations, and geometrical constructions can be 
replaced by algebraic and analytic methods. Since these methods form the basis of analytic geo- 
metry, the subject did not arise until progress was made in algebra and analysis. 


24.1. Coordinate systems 


Rectangular coordinates 


Setting up a system. Coordinate systems are the ‘middlemen’ between points and numbers. 
To set up a rectangular or Cartesian coordinate system in space, the first thing to do is to choose 
a point of space as the origin. Through this point three mutually perpendicular lines are drawn. 
These are called the coordinate axes, usually the x-, y-, and z-axis. The three coordinate axes span 
the three coordinate planes in space, the x,y-, x,z-, and y,z-plane. Any two axes divide the coordinate
plane spanned by them into four quadrants, and the three coordinate planes divide space into eight
octants.


Orientation. On each coordinate axis a unit vector is fixed: on the x-axis the vector i, on the 
y-axis the vector j, on the z-axis the vector k. The coordinate axes are directed by these vectors, 
and the coordinate system is oriented (Fig.). 


24.1-1 Rectangular coordinate system, 
right-oriented 


24.1-2 Orientation is reversed in a mirror 


That part of each coordinate axis which contains the end-point of the unit vector from the origin 
on that axis is called the positive axis, the other one the negative. Any two positive axes bound a prin- 
cipal quadrant, and the three principal quadrants bound the principal octant. 

One can always point the thumb, index finger, and middle finger of one hand in the directions 
of i (thumb), j (index finger), and k (middle finger). If this is done on the right hand, the system is
called right-oriented or a right-system, otherwise a left-system. By reversal of one axis or by reflection 
in a plane a right-system goes into a left-system and vice versa (Fig.). 


Points in space. If a rectangular coordinate system is chosen, then to every point of space there 
corresponds uniquely a triple of real numbers, and conversely, to every triple of real numbers there 
corresponds a unique point of space. The three numbers corresponding to a point of space are called
the rectangular or Cartesian coordinates of the point. To determine the rectangular coordinates of a
given point P one drops perpendiculars from P to each of the coordinate axes and measures the
oriented lengths of the projections in units corresponding to the lengths of the fixed unit vectors.
The values so obtained are the coordinates of P. To determine a point P from given coordinates
x, y, z, one uses the vector notation. Starting from the origin, the vector
$\mathbf{x} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ leads directly to the required point P (Fig.).
Its distance from the origin can be calculated by means of the theorem of Pythagoras. It is
$|\mathbf{x}| = \sqrt{x^2 + y^2 + z^2}$.


Example: The rectangular coordinates x = 3, y = 4, z = 12
of a point P are given. This is written briefly as P(3, 4, 12).
The distance of this point from the origin is
$|\mathbf{x}| = \sqrt{3^2 + 4^2 + 12^2} = \sqrt{9 + 16 + 144} = \sqrt{169} = 13$.
The distance of P from the origin is 13 units.

(Fig., part b: the vector $\mathbf{x} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ ends at P)


Oblique coordinates 


An oblique coordinate system is a generalization of a rectangular one. To set it up one takes any 
three lines through the origin that do not lie in one plane and prescribes a vector on each of these 
lines. Then all the results for a rectangular coordinate system hold word for word, with the follow- 
ing exceptions: 

1. The numbers corresponding to a point of space are no longer called rectangular coordinates, 
but more generally parallel coordinates. 

2. To determine the parallel coordinates of a given point P, one draws a parallelepiped whose 
edges are parallel to the coordinate axes and which has the origin and P as vertices. The oriented 
measures of the lengths of the edges lying along the coordinate axes are the parallel coordinates 
of P. 

3. The vector $\mathbf{x} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, starting from the origin, leads to the point P, as in a rectangular
coordinate system, but as a rule $|\mathbf{x}| \neq \sqrt{x^2 + y^2 + z^2}$, since the theorem of Pythagoras does not
hold in a general triangle.


Homogeneous coordinates 


In projective geometry it is required that two lines in a plane always have a point of intersection, 
and also in space, if the lines are not skew. Hence an improper point, or point at infinity, is introduced 
as the ‘point of intersection’ of parallel lines. Analytic geometry, in the form established up to now, 
cannot cope with points at infinity. One possibility of satisfying the requirements of projective 
geometry and also of analytic geometry consists in the introduction of homogeneous coordinates. 

If x', y', z' are the parallel coordinates of a point P in space, then numbers x, y, z, t determined
by the equations $x' = x/t$, $y' = y/t$, $z' = z/t$ are called homogeneous coordinates of P. This quadruple
is not uniquely determined. If x, y, z, t are homogeneous coordinates of a point of space, then for
any real number $\varrho \neq 0$, $\varrho x, \varrho y, \varrho z, \varrho t$ are homogeneous coordinates of the same point. Conversely,
to any homogeneous coordinates x, y, z, t with $t \neq 0$ there corresponds a unique triple of parallel
coordinates. The reverse transformation results by simply going over to x/t, y/t, z/t.

Example. The point $P(2, 3, -1)$ has homogeneous coordinates $x = 2s$, $y = 3s$, $z = -s$, $t = s$
with an arbitrary $s \neq 0$.


Some problems are soluble in homogeneous coordinates, but not in parallel coordinates. For
example, if $y' = ax' + b_1$ and $y' = ax' + b_2$ with $b_1 \neq b_2$ are two parallel lines in the x', y'-plane,
then no point of intersection exists in parallel coordinates. In homogeneous coordinates the same
lines have the equations $y = ax + b_1 t$ and $y = ax + b_2 t$.

This system of equations is soluble: in fact, it has infinitely many solutions $x = \varrho$, $y = a\varrho$, $t = 0$.
For any $\varrho \neq 0$ the triple $\varrho, a\varrho, 0$ represents the homogeneous coordinates of a point of the x', y'-plane,
a 'point at infinity', which is common to the two lines and characterizes their common direction.
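
The point at infinity of two parallel lines can also be found numerically. The following is a minimal sketch, not part of the original text: it assumes each line y = ax + bt is stored by its coefficient triple (a, -1, b) and uses the standard fact that a common solution (x, y, t) of two homogeneous linear equations is given by the cross product of the two coefficient triples. The function names are illustrative only.

```python
def cross(u, v):
    """Cross product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def line_coefficients(a, b):
    """Coefficients (A, B, C) of A*x + B*y + C*t = 0 for the line y = a*x + b*t."""
    return (a, -1.0, b)

def intersection(line1, line2):
    """Homogeneous coordinates (x, y, t) of the common point of two distinct lines."""
    return cross(line1, line2)

# Two parallel lines with the same slope a = 2 and different intercepts.
x, y, t = intersection(line_coefficients(2.0, 1.0), line_coefficients(2.0, 3.0))
print(x, y, t)  # t is 0 and y = 2 * x: a point at infinity in the direction of slope 2
```

For lines that are not parallel the same function returns a triple with t different from 0, and x/t, y/t are then the ordinary coordinates of the point of intersection.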


Spherical coordinates 


Setting up spherical coordinates. It is convenient for certain problems, for example, those con- 
cerned with the surface of a sphere, to introduce non-parallel coordinates. Instead of determining 
an arbitrary point P of space by rectangular coordinates x, y, z, it can also be determined by 


1. the distance $r \geq 0$ of P from the origin O,
2. the angle $\varphi$ that the segment OP makes with the x, y-plane ($-\pi/2 \leq \varphi \leq +\pi/2$),
3. the angle $\lambda$ that the projection OP' of the segment OP onto the x, y-plane makes with the
positive x-axis ($0 \leq \lambda < 2\pi$).

The figure shows the sense of rotation of angle measurement. 

The values r, $\varphi$, $\lambda$ are called the spherical coordinates of the point P. They correspond to polar
coordinates in the plane and are therefore also called spatial polar coordinates.

Every triple of spherical coordinates corresponds to exactly one point of space. To a point P
of space there corresponds a unique triple of spherical coordinates if P does not lie on the z-axis.
On the z-axis, except for the origin, only r and $\varphi$ ($= \pm\pi/2$) are uniquely determined, and $\lambda$ is
undetermined. If P is the origin, only r = 0 is uniquely determined, and $\varphi$ and $\lambda$ are undetermined.


Conversion between rectangular and spherical coordinates. From the figure one obtains the relations

$x = |OP'| \cos\lambda$, $y = |OP'| \sin\lambda$, $|OP'| = r\cos\varphi$.

The rectangular coordinates of a point of space can therefore be calculated from the spherical
coordinates by the formulae

$x = r\cos\varphi\cos\lambda$, $y = r\cos\varphi\sin\lambda$, $z = r\sin\varphi$.

It follows that

$x^2 + y^2 + z^2 = r^2$,
$x/\sqrt{x^2 + y^2} = \cos\lambda$, $y/\sqrt{x^2 + y^2} = \sin\lambda$,
$z/\sqrt{x^2 + y^2} = \sin\varphi/\cos\varphi = \tan\varphi$,
$y/x = \sin\lambda/\cos\lambda = \tan\lambda$.

Hence the spherical coordinates of a point of space can be obtained from the rectangular coordinates
by the formulae

$r = \sqrt{x^2 + y^2 + z^2}$,
$\varphi = \mathrm{Arctan}\,(z/\sqrt{x^2 + y^2})$ (for $x^2 + y^2 \neq 0$),
$\lambda = \mathrm{Arctan}\,(y/x)$ (for $x > 0$, $y \geq 0$),
$\lambda = \pi + \mathrm{Arctan}\,(y/x)$ (for $x < 0$),
$\lambda = 2\pi + \mathrm{Arctan}\,(y/x)$ (for $x > 0$, $y < 0$).

Furthermore,

$\varphi = \pi/2$ for x = y = 0, z > 0; $\lambda = \pi/2$ for x = 0, y > 0;
$\varphi = -\pi/2$ for x = y = 0, z < 0; $\lambda = 3\pi/2$ for x = 0, y < 0;
$\varphi$ is undetermined for x = y = 0, z = 0; $\lambda$ is undetermined for x = y = 0.

Arctan, as always, is the principal value.


24.1-4 Spherical coordinates r, $\varphi$, $\lambda$ of a point P in space


Example. What are the spherical coordinates of the point $P(3, -4, -12)$?
$r = \sqrt{3^2 + 4^2 + 12^2} = 13$;
$\varphi = \mathrm{Arctan}\,(-12/\sqrt{3^2 + 4^2}) = \mathrm{Arctan}\,(-12/5) \approx -67.38^\circ$;
$\lambda = 360^\circ + \mathrm{Arctan}\,(-4/3) \approx 360^\circ - 53.13^\circ = 306.87^\circ$.
The spherical coordinates of P are therefore r = 13, $\varphi \approx -67.38^\circ$, and $\lambda \approx 306.87^\circ$.
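
The conversion in both directions can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the book's method: instead of the case distinction with the principal value Arctan it uses the two-argument function `math.atan2`, reduced modulo 2π, which covers all quadrants at once. At points where an angle is undetermined (on the z-axis or at the origin) the sketch simply returns 0 for it. The function names are hypothetical.

```python
import math

def to_spherical(x, y, z):
    """Return (r, phi, lam): distance, angle with the x,y-plane, angle in the x,y-plane."""
    r = math.sqrt(x * x + y * y + z * z)
    rho = math.hypot(x, y)                   # length of the projection OP'
    phi = math.atan2(z, rho)                 # equals Arctan(z / rho) when rho > 0
    lam = math.atan2(y, x) % (2 * math.pi)   # reduced modulo 2*pi
    return r, phi, lam

def to_rectangular(r, phi, lam):
    """Inverse conversion: x = r cos(phi) cos(lam), y = r cos(phi) sin(lam), z = r sin(phi)."""
    return (r * math.cos(phi) * math.cos(lam),
            r * math.cos(phi) * math.sin(lam),
            r * math.sin(phi))

r, phi, lam = to_spherical(3, -4, -12)
print(r, round(math.degrees(phi), 2), round(math.degrees(lam), 2))
```

For the point P(3, -4, -12) the sketch reproduces r = 13 and the two angles of the worked example to the stated precision.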


Cylindrical coordinates 


For problems on the surface of a cylinder it is convenient to introduce cylindrical coordinates 
(Fig.). Starting from a rectangular coordinate system an arbitrary point P of space can be deter- 
mined by 


1. the distance $r \geq 0$ of P' from the origin O, where OP' is the projection of the segment OP onto
the x, y-plane,

2. the angle $\varphi$ that the segment OP' makes with the positive x-axis ($0 \leq \varphi < 2\pi$),

3. the oriented distance z of the point P from the x, y-plane ($-\infty < z < +\infty$).


To every triple of cylindrical coordinates there corresponds exactly one point of space. Again,
to a point P of space there corresponds a unique triple of cylindrical coordinates if P does not lie
on the z-axis. For points on the z-axis, r = 0 and z is determined, but $\varphi$ is undetermined.

Cylindrical coordinates are often used in physics when cylindrically formed bodies are 
to be investigated, for example, in the calculation of the moment of inertia of a cylinder or in problems 
of heat conduction in cylindrical bodies. 




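The cylindrical coordinates r, $\varphi$, z of P coincide with the polar coordinates of the point P' in the
x, y-plane and the rectangular z-coordinate of P. This gives the conversion formulae
$x = r\cos\varphi$, $y = r\sin\varphi$, $z = z$ and, conversely, $r = \sqrt{x^2 + y^2}$,
$\cos\varphi = x/\sqrt{x^2 + y^2}$, $\sin\varphi = y/\sqrt{x^2 + y^2}$. Those for $\varphi$ hold
only if $x^2 + y^2 \neq 0$; $\varphi$ is undetermined if x = y = 0.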
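
The conversion between cylindrical and rectangular coordinates can be sketched as follows. The function names are illustrative, and `math.atan2` reduced modulo 2π is used as a convenient substitute for recovering the angle from its cosine and sine; for points on the z-axis, where the angle is undetermined, the sketch returns 0.

```python
import math

def cylindrical_to_rectangular(r, phi, z):
    """x = r cos(phi), y = r sin(phi); the z-coordinate is unchanged."""
    return r * math.cos(phi), r * math.sin(phi), z

def rectangular_to_cylindrical(x, y, z):
    """r is the distance of the projection P' from the origin, phi its polar angle."""
    r = math.hypot(x, y)
    phi = math.atan2(y, x) % (2 * math.pi)
    return r, phi, z

# A point with r = 3, phi = -30 degrees, z = 1.
x, y, z = cylindrical_to_rectangular(3.0, math.radians(-30.0), 1.0)
print(round(x, 3), round(y, 3), z)
```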


Example. Given the cylindrical coordinates r = 3, $\varphi = -30^\circ$,
z = 1 of a point P, to find the rectangular coordinates.


$x = 3\cos(-30^\circ) = 3\cos 30^\circ = (3/2)\sqrt{3} \approx 2.598$,
$y = 3\sin(-30^\circ) = -3\sin 30^\circ = -3/2 = -1.5$,
$z = 1$.


24.1-5 Cylindrical coordinates r, $\varphi$, z of a point P in space


Transformations of coordinates


If two coordinate systems are given in space (henceforth they will both be rectangular right- 
systems with the same unit of length) and they do not coincide, the problem often arises of calculating 
the coordinates x*, y*, z* of a point P in one system from the coordinates x, y, z of P in the other 
system. Such a conversion of coordinates is called a transformation of coordinates. Three cases can
be distinguished: translation, rotation and a combination of the two. 


Translation. The two coordinate systems are so situated in 
space that one can be brought into coincidence with the other 
by means of a parallel shift (Fig.). 

If the origin O* of the second system has coordinates $a_1$, $a_2$, $a_3$ with respect to the first
system with origin O, then the relations

$x = x^* + a_1$, $y = y^* + a_2$, $z = z^* + a_3$

hold between the coordinates x, y, z of a point P of space with respect to the first system and the
coordinates x*, y*, z* of P with respect to the second system.


24.1-6 Translation of a coordinate 
system 


Example. All points whose rectangular coordinates satisfy the equation $3x + 2y - z = 5$
lie in a plane in space. What is the equation of this plane with respect to a coordinate system whose
origin has the coordinates $a_1 = -5$, $a_2 = 2$, $a_3 = 7$ with respect to the first system? Here $x = x^* - 5$,
$y = y^* + 2$, $z = z^* + 7$, so that $3x + 2y - z = 5$ goes into $3x^* + 2y^* - z^* = 5 + 15 - 4 + 7$.
With respect to the new system the equation of the plane is $3x^* + 2y^* - z^* = 23$.
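
The substitution used in this example can be written as a small routine. This is a sketch with hypothetical names: a plane is stored by its coefficients (A, B, C) and its right-hand side d, and the new origin by its coordinates in the old system.

```python
def translate_plane(coeffs, d, new_origin):
    """Plane A*x + B*y + C*z = d written in a system whose origin is at new_origin.

    Substituting x = x* + a1, y = y* + a2, z = z* + a3 leaves A, B, C unchanged
    and moves the constant terms to the right-hand side.
    """
    new_d = d - sum(c * a for c, a in zip(coeffs, new_origin))
    return coeffs, new_d

coeffs, d_star = translate_plane((3, 2, -1), 5, (-5, 2, 7))
print(coeffs, d_star)  # (3, 2, -1) 23
```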


Rotation. The two coordinate systems have the same point of space as origin (O* = O), but 
their axes have different directions. 

In this case each axis of one system makes an angle with each axis of the other system. The cosines
of these angles are denoted by $a_{ik}$, where i and k run through the values 1, 2 and 3. The first index
always refers to the x, y, z-system and the second index to the x*, y*, z*-system. The index 1
corresponds to the x- or x*-axis, 2 to the y- or y*-axis and 3 to the z- or z*-axis; that is,

$a_{11} = \cos(x, x^*)$, $a_{12} = \cos(x, y^*)$, $a_{13} = \cos(x, z^*)$,
$a_{21} = \cos(y, x^*)$, $a_{22} = \cos(y, y^*)$, $a_{23} = \cos(y, z^*)$,
$a_{31} = \cos(z, x^*)$, $a_{32} = \cos(z, y^*)$, $a_{33} = \cos(z, z^*)$.

The coordinates of an arbitrary point then transform according to the following equations:

$x = a_{11}x^* + a_{12}y^* + a_{13}z^*, \qquad x^* = a_{11}x + a_{21}y + a_{31}z,$
$y = a_{21}x^* + a_{22}y^* + a_{23}z^*, \qquad y^* = a_{12}x + a_{22}y + a_{32}z,$
$z = a_{31}x^* + a_{32}y^* + a_{33}z^*, \qquad z^* = a_{13}x + a_{23}y + a_{33}z.$

The $a_{ik}$ are called direction cosines. The given equations of transformation are derived and further
discussed below. It should be noted that the system of equations on the right contains the same
coefficients as that on the left. Their positions are interchanged by a reflection in the main diagonal
(top left to bottom right) of the coefficient matrix.




Combination. The two coordinate systems do not have the same origin and cannot be brought 
into coincidence by a parallel displacement alone. This case is a combination of the two cases 
considered above, and therefore leads to the following equations of transformation. 


All transformations that lead to a uniquely soluble system of linear equations are called affine 
transformations. 

All the given equations of transformation can be interpreted as formulae for changing the co- 
ordinates of a point by a motion in the fixed space (translation, rotation or a combination of the 
two) of the coordinate system. However, they can also be regarded as the analytic representation 
of a motion of space with the coordinate system fixed. 


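Derivation of the equations of a rotation. The system of equations for a rotation can be derived
as follows. The vector x of an arbitrary point P is given in the first system by
$\mathbf{x} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ and in the second by
$\mathbf{x} = x^*\mathbf{i}^* + y^*\mathbf{j}^* + z^*\mathbf{k}^*$. If one writes x in the first system in the form
$\mathbf{x} = |\mathbf{x}|\,[(x/|\mathbf{x}|)\,\mathbf{i} + (y/|\mathbf{x}|)\,\mathbf{j} + (z/|\mathbf{x}|)\,\mathbf{k}]$
and first treats the special case where this vector is equal to i* in the second system, that is,
x* = 1, y* = 0, z* = 0, then, by definition, $x/|\mathbf{x}| = a_{11}$, $y/|\mathbf{x}| = a_{21}$, $z/|\mathbf{x}| = a_{31}$.

Similar results are obtained from the special cases $\mathbf{x} = \mathbf{j}^*$ and $\mathbf{x} = \mathbf{k}^*$. This gives

$\mathbf{i}^* = a_{11}\mathbf{i} + a_{21}\mathbf{j} + a_{31}\mathbf{k},$
$\mathbf{j}^* = a_{12}\mathbf{i} + a_{22}\mathbf{j} + a_{32}\mathbf{k},$
$\mathbf{k}^* = a_{13}\mathbf{i} + a_{23}\mathbf{j} + a_{33}\mathbf{k}.$

If these expressions are substituted in $\mathbf{x} = x^*\mathbf{i}^* + y^*\mathbf{j}^* + z^*\mathbf{k}^*$ and the result is equated to
$\mathbf{x} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, the system of equations on the left for a rotation is obtained. The system on the
right can be obtained similarly.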

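Relations between the direction cosines. These relations are obtained from the expressions for
i*, j*, k* because of the fact that these vectors are unit vectors, $|\mathbf{i}^*| = |\mathbf{j}^*| = |\mathbf{k}^*| = 1$, and are
mutually perpendicular, $\mathbf{i}^* \cdot \mathbf{j}^* = \mathbf{i}^* \cdot \mathbf{k}^* = \mathbf{j}^* \cdot \mathbf{k}^* = 0$:

$a_{11}^2 + a_{21}^2 + a_{31}^2 = 1, \qquad a_{11}a_{12} + a_{21}a_{22} + a_{31}a_{32} = 0,$
$a_{12}^2 + a_{22}^2 + a_{32}^2 = 1, \qquad a_{11}a_{13} + a_{21}a_{23} + a_{31}a_{33} = 0,$
$a_{13}^2 + a_{23}^2 + a_{33}^2 = 1, \qquad a_{12}a_{13} + a_{22}a_{23} + a_{32}a_{33} = 0.$

One can obtain further relations by taking into account the fact that i, j, k are unit vectors and
are mutually perpendicular; however, it can be shown that there are only six independent relations
between these direction cosines.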

Since there are six independent equations connecting the nine direction cosines that characterize 
a rotation, a rotation can be completely described by means of three quantities. This was shown 
quite generally by Cayley. This is the basis for two particularly intuitive ways of characterizing
a general rotation, namely by three angles (the Euler angles) or by an axis and one angle
(Euler’s theorem). 


24.1-7 Rotation of a coordinate system 


Rotation of the coordinate system. A rectangular coordinate system with the axes x, y, z can always 
be brought into coincidence with a second rectangular coordinate system with the same origin and 
axes x*, y*, z* by first rotating about the x-axis through an angle $\varphi$, then about the y-axis through
an angle $\psi$ and finally about the z-axis through an angle $\chi$ (Fig.).




Example of a rotation. On the surface of a sphere, whose centre is chosen as the origin of a
rectangular coordinate system, there lies a point $P(-4, 8, -16)$. If the coordinate system is rotated
anticlockwise about the x-axis through an angle $\varphi = 30^\circ$, about the y-axis through an angle
$\psi = 45^\circ$ and about the z-axis through an angle $\chi = 60^\circ$, one obtains the direction cosines by the
given formulae and therefore the new coordinates of P.


$a_{11} = \sqrt{2}/4$, $a_{12} = -\sqrt{6}/4$, $a_{13} = \sqrt{2}/2$,
$a_{21} = 3/4 + \sqrt{2}/8$, $a_{22} = \sqrt{3}/4 - \sqrt{6}/8$, $a_{23} = -\sqrt{2}/4$,
$a_{31} = \sqrt{3}/4 - \sqrt{6}/8$, $a_{32} = 1/4 + 3\sqrt{2}/8$, $a_{33} = \sqrt{6}/4$,
$x^* = 6 - 4\sqrt{3} + 2\sqrt{6} \approx 3.97$, $y^* = -4 - 6\sqrt{2} + 2\sqrt{3} \approx -9.02$,
$z^* = -4\sqrt{2} - 4\sqrt{6} \approx -15.45$.

The following theorem, which is given without proof, is of great importance in mechanics. 

Euler's theorem. If two rectangular coordinate systems with the same origin and arbitrary directions 
of axes are given in space, one can always specify a line through the origin such that one coordinate 
system goes into the other by a rotation about this line. 


Applied to a rigid body, Euler’s theorem can be stated as follows: 


For a rigid body, of which one point O is fixed relative to a system of reference, if a possible 
initial position is given and any other as final position, one can always specify an axis through O 
such that the body can be taken from the initial position to the final position by a rotation about 
this axis. 


It is impossible to move a sphere whose centre is fixed in space so that at the end of the motion 
the position of all points on the surface of the sphere is different from their original position. In 
fact, either two or all the points must lie in their original positions. 

Any (x, y, z)-coordinate system can be brought into coincidence with a second (x*, y*, z*)-coordinate
system with the same origin by rotation through angles $\psi$, $\varphi$, $\vartheta$ (Fig.), where k is the line
of intersection of the x, y-plane and the x*, y*-plane. These angles are called the Euler angles.


24.1-8 The Euler angles $\psi$, $\varphi$, $\vartheta$


24.2-1 Components 
of a segment in space 


24.2. Linear spaces 


Segment 


Length and orientation. The segment $P_1P_2$ on the line through two points $P_1$ and $P_2$ is the set
of all points lying between $P_1$ and $P_2$ inclusive (see Chapter 7.). If the points lie on an
oriented line, such as $Q_{11}$ and $Q_{12}$ on the x-axis of a spatial Cartesian coordinate system (Fig.),
one regards the segment $Q_{11}Q_{12}$ as a directed quantity or vector, which is taken to be positive or
negative according as its direction agrees or does not agree with the direction of the x-axis; in other
words, one sets $|\overrightarrow{Q_{11}Q_{12}}| = x_2 - x_1$ and $|\overrightarrow{Q_{12}Q_{11}}| = x_1 - x_2$, so that
$|\overrightarrow{Q_{11}Q_{12}}| = -|\overrightarrow{Q_{12}Q_{11}}|$. The length of the segment is denoted by $|Q_{11}Q_{12}|$.

If two points $P_1$ and $P_2$ in space are given by their Cartesian coordinates $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$,
then the three planes parallel to the coordinate planes through $P_1$ and $P_2$, respectively, cut off on
the coordinate axes the segments $Q_{11}Q_{12}$, $Q_{21}Q_{22}$ and $Q_{31}Q_{32}$, which are called the components
of the segment $P_1P_2$. If the segment is oriented as $\overrightarrow{P_1P_2}$, say, then its components are
$\overrightarrow{Q_{11}Q_{12}}$, $\overrightarrow{Q_{21}Q_{22}}$ and $\overrightarrow{Q_{31}Q_{32}}$. In the right-angled triangle $P_0P_1P_2$ the length
$|P_1P_2|$ of the segment can be calculated by Pythagoras' theorem:

$|P_1P_2| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$.


Example: The length $|P_1P_2|$ of the segment $P_1P_2$ between the points $P_1(5, 2, -1)$ and
$P_2(-3, -2, 0)$ is calculated to be
$|P_1P_2| = \sqrt{(-3 - 5)^2 + (-2 - 2)^2 + (0 + 1)^2} = \sqrt{8^2 + 4^2 + 1^2} = \sqrt{81} = 9$.

The segment $P_1P_2$ is 9 units long.

The distances between three arbitrary points $P_1$, $P_2$ and $P_3$ in space satisfy the triangle
inequality $|P_1P_3| \leq |P_1P_2| + |P_2P_3|$.
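
The length of a segment and the triangle inequality can be checked numerically. In this minimal sketch the third point is a hypothetical choice made only for illustration; it does not occur in the text.

```python
import math

def distance(p, q):
    """Length of the segment between two points given by Cartesian coordinates."""
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(p, q)))

p1, p2 = (5, 2, -1), (-3, -2, 0)
p3 = (1, 1, 1)  # hypothetical third point

print(distance(p1, p2))  # 9.0
print(distance(p1, p3) <= distance(p1, p2) + distance(p2, p3))  # True
```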

The sign of a directed segment on an oriented line is obtained analytically as the scalar product
of its vector with the unit vector on the oriented line: for example, $\overrightarrow{Q_{11}Q_{12}} \cdot \mathbf{i} = x_2 - x_1 > 0$ or
$\overrightarrow{Q_{21}Q_{22}} \cdot \mathbf{j} = y_2 - y_1 < 0$. If the line determined by two points $P_1$ and $P_2$ has the orientation
$\overrightarrow{P_1P_2}$, then its unit vector is $\mathbf{e} = \overrightarrow{P_1P_2}/|P_1P_2|$. If $Q_1$ and $Q_2$ are two arbitrary points of a line oriented
by $\overrightarrow{P_1P_2}$, then an oriented distance $|\overrightarrow{Q_1Q_2}|$ is fixed by the following agreement: $|\overrightarrow{Q_1Q_2}| = +|Q_1Q_2|$
if the orientation $\overrightarrow{Q_1Q_2}$ corresponds to that of $\overrightarrow{P_1P_2}$, but $|\overrightarrow{Q_1Q_2}| = -|Q_1Q_2|$ if the orientation
$\overrightarrow{Q_2Q_1}$ corresponds to that of $\overrightarrow{P_1P_2}$.


Division. Suppose that a given line is oriented by $\overrightarrow{P_1P_2}$ and that P is an arbitrary point on it
other than $P_2$. One says that P divides the oriented segment $\overrightarrow{P_1P_2}$ in the ratio
$\lambda = |\overrightarrow{P_1P}| : |\overrightarrow{PP_2}|$ and calls $\lambda$ the ratio of division. In particular, one speaks of inner
division when P lies between $P_1$ and $P_2$: then $\lambda > 0$. If P lies outside the segment $P_1P_2$ one speaks
of outer division, and $\lambda < 0$. The midpoint of a segment always has the ratio of division $\lambda = 1$ with
respect to the end-points.

The point of division P is uniquely determined with respect to an oriented segment $\overrightarrow{P_1P_2}$ by the
ratio of division $\lambda$. If $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ are the coordinates of $P_1$ and $P_2$, then P has the
coordinates $x = (x_1 + \lambda x_2)/(1 + \lambda)$, $y = (y_1 + \lambda y_2)/(1 + \lambda)$, $z = (z_1 + \lambda z_2)/(1 + \lambda)$.


Example: The oriented segment $\overrightarrow{P_1P_2}$ with $P_1(5, 2, -1)$ and $P_2(-3, -2, 0)$ is to be divided
in the ratio $\lambda = -5$. The coordinates of the point of division are
$x = [5 + (-5)(-3)]/(1 - 5) = 20/(-4) = -5$,
$y = [2 + (-5)(-2)]/(1 - 5) = 12/(-4) = -3$,
$z = [-1 + (-5)(0)]/(1 - 5) = -1/(-4) = 1/4$.
Since $\lambda = -5 < 0$, it is an external division.
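
The coordinates of a point of division follow directly from the ratio. A minimal sketch, with an illustrative function name; the ratio -1 is rejected because the denominator 1 + λ would vanish.

```python
def divide_segment(p1, p2, lam):
    """Point dividing the oriented segment from p1 to p2 in the ratio lam."""
    if lam == -1:
        raise ValueError("no finite point of division for the ratio -1")
    return tuple((a + lam * b) / (1 + lam) for a, b in zip(p1, p2))

print(divide_segment((5, 2, -1), (-3, -2, 0), -5))  # (-5.0, -3.0, 0.25)
print(divide_segment((5, 2, -1), (-3, -2, 0), 1))   # midpoint: (1.0, 0.0, -0.5)
```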


Under parallel projection of the points of one ray onto 
any other ray the ratio of division of points remains un- 
changed (Fig.). 


24.2-2 Ratio of division remains unchanged by a parallel projection


24.2-3 The direction angles of a line


Line 


Direction cosines. The direction cosines of an oriented line are the cosines of the angles (direction 
angles) that a line through the origin parallel to the given line with the same orientation makes with 
the positive coordinate axes (Fig.). These three angles are two-valued, according to the sense of
rotation, but their cosines are uniquely determined, since $\cos\alpha = \cos(2\pi - \alpha)$.




Relations. If an oriented line with the direction angles $\alpha$, $\beta$ and $\gamma$ (with respect to the x-, y- and
z-axes) passes through a point $P_1$ with the coordinates $(x_1, y_1, z_1)$, and if P(x, y, z) is an arbitrary
point of the line, then $x = x_1 + |P_1P|\cos\alpha$, $y = y_1 + |P_1P|\cos\beta$, $z = z_1 + |P_1P|\cos\gamma$. The
proof is simple. One first takes a rectangular coordinate system with $P_1$ as origin and then carries
out a translation. It follows that
$(x - x_1)^2 + (y - y_1)^2 + (z - z_1)^2 = |P_1P|^2 (\cos^2\alpha + \cos^2\beta + \cos^2\gamma)$.
By taking into account the standard formula for $|P_1P|$ one obtains the fundamental relation between
the direction cosines of an oriented line:

$\cos^2\alpha + \cos^2\beta + \cos^2\gamma = 1$.


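Conversely, any three numbers a, b, c for which $a^2 + b^2 + c^2 = 1$ can be regarded as the cosines
of the direction angles of an oriented line in space.

Calculation of the direction cosines from two given points. If $P_1$ and $P_2$ are two points of space
with the coordinates $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$, then the cosines of the direction angles $\alpha$, $\beta$ and
$\gamma$ of the oriented line that goes from $P_1$ to $P_2$ are given by the following formulae:

$\cos\alpha = (x_2 - x_1)/|P_1P_2|$, $\cos\beta = (y_2 - y_1)/|P_1P_2|$, $\cos\gamma = (z_2 - z_1)/|P_1P_2|$,
where $|P_1P_2| = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$.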


Example: To determine the direction cosines of the oriented line that goes from $P_1(5, 2, -1)$ to
$P_2(-3, -2, 0)$. One has $\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2} = \sqrt{8^2 + 4^2 + 1^2} = 9$;
hence $\cos\alpha = (-3 - 5)/9 = -8/9$, $\cos\beta = (-2 - 2)/9 = -4/9$, $\cos\gamma = [0 - (-1)]/9 = 1/9$.
As a check one can verify that the sum of the squares of the cosines is equal to 1.
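
The calculation, including the check on the sum of squares, can be sketched as follows; the function name is illustrative.

```python
import math

def direction_cosines(p1, p2):
    """Cosines of the direction angles of the oriented line from p1 to p2."""
    diff = [b - a for a, b in zip(p1, p2)]
    length = math.sqrt(sum(c * c for c in diff))
    if length == 0:
        raise ValueError("the two points coincide")
    return tuple(c / length for c in diff)

cosines = direction_cosines((5, 2, -1), (-3, -2, 0))
print(cosines)                      # approximately (-8/9, -4/9, 1/9)
print(sum(c * c for c in cosines))  # approximately 1
```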

Equations of a line. By introducing the vectors $\mathbf{x} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$, $\mathbf{x}_1 = x_1\mathbf{i} + y_1\mathbf{j} + z_1\mathbf{k}$,
$\mathbf{e} = \cos\alpha\,\mathbf{i} + \cos\beta\,\mathbf{j} + \cos\gamma\,\mathbf{k}$ the equations $x = x_1 + |P_1P|\cos\alpha$, $y = y_1 + |P_1P|\cos\beta$,
$z = z_1 + |P_1P|\cos\gamma$ can be put into the simple form $\mathbf{x} = \mathbf{x}_1 + |P_1P|\,\mathbf{e}$. The vector e is called a
direction vector and is a unit vector. Occasionally a multiple of e is taken instead of e. Then, instead
of e, the letter a will be used.

If one writes generally $\mathbf{x} = \mathbf{x}_1 + t\mathbf{a}$, then for given $\mathbf{x}_1$ and a to any real number t there corresponds
a vector x, which goes from the origin to a point of the line. As t runs through all numbers from
$-\infty$ to $+\infty$, one obtains all the points of the line. Conversely, to any point of the line there
corresponds a number t such that the vector $\mathbf{x} = \mathbf{x}_1 + t\mathbf{a}$ ends at the point. Therefore $\mathbf{x} = \mathbf{x}_1 + t\mathbf{a}$ is
called the point-direction equation of the line or, since t is called a parameter, a parametric
representation of the line (Fig.).


For a parametric representation of a line it is important only that to each value of the parameter
in $(-\infty, +\infty)$ there corresponds a unique point and conversely, and not in which sense the line is
described as t goes from $-\infty$ to $+\infty$. Hence the sense of direction of a plays no part, and a need
not be a unit vector.


24.2-4 The point-direction equation of a line


Example: The point-direction equation of the line in the previous example is to be found. Hence
$P_1$ and the direction cosines are to be regarded as given:
$\mathbf{x}_1 = 5\mathbf{i} + 2\mathbf{j} - \mathbf{k}$ and $\mathbf{e} = -(1/9)(8\mathbf{i} + 4\mathbf{j} - \mathbf{k})$.
The point-direction equation is therefore
$\mathbf{x} = (5\mathbf{i} + 2\mathbf{j} - \mathbf{k}) - (t/9)(8\mathbf{i} + 4\mathbf{j} - \mathbf{k})$.
If a new parameter $u = -t/9$ is taken, then the equation is
$\mathbf{x} = (5\mathbf{i} + 2\mathbf{j} - \mathbf{k}) + u(8\mathbf{i} + 4\mathbf{j} - \mathbf{k})$.
It is still a parametric representation of the given line, but its direction vector is no longer a unit
vector and it no longer has the orientation of e.


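Two points are given. If $P_1$ and $P_2$ with the coordinates $(x_1, y_1, z_1)$ and $(x_2, y_2, z_2)$ are two
given points of a line, let $\mathbf{x}_1 = x_1\mathbf{i} + y_1\mathbf{j} + z_1\mathbf{k}$ and $\mathbf{x}_2 = x_2\mathbf{i} + y_2\mathbf{j} + z_2\mathbf{k}$. As direction vector
one can take $\mathbf{a} = \mathbf{x}_2 - \mathbf{x}_1$. If this expression is substituted for a in the point-direction equation, one
obtains a two-point equation of the line:

$\mathbf{x} = \mathbf{x}_1 + t(\mathbf{x}_2 - \mathbf{x}_1)$.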




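Example: To find a two-point equation of the line through $P_1(5, 2, -1)$ and $P_2(-3, -2, 0)$.

From $\mathbf{x}_1 = 5\mathbf{i} + 2\mathbf{j} - \mathbf{k}$ and $\mathbf{x}_2 = -3\mathbf{i} - 2\mathbf{j}$ it follows that $\mathbf{x}_2 - \mathbf{x}_1 = -8\mathbf{i} - 4\mathbf{j} + \mathbf{k}$. A
two-point equation is $\mathbf{x} = (5\mathbf{i} + 2\mathbf{j} - \mathbf{k}) + t(-8\mathbf{i} - 4\mathbf{j} + \mathbf{k})$ or, by choosing a new parameter
$u = -t$, $\mathbf{x} = (5\mathbf{i} + 2\mathbf{j} - \mathbf{k}) + u(8\mathbf{i} + 4\mathbf{j} - \mathbf{k})$.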
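
A parametric representation lends itself directly to computation. The sketch below, with illustrative names, evaluates points of the line through the two given points and shows that both parametrizations describe the same line.

```python
def two_point_direction(p1, p2):
    """Direction vector a = x2 - x1 of the line through p1 and p2."""
    return tuple(b - a for a, b in zip(p1, p2))

def point_on_line(x1, a, t):
    """Point x = x1 + t*a of the line with point x1 and direction vector a."""
    return tuple(p + t * d for p, d in zip(x1, a))

p1, p2 = (5, 2, -1), (-3, -2, 0)
a = two_point_direction(p1, p2)
print(a)                                  # (-8, -4, 1)
print(point_on_line(p1, a, 0))            # (5, 2, -1), the point P1
print(point_on_line(p1, a, 1))            # (-3, -2, 0), the point P2
print(point_on_line(p1, (8, 4, -1), -1))  # P2 again, with the parameter u = -t
```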


Basic geometric problems. Some formulae are now derived that help to solve the most important 
geometric problems. 


Angle between two lines. One says that two lines oriented by their direction vectors a and a*
enclose an angle $\varphi$ if the lines through the origin parallel to them and with the same orientation
enclose this angle. From the definition of the inner product (scalar product), $\mathbf{a} \cdot \mathbf{a}^* = |\mathbf{a}|\,|\mathbf{a}^*| \cos\varphi$.
Two lines are perpendicular to one another if $\mathbf{a} \cdot \mathbf{a}^* = 0$. Since $\mathbf{e} = \mathbf{a}/|\mathbf{a}|$ and
$\mathbf{e}^* = \mathbf{a}^*/|\mathbf{a}^*|$, one obtains

$\cos\varphi = \mathbf{e} \cdot \mathbf{e}^*$.


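Example: Two oriented lines are given by $\mathbf{x} = (2\mathbf{i} - 3\mathbf{j} + 4\mathbf{k}) + t(3\mathbf{i} - 4\mathbf{j} + 12\mathbf{k})$ and
$\mathbf{x}^* = (\mathbf{i} + 5\mathbf{j} - 3\mathbf{k}) + t^*(4\mathbf{i} + 3\mathbf{k})$. What angle $\varphi \leq \pi$ do they enclose if their orientations
correspond to the given direction vectors? The given direction vectors must first be normalized.
This gives $\mathbf{e} = (3\mathbf{i} - 4\mathbf{j} + 12\mathbf{k})/\sqrt{9 + 16 + 144} = (1/13)(3\mathbf{i} - 4\mathbf{j} + 12\mathbf{k})$
and $\mathbf{e}^* = (4\mathbf{i} + 3\mathbf{k})/\sqrt{16 + 9} = (1/5)(4\mathbf{i} + 3\mathbf{k})$ and so
$\cos\varphi = [1/(5 \times 13)]\,(3\mathbf{i} - 4\mathbf{j} + 12\mathbf{k}) \cdot (4\mathbf{i} + 3\mathbf{k}) = [1/(5 \times 13)]\,(3 \times 4 - 4 \times 0 + 12 \times 3)$
$= 48/65 = 0.738\ldots$
The angle enclosed by the given lines is therefore $\varphi \approx 42.4^\circ$. This is not to say that the lines
intersect!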
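
The angle can be computed without normalizing the direction vectors by hand. A minimal sketch with an illustrative function name; the quotient is clamped to the interval from -1 to 1 only to guard against rounding errors before the inverse cosine is taken.

```python
import math

def angle_between(a, b):
    """Angle (in radians, between 0 and pi) enclosed by two direction vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    cos_phi = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cos_phi)

phi = angle_between((3, -4, 12), (4, 0, 3))
print(round(math.degrees(phi), 1))  # 42.4
```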

Distance of a point from a line. If $\mathbf{x} = \mathbf{x}_1 + t\mathbf{e}$ is the equation of a given line and $(x_2, y_2, z_2)$
the coordinates of a given point $P_2$, then $\overrightarrow{OP_2} = \mathbf{x}_2$ is determined, and $d = |(\mathbf{x}_2 - \mathbf{x}_1) \times \mathbf{e}|$ is
the distance of the point $P_2$ from the given line, that is, the length of the perpendicular from $P_2$
to the given line (Fig.). In connection with the line, the distance is regarded as a non-negative number,
in accordance with the elementary notion; the concept of oriented distance will be taken up in
dealing with the plane.


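Proof. Let $\varphi \leq \pi$ be the angle enclosed by e and $\overrightarrow{P_1P_2}$. Then $d = |P_1P_2| \sin\varphi$. On the other
hand, from the definition of the vector product $|\overrightarrow{P_1P_2} \times \mathbf{e}| = |P_1P_2| \cdot |\mathbf{e}| \sin\varphi$. Since $|\mathbf{e}| = 1$,
$|\overrightarrow{P_1P_2} \times \mathbf{e}| = |P_1P_2| \sin\varphi = d$. The result follows, because $\overrightarrow{P_1P_2} = \mathbf{x}_2 - \mathbf{x}_1$.


24.2-5 Distance of a point from a line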


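Example: To find the distance of the point $P_2(3, 1, 5)$ from the line
$\mathbf{x} = (2\mathbf{i} - 3\mathbf{j} + 4\mathbf{k}) + (t/13)(3\mathbf{i} - 4\mathbf{j} + 12\mathbf{k})$.

Since $\mathbf{x}_1 = 2\mathbf{i} - 3\mathbf{j} + 4\mathbf{k}$ and $\mathbf{x}_2 = 3\mathbf{i} + \mathbf{j} + 5\mathbf{k}$, it follows that $\mathbf{x}_2 - \mathbf{x}_1 = \mathbf{i} + 4\mathbf{j} + \mathbf{k}$. Hence
$(\mathbf{x}_2 - \mathbf{x}_1) \times \mathbf{e} = (\mathbf{i} + 4\mathbf{j} + \mathbf{k}) \times (1/13)(3\mathbf{i} - 4\mathbf{j} + 12\mathbf{k}) = (1/13)(52\mathbf{i} - 9\mathbf{j} - 16\mathbf{k})$.

The magnitude of this vector is the required distance
$d = (1/13)\sqrt{52^2 + 9^2 + 16^2} = (1/13)\sqrt{2704 + 81 + 256} = (1/13)\sqrt{3041} \approx 4.24$.

The point $P_2$ is at a distance of approximately 4.24 units from the given line.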
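
The same distance can be computed for an arbitrary direction vector. The sketch below, with illustrative names, divides by the length of the direction vector, so that it need not be normalized beforehand; for a unit vector this reduces to the formula of the text.

```python
import math

def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(v):
    """Length of a vector."""
    return math.sqrt(sum(c * c for c in v))

def distance_point_line(p, x1, a):
    """Distance of the point p from the line x = x1 + t*a."""
    diff = tuple(pi - xi for pi, xi in zip(p, x1))
    return norm(cross(diff, a)) / norm(a)

d = distance_point_line((3, 1, 5), (2, -3, 4), (3, -4, 12))
print(round(d, 2))  # 4.24
```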

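Distance between two skew lines. Two lines that have no point in common and are not parallel
are called skew. If l and l* are two skew lines, there is always exactly one point Q on l and exactly
one point Q* on l* such that the vector $\overrightarrow{QQ^*}$ is perpendicular to both lines (see Chapter 9.).
The length of this vector is the shortest distance that any two points of l and l* can have from one
another. It is called the distance between the two lines.

The distance d can be calculated from the equation $\mathbf{x} = \mathbf{x}_1 + t\mathbf{a}$ of l and the equation
$\mathbf{x}^* = \mathbf{x}_1^* + \tau\mathbf{a}^*$ of l*. There is a parameter $t_1$ such that $\mathbf{x}_1 + t_1\mathbf{a} = \overrightarrow{OQ}$ and a parameter $\tau_1$ such
that $\mathbf{x}_1^* + \tau_1\mathbf{a}^* = \overrightarrow{OQ^*}$. Since $\overrightarrow{QQ^*}$ is perpendicular to a and a*,
$\overrightarrow{QQ^*} = d \cdot (\mathbf{a} \times \mathbf{a}^*)/|\mathbf{a} \times \mathbf{a}^*|$. If this expression is substituted in
$\overrightarrow{OQ^*} = \overrightarrow{OQ} + \overrightarrow{QQ^*}$ and the scalar product of both sides with $(\mathbf{a} \times \mathbf{a}^*)$ is taken, then the solution
for d is the distance d between the skew lines l and l*:

$d = |(\mathbf{x}_1^* - \mathbf{x}_1) \cdot (\mathbf{a} \times \mathbf{a}^*)| \,/\, |\mathbf{a} \times \mathbf{a}^*|$.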
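
The result of this derivation, the scalar product of the difference of the two given points with the unit vector along a × a*, can be sketched in code. The two lines used below are hypothetical examples chosen so that the distance is obvious; the absolute value is taken so that the result is non-negative.

```python
import math

def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product."""
    return sum(a * b for a, b in zip(u, v))

def distance_skew_lines(x1, a, x1_star, a_star):
    """Distance between the lines x = x1 + t*a and x* = x1_star + tau*a_star."""
    n = cross(a, a_star)
    n_len = math.sqrt(dot(n, n))
    if n_len == 0:
        raise ValueError("the lines are parallel; use the point-line distance instead")
    diff = tuple(q - p for p, q in zip(x1, x1_star))
    return abs(dot(diff, n)) / n_len

# Hypothetical example: the x-axis and a line parallel to the y-axis at height 1.
print(distance_skew_lines((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0)))  # 1.0
```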




Intersection of two lines. In space two lines generally have no point in common. For if l and l*
are two lines with the equations $\mathbf{x} = \mathbf{x}(t) = \mathbf{x}_1 + t\mathbf{a}$ and $\mathbf{x}^* = \mathbf{x}^*(\tau) = \mathbf{x}_1^* + \tau\mathbf{a}^*$ and they have
(at least) one common point, then there must be (at least) one pair of values $t$, $\tau$ such that $\mathbf{x}(t) = \mathbf{x}^*(\tau)$.
To this vector equation there corresponds a system of three linear equations for the two unknowns,
which is not generally soluble.

The conditions

$\mathbf{a} \times \mathbf{a}^* \neq \mathbf{0}$ and $(\mathbf{x}_1 - \mathbf{x}_1^*) \cdot (\mathbf{a} \times \mathbf{a}^*) = 0$

are necessary and sufficient for the existence of a unique solution, that is, for two lines in space
to have exactly one point in common.

The first condition says that the lines are not parallel and therefore cannot coincide, and the second
follows immediately from the formula for the distance between two skew lines, since two
intersecting lines must have zero distance. If there exist two parameters $t$ and $\tau$ as the unique solution
of the system of equations, then these substituted into the equations of the lines give the point of
intersection of l and l*. Otherwise there is either no point of intersection (no solution) or l and l*
coincide (infinitely many solutions).
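
The two conditions and the calculation of the common point can be sketched as follows. This is an illustration, not the book's procedure: rather than solving the system of three linear equations, it obtains the parameter t by taking the vector product of x₁ + ta = x₁* + τa* with a*, which eliminates τ. The tolerance and the example lines are assumptions made for the sketch.

```python
def cross(u, v):
    """Vector product of two 3-component tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product."""
    return sum(a * b for a, b in zip(u, v))

def intersect_lines(x1, a, x1_star, a_star, tol=1e-12):
    """Common point of x = x1 + t*a and x* = x1_star + tau*a_star, or None."""
    n = cross(a, a_star)
    n2 = dot(n, n)
    diff = tuple(q - p for p, q in zip(x1, x1_star))
    if n2 <= tol:
        return None                      # parallel or coincident lines
    if abs(dot(diff, n)) > tol:
        return None                      # skew lines: the distance is not zero
    t = dot(cross(diff, a_star), n) / n2
    return tuple(p + t * d for p, d in zip(x1, a))

# Hypothetical example: two lines in the x,y-plane meeting at (2, 2, 0).
print(intersect_lines((0, 0, 0), (1, 1, 0), (2, 0, 0), (0, 1, 0)))  # (2.0, 2.0, 0.0)
```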


A system of lines that pass through a fixed point is called a bundle of lines. If, in addition, the lines 
all lie in one plane, one speaks of a pencil of lines or, for oriented half-lines, a pencil of rays. 


Plane 


Equations of a plane. A plane can be fixed in space by three points that do not lie on a line or by 
two points and a direction vector not parallel to the line joining the points or by a point and two 
non-parallel direction vectors. 

Parametric representation. Suppose that a point $P_1$ with coordinates $(x_1, y_1, z_1)$ and two
nonparallel direction vectors a and a* are given. If the vector $\overrightarrow{OP_1}$ is denoted by $\mathbf{x}_1$, then
$\mathbf{x}^* = \mathbf{x}_1 + t\mathbf{a}$ ends at a point P* on the line determined by $\mathbf{x}_1$ and a. The vector
$\mathbf{x} = \mathbf{x}^* + \tau\mathbf{a}^*$ ends at a point P in the plane determined by $\mathbf{x}_1$, a and a*. Hence for any pair of
parameters $t$, $\tau$ a point of the plane is determined by $\mathbf{x} = \mathbf{x}_1 + t\mathbf{a} + \tau\mathbf{a}^*$.
Conversely, for any point P of the plane there are two numbers $t$ and $\tau$ for which such a
representation holds. This is a parametric representation of the plane (Fig.).

Parametric representation of the plane: $\mathbf{x} = \mathbf{x}_1 + t\mathbf{a} + \tau\mathbf{a}^*$


24.2-6 Parametric representation of a plane


If two points $P_1$ and $P_2$ and a direction vector $\mathbf{a}$ are given, then $\mathbf{a}^*$ can be determined as the
direction vector of the line through $P_1$ and $P_2$. If three points are given, then two lines can be drawn
through them and the direction vectors $\mathbf{a}$ and $\mathbf{a}^*$ can be calculated. In each case a parametric
representation of the given form can be obtained.
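As a small illustration of the three-point case, the sketch below builds the direction vectors from three points and evaluates the parametric representation. The function names `plane_through` and `point_of_plane` and the sample points are hypothetical and chosen only for the example.

```python
def sub(p, q):
    """Difference vector p - q."""
    return tuple(a - b for a, b in zip(p, q))

def cross(u, v):
    """Vector product u x v of two 3-vectors."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def plane_through(p1, p2, p3):
    """Return (x1, a, a*) for the plane through three non-collinear points."""
    a, a_s = sub(p2, p1), sub(p3, p1)
    if cross(a, a_s) == (0, 0, 0):
        raise ValueError("the three points lie on one line")
    return p1, a, a_s

def point_of_plane(x1, a, a_s, t, tau):
    """Evaluate x = x1 + t*a + tau*a* for given parameters."""
    return tuple(x + t*u + tau*v for x, u, v in zip(x1, a, a_s))

x1, a, a_s = plane_through((1, 0, 1), (2, -1, 1), (2, 0, 0))
print(a, a_s)
print(point_of_plane(x1, a, a_s, 1, 0), point_of_plane(x1, a, a_s, 0, 1))
```

With the parameter pairs (1, 0) and (0, 1) the representation returns the second and third of the given points, which is a quick consistency check.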




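If $\mathbf{x}_1 = \overrightarrow{OP_1}$, for example ($O$ being the origin), then the parametric representation of the plane is
$\mathbf{x} = (\mathbf{i} + \mathbf{k}) + u(\mathbf{i} - \mathbf{j}) + v(\mathbf{i} - \mathbf{k})$.

The parameters are denoted by $u$ and $v$. To obtain a representation with normalized direction
vectors one introduces the parameters $t = u|\mathbf{i} - \mathbf{j}| = u\sqrt{2}$ and $\tau = v|\mathbf{i} - \mathbf{k}| = v\sqrt{2}$. In terms of
$t$ and $\tau$,

$\mathbf{x} = (\mathbf{i} + \mathbf{k}) + (t/\sqrt{2})(\mathbf{i} - \mathbf{j}) + (\tau/\sqrt{2})(\mathbf{i} - \mathbf{k})$.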

General equation. The given vector equation, as parametric representation of the plane, represents
the system of three linear equations written out in detail below, where $\mathbf{a} = \lambda\mathbf{e}$, $\mathbf{a}^* = \lambda^*\mathbf{e}^*$:

$x = x_1 + \lambda\cos\alpha \cdot t + \lambda^*\cos\alpha^* \cdot \tau$,
$y = y_1 + \lambda\cos\beta \cdot t + \lambda^*\cos\beta^* \cdot \tau$,
$z = z_1 + \lambda\cos\gamma \cdot t + \lambda^*\cos\gamma^* \cdot \tau$.

If one multiplies the first by $A = \cos\beta\cos\gamma^* - \cos\beta^*\cos\gamma$,
the second by $B = \cos\gamma\cos\alpha^* - \cos\gamma^*\cos\alpha$,
the third by $C = \cos\alpha\cos\beta^* - \cos\alpha^*\cos\beta$
and adds all three equations, one obtains
$Ax + By + Cz = Ax_1 + By_1 + Cz_1$ or $A(x - x_1) + B(y - y_1) + C(z - z_1) = 0$ as the equation
of a plane through $P_1$. If in the first form the terms on the right-hand side are written as a
constant $-D$, then the general equation of a plane is obtained.

General equation of a plane: $Ax + By + Cz + D = 0$


All points of space whose coordinates (x, y, z) satisfy an equation of this form, where A, B, C 
are not all zero, lie in a plane, and to each plane there is an equation of this kind which is satisfied 
by all points of the plane. More precisely, to any plane there are infinitely many such equations, 
since such an equation can be multiplied by any non-zero number without it representing a dif- 
ferent plane. Hence the actual values of A, B, C and D have no geometrical significance, but only 
their ratios. 

The intercept form of the equation of a plane is obtained from the general equation by bringing $D$
to the right-hand side, dividing both sides by $-D$ and putting $a = -D/A$, $b = -D/B$, $c = -D/C$,
which gives $x/a + y/b + z/c = 1$.
This assumes that the numbers A, B, C, D are all non-zero; if not, an intercept equation is found 
by carrying out the corresponding operations as far as one can (see the following example). From 
the intercept equation it can be seen that the plane cuts off a segment a from the x-axis, b from the 
y-axis and c from the z-axis (Fig.). A plane passing through the origin has no intercept equation. 


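Example: If $\mathbf{x} = (3\mathbf{i} + \mathbf{j} - 2\mathbf{k}) + (t/\sqrt{3})(\mathbf{i} - \mathbf{j} + \mathbf{k}) + (\tau/\sqrt{3})(\mathbf{i} + \mathbf{j} - \mathbf{k})$
is the parametric representation of a plane, then the direction vectors $\mathbf{a}$
and $\mathbf{a}^*$ have the direction cosines

$\cos\alpha = 1/\sqrt{3}$, $\cos\alpha^* = 1/\sqrt{3}$,
$\cos\beta = -1/\sqrt{3}$, $\cos\beta^* = 1/\sqrt{3}$,
$\cos\gamma = 1/\sqrt{3}$, $\cos\gamma^* = -1/\sqrt{3}$.

Hence
$A = (-1/\sqrt{3})(-1/\sqrt{3}) - (1/\sqrt{3})(1/\sqrt{3}) = 0$,
$B = (1/\sqrt{3})(1/\sqrt{3}) - (-1/\sqrt{3})(1/\sqrt{3}) = 2/3$,
$C = (1/\sqrt{3})(1/\sqrt{3}) - (1/\sqrt{3})(-1/\sqrt{3}) = 2/3$.

The equation of the plane is therefore
$0 \cdot (x - 3) + (2/3)(y - 1) + (2/3)(z + 2) = 0$
or $0 \cdot x + (2/3)y + (2/3)z + 2/3 = 0$ (general equation).
The intercept equation is therefore
$0 \cdot x - y - z = 1$ or $y/(-1) + z/(-1) = 1$.
The plane cuts the x-axis 'at infinity', that is, it is parallel to the x-axis.
The plane cuts the y-axis at $y = -1$ and the z-axis at $z = -1$.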
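The numbers in this example can be checked with a short sketch. It is an illustration under stated assumptions: the function names are hypothetical, and instead of direction cosines it uses the unnormalized direction vectors (1, -1, 1) and (1, 1, -1), whose vector product has components proportional to A, B, C. Since only the ratios of A, B, C, D matter, the resulting equation describes the same plane.

```python
from fractions import Fraction

def cross(u, v):
    """Vector product u x v of two 3-vectors."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def general_equation(x1, a, a_s):
    """Coefficients (A, B, C, D) of Ax + By + Cz + D = 0 for x = x1 + t*a + tau*a*."""
    A, B, C = cross(a, a_s)
    D = -(A*x1[0] + B*x1[1] + C*x1[2])
    return A, B, C, D

def intercepts(A, B, C, D):
    """Axis intercepts (a, b, c); None marks an axis parallel to the plane."""
    if D == 0:
        raise ValueError("a plane through the origin has no intercept equation")
    return tuple(None if k == 0 else Fraction(-D, k) for k in (A, B, C))

coeffs = general_equation((3, 1, -2), (1, -1, 1), (1, 1, -1))
print(coeffs)               # (0, 2, 2, 2): three times the values 0, 2/3, 2/3, 2/3 of the text
print(intercepts(*coeffs))
```

The `None` entry for the x-axis corresponds to the statement that the plane cuts the x-axis "at infinity".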


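Hessian normal form. If one divides the general equation by $\sqrt{A^2 + B^2 + C^2}$ and puts
$A/\sqrt{A^2 + B^2 + C^2} = n_1$, $B/\sqrt{A^2 + B^2 + C^2} = n_2$, $C/\sqrt{A^2 + B^2 + C^2} = n_3$ and
$D/\sqrt{A^2 + B^2 + C^2} = p$, one obtains the Hessian normal form of the equation of a plane,
$n_1 x + n_2 y + n_3 z + p = 0$.

By introducing the vectors $\mathbf{x} = x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ and $\mathbf{n} = n_1\mathbf{i} + n_2\mathbf{j} + n_3\mathbf{k}$ the
Hessian normal form can be represented in a very simple way: $\mathbf{n} \cdot \mathbf{x} = -p$.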




The vector $\mathbf{n}$ is perpendicular to the plane and is called the normal vector of the plane. It is a
unit vector. Starting from the general equation of the plane, the orientation of $\mathbf{n}$ is determined by
the sign of $\sqrt{A^2 + B^2 + C^2}$. It is usual to take the positive square root. Then the side of the plane
that lies in the direction of $\mathbf{n}$ is defined as the positive side and the other as the negative side, and
one speaks of the positive and negative half-spaces. The plane is oriented so that on the positive
side the anticlockwise sense is taken as positive. Just like the oriented distance between two points
on an oriented line, the oriented distance of a point from an oriented plane is introduced and applied
in what follows. The distance of the origin from the plane given by $\mathbf{n} \cdot \mathbf{x} = -p$ is $p$. If $p > 0$, then
the origin lies in the positive half-space, and if $p < 0$ it lies in the negative half-space. The figure
gives an illustration of the Hessian normal form. The yellow surface represents an arbitrary plane $E$
in space, and the red surface represents the plane through the origin $O$ parallel to it. Let $P$ be an
arbitrary point of $E$, $p$ the distance of $E$ from $O$ and $\mathbf{n}$ the normal vector of $E$. If $\varphi$ denotes the angle
enclosed by $\overrightarrow{OP} = \mathbf{x}$ and $\mathbf{n}$, then from the definition of the inner product $\mathbf{n} \cdot \mathbf{x} = |\mathbf{n}|\,|\mathbf{x}| \cos\varphi$ and
$|\mathbf{n}| = 1$ it follows that $\mathbf{n} \cdot \mathbf{x} = |\mathbf{x}| \cos\varphi = -|\mathbf{x}| \cos(180^\circ - \varphi)$.

From the right-angled triangle $OPP'$ it follows that $|\mathbf{x}| \cos(180^\circ - \varphi) = p$, that is, $\mathbf{n} \cdot \mathbf{x} = -p$.


Example: To find the Hessian normal form of the plane given in the previous example.

From $A = 0$, $B = 2/3$, $C = 2/3$ it follows that $\sqrt{A^2 + B^2 + C^2} = \sqrt{8/9} = (2/3)\sqrt{2}$. If the
general equation of the plane is divided by $(2/3)\sqrt{2}$, the Hessian normal form is obtained. It is
$0 \cdot x + (\sqrt{2}/2)y + (\sqrt{2}/2)z + \sqrt{2}/2 = 0$ or, in vector form, $(\sqrt{2}/2)(\mathbf{j} + \mathbf{k}) \cdot \mathbf{x} = -\sqrt{2}/2$.


24.2-8 Illustration of the Hessian normal form of a plane in space
24.2-9 Distance of a point from a plane


Basic geometrical problems. Some basic problems can be solved particularly elegantly by using 
the vector notation. 

Distance of a point from a plane. If $\mathbf{n} \cdot \mathbf{x} = -p$ is the Hessian normal form of a plane and
$P_0(x_0, y_0, z_0)$ is an arbitrary point of space, then $d = \mathbf{n} \cdot \mathbf{x}_0 + p$ is the distance of $P_0$ from the
plane. This is easily seen from the figure. The yellow surface again represents the given plane $E$
and the red surface represents the plane through $O$ parallel to $E$. The grey plane is the plane through
$P_0$ parallel to $E$. Just as $\mathbf{n} \cdot \mathbf{x} = -p$ is obtained for an arbitrary point $P$ of the plane by means of
the triangle $OPP'$, so if $P_0$ is an arbitrary point of space, by means of the triangle $OP_0P_0'$ one
obtains $\mathbf{n} \cdot \mathbf{x}_0 = -(p - d)$ and hence

Distance of a point from a plane: $d = \mathbf{n} \cdot \mathbf{x}_0 + p$


Example: To find the distance of the point $P_0(3, -1, 2)$ from the plane given in the previous
example. The distance formula gives
$d = (\sqrt{2}/2)(\mathbf{j} + \mathbf{k}) \cdot (3\mathbf{i} - \mathbf{j} + 2\mathbf{k}) + \sqrt{2}/2 = (\sqrt{2}/2)(0 \times 3 - 1 \times 1 + 1 \times 2) + \sqrt{2}/2$
$= \sqrt{2} \approx 1.414$.
The required distance is approximately 1.414 units.
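The normalization and the distance formula can be tried out numerically. The sketch below is illustrative only; the function names are hypothetical, floating-point arithmetic is assumed, and the positive square root is taken as in the text.

```python
import math

def hessian_normal_form(A, B, C, D):
    """Return (n, p) with unit normal n such that n . x + p = 0."""
    r = math.sqrt(A*A + B*B + C*C)      # positive square root
    if r == 0:
        raise ValueError("A, B, C must not all be zero")
    return (A/r, B/r, C/r), D/r

def oriented_distance(n, p, x0):
    """Oriented distance d = n . x0 + p of the point x0 from the plane."""
    return sum(ni*xi for ni, xi in zip(n, x0)) + p

# Plane of the example: 0*x + (2/3)y + (2/3)z + 2/3 = 0, point P0(3, -1, 2).
n, p = hessian_normal_form(0, 2/3, 2/3, 2/3)
d = oriented_distance(n, p, (3, -1, 2))
print(n, p, d)      # d is approximately 1.414, as in the text
```

A positive result means that the point lies in the positive half-space, a negative one that it lies in the negative half-space.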


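The angle between two planes. If $\mathbf{n} \cdot \mathbf{x} = -p$ and $\mathbf{n}^* \cdot \mathbf{x} = -p^*$ are the Hessian normal forms of
two planes, then the angle $\varphi$ between them is equal to the angle between their normal vectors $\mathbf{n}$
and $\mathbf{n}^*$. Hence $\cos\varphi = \mathbf{n} \cdot \mathbf{n}^*$. In particular, the two planes are perpendicular
to one another if and only if $\mathbf{n} \cdot \mathbf{n}^* = 0$.

Angle between two planes: $\cos\varphi = \mathbf{n} \cdot \mathbf{n}^*$


Example: Are the planes $5x + 3y - z = 10$ and $2x - y + 7z = 5$ perpendicular to one another?
The normal vectors of the planes are $\mathbf{n} = (1/\sqrt{35})(5\mathbf{i} + 3\mathbf{j} - \mathbf{k})$ and $\mathbf{n}^* = (1/\sqrt{54})(2\mathbf{i} - \mathbf{j} + 7\mathbf{k})$.
Hence $\cos\varphi = (1/\sqrt{35})(1/\sqrt{54})(5 \times 2 - 3 \times 1 - 1 \times 7) = 0$, that is, $\varphi = 90^\circ$. The two given
planes are perpendicular to one another.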
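The same check can be written as a few lines of code. This is a sketch with a hypothetical function name; it accepts the coefficient triples (A, B, C) of the general equations and normalizes them itself, so the normal vectors need not be unit vectors beforehand.

```python
import math

def cos_angle(n, n_s):
    """Cosine of the angle between two planes with normal directions n and n*."""
    num = sum(p*q for p, q in zip(n, n_s))
    length = math.sqrt(sum(p*p for p in n)) * math.sqrt(sum(q*q for q in n_s))
    return num / length

# Planes 5x + 3y - z = 10 and 2x - y + 7z = 5 from the example.
print(cos_angle((5, 3, -1), (2, -1, 7)))   # 0.0, so the planes are perpendicular
```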


Intersection of two planes. Two planes always intersect in a line as long as they are not parallel.
Hence $|\mathbf{n} \cdot \mathbf{n}^*| < 1$ or $\mathbf{n} \times \mathbf{n}^* \neq \mathbf{o}$ is a necessary and sufficient condition for two planes to intersect.




Two non-parallel planes always have a line in common. This is called the line of intersection. Since
it is perpendicular to $\mathbf{n}$ and $\mathbf{n}^*$, its direction vector can be expressed as $\mathbf{a} = \mathbf{n} \times \mathbf{n}^*$. If any point
that satisfies both the equations $\mathbf{n} \cdot \mathbf{x} = -p$ and $\mathbf{n}^* \cdot \mathbf{x} = -p^*$ is determined, then the parametric
representation of the line is obtained from this and $\mathbf{a}$. In detail, a point is to be found whose
coordinates satisfy the system of equations

$n_1 x + n_2 y + n_3 z = -p$,
$n_1^* x + n_2^* y + n_3^* z = -p^*$,

where $n_1, n_2, n_3$ are the components of $\mathbf{n}$ and $n_1^*, n_2^*, n_3^*$ are the components of
$\mathbf{n}^*$. This point together with the direction vector $\mathbf{a}$ gives the point-direction
equation of the line of intersection. For example, if $n_1 n_2^* - n_1^* n_2 \neq 0$, then

$x = (p^* n_2 - p\,n_2^*)/(n_1 n_2^* - n_1^* n_2)$, $y = (p\,n_1^* - p^* n_1)/(n_1 n_2^* - n_1^* n_2)$, $z = 0$
is a solution of the above system of equations (Fig.).
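A minimal sketch of this computation follows. The function name is hypothetical, exact integer or `Fraction` coefficients are assumed, and only the case named in the text (the determinant built from the first two components is non-zero, so that z = 0 can be chosen) is implemented. The planes are taken in the form n . x + p = 0; the normal vectors need not be unit vectors, because the linear system is the same.

```python
from fractions import Fraction

def cross(u, v):
    """Vector product u x v of two 3-vectors."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def line_of_intersection(n, p, n_s, p_s):
    """Point and direction of the common line of n.x + p = 0 and n*.x + p* = 0."""
    a = cross(n, n_s)
    if a == (0, 0, 0):
        return None                              # parallel planes
    det = n[0]*n_s[1] - n_s[0]*n[1]              # n1*n2' - n1'*n2
    if det == 0:
        raise ValueError("z = 0 cannot be chosen; set another coordinate to zero")
    x = Fraction(p_s*n[1] - p*n_s[1], det)
    y = Fraction(p*n_s[0] - p_s*n[0], det)
    return (x, y, Fraction(0)), a

# Planes x + y + z - 1 = 0 and x - y = 0.
point, direction = line_of_intersection((1, 1, 1), -1, (1, -1, 0), 0)
print(point, direction)
```

For the two sample planes the point found is (1/2, 1/2, 0) and the direction is (1, 1, -2), which is perpendicular to both normal vectors.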


24.2-10 Two planes intersect in a line 24.2-11 Pencil of planes 


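A family of planes that all contain one line is called a pencil of planes (Fig.). If $\mathbf{n} \cdot \mathbf{x} + p = 0$
and $\mathbf{n}^* \cdot \mathbf{x} + p^* = 0$ are two planes of the pencil, its equation is
$(\mathbf{n} \cdot \mathbf{x} + p) + \lambda(\mathbf{n}^* \cdot \mathbf{x} + p^*) = 0$. It is essential that
$\mathbf{n} \times \mathbf{n}^* \neq \mathbf{o}$.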


To prove that the given equation represents a pencil of planes, one brings it to the form
$(\mathbf{n} + \lambda\mathbf{n}^*) \cdot \mathbf{x} + (p + \lambda p^*) = 0$. Firstly, it is obvious that the equation represents a plane for each
value of $\lambda$, so there are infinitely many planes. It must now be shown that all these planes have a
common line. One first considers the planes given by values $\lambda_1$ and $\lambda_2$ of $\lambda$ ($\lambda_1 \neq \lambda_2$). Since $\mathbf{n} \times \mathbf{n}^* \neq \mathbf{o}$,
these planes are not parallel, for $(\mathbf{n} + \lambda_1\mathbf{n}^*) \times (\mathbf{n} + \lambda_2\mathbf{n}^*) = (\mathbf{n} \times \mathbf{n}^*)(\lambda_2 - \lambda_1) \neq \mathbf{o}$. Hence they
have a line of intersection with the direction vector $\mathbf{a} = \mathbf{n} \times \mathbf{n}^*$. To set up a point-direction equation
of the line of intersection a point must be determined that satisfies both the equations
$(\mathbf{n} + \lambda_1\mathbf{n}^*) \cdot \mathbf{x} + (p + \lambda_1 p^*) = 0$ and $(\mathbf{n} + \lambda_2\mathbf{n}^*) \cdot \mathbf{x} + (p + \lambda_2 p^*) = 0$. Since $\lambda_1 \neq \lambda_2$, this is
possible if and only if the system of equations $\mathbf{n} \cdot \mathbf{x} + p = 0$, $\mathbf{n}^* \cdot \mathbf{x} + p^* = 0$ is satisfied.
It is clear that $\mathbf{a}$ and $\mathbf{x}$ can be determined independently of $\lambda_1$ and $\lambda_2$. This means that the
line of intersection of the planes given by $\lambda_1$ and $\lambda_2$ is common to all the planes, which are therefore
expressible by the equation given at the beginning.

It should be noted that the equation of a pencil of planes, expressed in the given form, does not
contain one of the planes through the line, namely the plane $\mathbf{n}^* \cdot \mathbf{x} + p^* = 0$. This can be avoided
by introducing a homogeneous parameter. If $\lambda = \kappa^*/\kappa$, then the equation can be written
$\kappa(\mathbf{n} \cdot \mathbf{x} + p) + \kappa^*(\mathbf{n}^* \cdot \mathbf{x} + p^*) = 0$; the plane $\mathbf{n}^* \cdot \mathbf{x} + p^* = 0$ is given by $\kappa = 0$, $\kappa^* = 1$.




Intersection of three planes. If three planes are given, then their
equations $\mathbf{n} \cdot \mathbf{x} + p = 0$, $\mathbf{n}^* \cdot \mathbf{x} + p^* = 0$ and $\mathbf{n}^{**} \cdot \mathbf{x} + p^{**} = 0$ form
a system of three linear equations for the three components $x$, $y$, $z$ of
$\mathbf{x}$. If this system is uniquely soluble, then the three planes have exactly
one point in common. The condition $\mathbf{n} \cdot (\mathbf{n}^* \times \mathbf{n}^{**}) \neq 0$ is necessary and
sufficient for this.

Otherwise the planes either have no common point, or they have a common line, or they coincide.
The first is the case when either two of the planes are parallel or the three planes taken in pairs
have distinct parallel lines of intersection. The second is the case when the three planes belong to
a pencil. All the planes that have one point in common form a bundle of planes.
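The unique common point of three planes can be computed by Cramer's rule written with vector products. The sketch below is an illustration with hypothetical function names; it assumes exact integer or `Fraction` coefficients and planes given as pairs (n, p) with n . x + p = 0, and it returns `None` whenever the triple product of the normal vectors vanishes, without distinguishing the remaining cases.

```python
from fractions import Fraction

def cross(u, v):
    """Vector product u x v of two 3-vectors."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

def dot(u, v):
    """Inner product u . v."""
    return sum(p*q for p, q in zip(u, v))

def common_point(plane1, plane2, plane3):
    """Unique common point of three planes (n, p) with n . x + p = 0, or None."""
    (n, p), (ns, ps), (nss, pss) = plane1, plane2, plane3
    det = dot(n, cross(ns, nss))                 # triple product n . (n* x n**)
    if det == 0:
        return None                              # no unique common point
    c1, c2, c3 = cross(ns, nss), cross(nss, n), cross(n, ns)
    # x = -(p (n* x n**) + p* (n** x n) + p** (n x n*)) / det
    return tuple(Fraction(-(p*u + ps*v + pss*w), det)
                 for u, v, w in zip(c1, c2, c3))

# Planes x = 1, y = 2, z = 3 written as n . x + p = 0.
print(common_point(((1, 0, 0), -1), ((0, 1, 0), -2), ((0, 0, 1), -3)))
```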


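By introducing homogeneous parameters ($\lambda_1 = \kappa^*/\kappa$, $\lambda_2 = \kappa^{**}/\kappa$) one obtains
$\kappa(\mathbf{n} \cdot \mathbf{x} + p) + \kappa^*(\mathbf{n}^* \cdot \mathbf{x} + p^*) + \kappa^{**}(\mathbf{n}^{**} \cdot \mathbf{x} + p^{**}) = 0$.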


24.3. Quadrics 


Principal axes 


The set of all points whose rectangular coordinates satisfy an equation of the form $F(x, y, z) = 0$
is, under certain conditions, called a surface. The condition might be, for example, that the function
$F(x, y, z)$ should be continuous in all the variables. According to the conditions that are imposed,
different concepts of a surface are obtained.

If $F(x, y, z)$ is a linear function of the three variables $x$, $y$, $z$, that is, of the form $Ax + By + Cz + D$,
where the coefficients $A$, $B$, $C$ are not all zero, then the equation $F(x, y, z) = 0$ represents a plane.

Henceforth, let $F(x, y, z)$ be a quadratic function. Then $F(x, y, z) = 0$ is an algebraic equation
of the second degree, that is, an equation of the form
$a_{11}x^2 + 2a_{12}xy + 2a_{13}xz + a_{22}y^2 + 2a_{23}yz + a_{33}z^2 + 2a_{14}x + 2a_{24}y + 2a_{34}z + a_{44} = 0$.

A surface representable by an equation of this form (where the first six coefficients must not all be 
zero) is called a quadric or a surface of the second order. 

Under a linear transformation of coordinates (translation, rotation or a combination of the two)
an algebraic equation in the rectangular coordinates $x$, $y$, $z$ with coefficients $a_{11}$ up to $a_{44}$ goes
into an algebraic equation of the second degree in the rectangular coordinates $x^*$, $y^*$, $z^*$ with
coefficients $a_{11}^*$ up to $a_{44}^*$. Of fundamental importance is the fact that in every case a rotation can be
found so that $a_{12}^* = a_{13}^* = a_{23}^* = 0$. This transformation is called a transformation to principal
axes.

Example: Under the rotation
$x = (\sqrt{2}/2)x^* + (\sqrt{2}/2)y^*$, $y = -(\sqrt{2}/2)x^* + (\sqrt{2}/2)y^*$, $z = z^*$
the equation $x^2 + y^2 + z^2 + xy - 1 = 0$ goes into
$x^{*2}/a^2 + y^{*2}/b^2 + z^{*2}/c^2 - 1 = 0$ with $a = \sqrt{2}$, $b = \sqrt{2/3}$, $c = 1$.
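The substitution behind this example can be written out term by term. The following worked step is added for illustration and uses only the rotation and the equation stated in the example.

```latex
\begin{aligned}
x^2 + y^2 &= \tfrac12\,(x^* + y^*)^2 + \tfrac12\,(-x^* + y^*)^2 = x^{*2} + y^{*2}, \\
xy &= \tfrac12\,(x^* + y^*)(-x^* + y^*) = \tfrac12\,\bigl(y^{*2} - x^{*2}\bigr), \\
x^2 + y^2 + z^2 + xy - 1 &= \tfrac12\,x^{*2} + \tfrac32\,y^{*2} + z^{*2} - 1 .
\end{aligned}
```

Comparing coefficients gives $1/a^2 = 1/2$, $1/b^2 = 3/2$ and $1/c^2 = 1$, in agreement with the values of $a$, $b$ and $c$ stated in the example. The mixed term $xy$ has disappeared, which is exactly what the transformation to principal axes is meant to achieve.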
Thanks to the transformation to principal axes, the discussion of geometric figures representable
by algebraic equations of the second degree can be reduced to the discussion of equations of the
following form.

$a_{11}x^2 + a_{22}y^2 + a_{33}z^2 + 2a_{14}x + 2a_{24}y + 2a_{34}z + a_{44} = 0$


However, such an equation in which the first three coefficients are not all zero, can in general be 
simplified still further by a transformation of coordinates (in fact, a translation). The kind of trans- 
lation needed in a particular case, and the form of simplified equation it leads to, depend on the 
nature of the coefficients. By considering all possible cases one arrives at the result that an arbitrary 
equation of the second degree can be reduced to one of 17 different special equations, each of which 
consists of at most 4 terms. 

Three of these equations have the form $x^2/a^2 + y^2/b^2 + z^2/c^2 + 1 = 0$, $x^2/a^2 + y^2/b^2 + 1 = 0$,
$x^2/a^2 + 1 = 0$, where $a$, $b$ and $c$ are non-zero. These have no real solution and therefore do not
represent a geometrical figure. The other 14 equations represent 14 different geometrical figures.
The following nine are degenerate or improper quadrics: 


1. $x^2/a^2 + y^2/b^2 + z^2/c^2 = 0$, the point (0, 0, 0).
2. $x^2/a^2 + y^2/b^2 = 0$, a line, the z-axis.
3. $x^2/a^2 = 0$, a plane, the y, z-plane.
4. $x^2/a^2 = 1$, the two planes parallel to the y, z-plane at distances $x = \pm a$.
5. $x^2/a^2 - y^2/b^2 = 0$, the two planes that cut the x, y-plane at right angles in the lines $y = \pm(b/a)x$.
6. $x^2/a^2 - y^2/b^2 = 1$, the surface of a cylinder that is cut by planes perpendicular to the z-axis
in hyperbolas; these are parallel and congruent to the hyperbola $x^2/a^2 - y^2/b^2 = 1$ in the
x, y-plane.
7. $x^2/a^2 + y^2/b^2 = 1$, the surface of a cylinder that is cut by planes perpendicular to the z-axis in
ellipses; these are parallel and congruent to the ellipse $x^2/a^2 + y^2/b^2 = 1$ in the x, y-plane; if
$a = b$ the ellipses are circles.
8. $x^2 - 2py = 0$, the surface of a cylinder that is cut by planes perpendicular to the z-axis in parabolas;
these are parallel and congruent to the parabola $x^2 - 2py = 0$ in the x, y-plane.
9. $x^2/a^2 + y^2/b^2 - z^2/c^2 = 0$, the surface of a (double) cone that is cut by planes perpendicular
to the z-axis in ellipses (or circles, if $b^2 = a^2$).


These figures are either not surfaces in the usual sense (1. and 2.), or they reduce to one plane 
(or two planes) and are really surfaces of the first order (3. to 5.), or they can be developed into a 
plane (6. to 9.). 


There remain finally just five geometric figures that are called proper quadrics. 


Proper quadrics 


Classification. After carrying out a transformation to principal axes the axes of the coordinate 
system are in the directions of the principal axes of the surface. The characterization of the surface 
depends on one distinguished principal axis (either such an axis exists, or it is immaterial which 
principal axis is taken) and a section of the surface by a plane perpendicular to it. 

According to whether the first section is an ellipse, parabola or hyperbola, the surface is called 
an ellipsoid, paraboloid or hyperboloid. 

The form of the second section, if it is necessary to make a distinction, determines the adjective 
elliptic or hyperbolic. The adjective parabolic is not used, since there is no proper quadric that has 
a parabolic transverse section. 

No distinction is made between elliptic and circular sections. An ellipsoid can therefore be a 
sphere, and an elliptic paraboloid can have circular transverse sections. For hyperboloids another 
distinction is essential. One distinguishes between hyperboloids of one sheet and two sheets. 

Altogether there are five proper quadrics, which are called ellipsoid, elliptic paraboloid, hyper- 
bolic paraboloid, hyperboloid of one sheet and hyperboloid of two sheets. 


Ellipsoid. In rectangular coordinates the
simplest form of the equation of an ellipsoid
is as follows.

Ellipsoid: $x^2/a^2 + y^2/b^2 + z^2/c^2 = 1$


Here a, b, c are the lengths of half the prin- 
cipal axes of the ellipsoid (Fig.). If a=b = c, 
the ellipsoid is a sphere. If two of the numbers 
a, b, c are equal, it is an ellipsoid of revolution 
or two-axes ellipsoid and it is stretched (pro- 
late) if the two equal axes are shorter than 
the third and flattened (oblate) if they are 
longer. If the numbers a, b, c are all different, 
one speaks of a three-axes ellipsoid. 

The geometric figure represented by the above equation is a connected finite surface.
It lies symmetrically about the three coordinate planes. Every plane section of the surface is an
ellipse. Every segment through the origin joining two points of the surface (diameter) is bisected at
the origin. Because of this property the origin is called the centre of the ellipsoid and the ellipsoid
is a central surface.

24.3-1 Three-axes ellipsoid

Any ellipsoid can be transformed into an ellipsoid of rotation by expanding or contracting the 
coordinates in a constant ratio in the direction of one coordinate axis (affine distortion), and conver- 
sely, any ellipsoid can be formed from an ellipsoid of rotation. If the coordinates in the directions 
of two coordinate axes are suitably altered in a constant ratio, a sphere is formed. 


Elliptic paraboloid. One of the three principal axes of the elliptic paraboloid is distinguished.
In a rectangular coordinate system whose z-axis
is in the direction of the distinguished principal
axis, the simplest form of the equation is as
follows.

Elliptic paraboloid: $x^2/a^2 + y^2/b^2 = 2z$

Here $a$ and $b$ are half the lengths of the principal axes of the ellipse cut by the plane parallel to
the x, y-plane at a distance $z = 1/2$ (Fig.). Another way of putting it is to say that $a^2$ is the
semi-parameter of the parabola cut on the elliptic paraboloid by the x, z-plane and similarly $b^2$ for the
y, z-plane. If $a = b$, the elliptic paraboloid is a paraboloid of revolution.

The geometric figure represented by the above equation is a connected infinite surface in the 
positive half-space determined by the x, y-plane. It is symmetrical about the x, z- and y, z-planes. 
The section of the surface by every plane parallel to the z-axis is a parabola, and the section of the
surface by every plane perpendicular to the z-axis (and meeting it on the positive side) is an ellipse.
The z-axis is called the axis for short, and the origin is called the vertex. The surface has no centre. 

By means of an affine distortion in the x- or y-direction the elliptic paraboloid can be transformed 
into a paraboloid of revolution, and conversely, any elliptic paraboloid can be formed from a parabo- 
loid of rotation. 


24.3-2 Elliptic paraboloid 24.3-3 Hyperbolic paraboloid 


Hyperbolic paraboloid. One of the principal axes of the hyperbolic paraboloid is distinguished.
In a rectangular coordinate system whose z-axis is in the direction of the distinguished
principal axis, the simplest form of the equation is as follows.

Hyperbolic paraboloid: $x^2/a^2 - y^2/b^2 = 2z$

Here $a$ and $b$ are half the lengths of the principal axes of the hyperbola cut by the plane parallel
to the x, y-plane at a distance $z = 1/2$ (Fig.). Another way of putting it is to say that $a^2$ is the
semi-parameter of the parabola cut by the x, z-plane and similarly $-b^2$ for the y, z-plane.

The geometric figure represented by the above equation is a connected infinite surface that lies 
in each octant. It is symmetrical about the x, z- and y, z-planes. The section of the surface by every 
plane parallel to the z-axis is a parabola, and the section of the surface by every plane perpendicular 
to the z-axis that does not pass through the origin is a hyperbola; the vertices of each hyperbola 
lie on a line parallel to the x-axis if the plane of the hyperbola meets the z-axis on the positive side 
and on a line parallel to the y-axis if the plane of the hyperbola meets the z-axis on the negative 
side. The x, y-plane cuts the surface in a pair of lines x/a + y/b =0 and x/a — y/b = 0. The 
planes through each of these lines and the z-axis cut on planes perpendicular to the z-axis the asymp- 
totes of the corresponding hyperbola. The z-axis is called the axis for short, and the origin is called 
the vertex of the hyperbolic paraboloid. It is a saddle point. The surface has no centre. 

The hyperbolic paraboloid is the only proper quadric that can never be a surface of rotation 
and therefore cannot be transformed into such a surface by an affine distortion. This is due essentially 
to the fact that no plane cuts it in an ellipse. 

There are other interesting possible ways of generating a hyperbolic paraboloid; one of these 
consists of translating a parabola that is convex downwards along a parabola that is convex upwards. 
For this reason, the hyperbolic paraboloid counts as a translation surface. 

Finally, a hyperbolic paraboloid can be generated by families of lines. If one writes the above
equation in the form $(x/a + y/b)(x/a - y/b) = 2z$ and puts $z/(x/a - y/b) = u$ and $2/(x/a - y/b) = v$,
then the following two pairs of equations can be derived from the equation of the hyperbolic
paraboloid:

1. $x/a + y/b = 2u$, $x/a + y/b = vz$,
2. $x/a - y/b = z/u$, $x/a - y/b = 2/v$.
Each of these equations represents a family of planes, and each pair of equations determines a 
family of lines. These families of lines lie on the hyperbolic paraboloid. The lines are the generators 
of the hyperbolic paraboloid (Fig. 24.3-4). 
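That such lines really lie on the surface can be checked with exact arithmetic. The sketch below is an illustration, not part of the original text: the function names are hypothetical, and it takes the two equations that contain the parameter u, namely x/a + y/b = 2u and x/a - y/b = z/u with u non-zero, as the description of one generator. Points of this line are parametrized by their height z.

```python
from fractions import Fraction

def generator_point(a, b, u, z):
    """Point at height z on the line x/a + y/b = 2u, x/a - y/b = z/u (u != 0)."""
    u, z = Fraction(u), Fraction(z)
    s = 2*u          # value of x/a + y/b on this line
    d = z/u          # value of x/a - y/b at height z
    return (a*(s + d)/2, b*(s - d)/2, z)

def on_paraboloid(a, b, point):
    """True if the point satisfies x^2/a^2 - y^2/b^2 = 2z exactly."""
    x, y, z = point
    return (x/a)**2 - (y/b)**2 == 2*z

# Sample values a = 2, b = 3, u = 1/2 are arbitrary choices for the check.
points = [generator_point(2, 3, Fraction(1, 2), z) for z in range(-3, 4)]
print(all(on_paraboloid(2, 3, pt) for pt in points))
```

The product of the two line equations gives (x/a + y/b)(x/a - y/b) = 2z for every z, which is why every sampled point passes the test.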

Every surface generated by a family of lines is called a ruled surface. Among surfaces of the second 
order the elliptic, parabolic and hyperbolic cylinders, the (double) cone, the hyperbolic paraboloid 
and the hyperboloid of one sheet are ruled surfaces. 




Since the cylinders and the cone can be developed into a plane, they are called developable sur- 
faces. The example of the hyperbolic paraboloid shows that not every ruled surface is developable. 
The hyperboloid of one sheet is not developable either. 


24.3-4 Hyperbolic paraboloid with its two families of generators
24.3-5 Hyperboloid of one sheet


The hyperboloid of one sheet. One of the three principal axes of a hyperboloid of one sheet is
distinguished. In a rectangular coordinate system whose z-axis is in the direction of the distinguished
principal axis, the simplest form of the equation is as follows:

Hyperboloid of one sheet: $x^2/a^2 + y^2/b^2 - z^2/c^2 = 1$

Here $a$ and $b$ are half the lengths of the principal axes of the ellipse cut by the x, y-plane.
Similarly $b$ and $c$ are half the lengths of the principal axes of the hyperbola cut by the y, z-plane
(Fig.). If $a = b$, the hyperboloid of one sheet is a one-sheet hyperboloid of rotation.

The geometric figure represented by the above equation is a connected infinite surface that lies 
in each octant. It lies symmetrically about the three coordinate planes. The section of the surface 
by every plane parallel to the z-axis is a hyperbola, and the section by any plane perpendicular to 
the z-axis is an ellipse. The z-axis is called the axis for short. 
The origin is the centre, hence the hyperboloid of one sheet 
is a central surface. 

By an affine distortion in the x- or y-direction the hyperboloid 
of one sheet can be transformed into a hyperboloid of revolu- 
tion, and conversely, it can be generated from such a surface. 

The hyperboloid of one sheet can also be generated by families of lines. If one writes the above
equation in the form $(x/a + z/c)(x/a - z/c) = (1 + y/b)(1 - y/b)$ and puts
$(1 - y/b)/(x/a - z/c) = u$ and $(1 + y/b)/(x/a - z/c) = v$, then
two pairs of equations can be derived from the equation of
the hyperboloid of one sheet:

1. $x/a + z/c = u(1 + y/b)$, $x/a + z/c = v(1 - y/b)$,
2. $x/a - z/c = (1 - y/b)/u$, $x/a - z/c = (1 + y/b)/v$.


24.3-6 Hyperboloid of one sheet with its two families of generators and asymptotic cone
24.3-7 Model of the transmission of rotations by means of two hyperboloids of one sheet




Each of these equations represents a family of planes, and each pair of equations represents a family 
of lines. These are the generators of the hyperboloid of one sheet. The hyperboloid of one sheet 
is therefore a ruled surface (Fig.). Owing to this property, two hyperboloids of one sheet can be used 
in technology, like two cones, to transmit a rotation about one axis into a rotation about another 
arbitrarily directed axis (hyperbolic cog-wheels, Fig.). 

If the generators are translated so as to pass through the origin, they form the asymptotic cone of
the hyperboloid of one sheet. Its equation is $x^2/a^2 + y^2/b^2 - z^2/c^2 = 0$.


The hyperboloid of two sheets. One of the three principal axes of the hyperboloid of two sheets
is distinguished. In a rectangular coordinate system whose z-axis is in the direction of the distinguished
principal axis, the simplest form of the equation is as follows:

Hyperboloid of two sheets: $-x^2/a^2 - y^2/b^2 + z^2/c^2 = 1$

Here $a$ and $b$ are half the lengths of the principal axes of the
ellipse cut on the hyperboloid of two sheets by the planes parallel
to the x, y-plane at a distance $z = \pm c\sqrt{2}$. Also, $c$ and $a$ are half
the lengths of the principal axes of the hyperbola cut by the x,
z-plane, and similarly $c$ and $b$ for the y, z-plane (Fig.). If $a = b$
the hyperboloid of two sheets is a two-sheeted hyperboloid of
revolution. The geometric figure represented by the above equation is
an infinite disconnected surface, consisting of two parts, with
points in each octant. The two parts lie symmetrically about the
coordinate planes. The section by any plane parallel to the z-axis
is a hyperbola, and for $|z| > c$ the section by any plane
perpendicular to the z-axis is an ellipse. The z-axis is called the axis for
short, and the origin is the centre of the hyperboloid of two
sheets. It is therefore a central surface. By an affine distortion in
the x- or y-direction the hyperboloid of two sheets can be
transformed into a hyperboloid of revolution, and conversely, it can
be generated from such a surface. As for the hyperboloid of
one sheet, there exists an asymptotic cone.

24.3-8 Hyperboloid of two sheets


25. Projective geometry 


25.1. The basic elements of projective geometry
25.2. Projective coordinates
25.3. Cross-ratios
25.4. Projective mappings
25.5. Conics


Projective geometry investigates those properties of geometrical figures that are unaltered by
projection. The impetus for these investigations was provided by the study of perspective in painting
and architecture. Following the development of descriptive geometry, principally by Gaspard MONGE
(1746-1818), Victor PONCELET (1788-1867) gave a first outline of projective geometry in his 'Traité
des propriétés projectives des figures'. Analytical methods in projective geometry were introduced
mainly by August Ferdinand MÖBIUS (1790-1868) and Julius PLÜCKER (1801-1868), while Jacob
STEINER (1796-1863) and Christian von STAUDT (1798-1867) perfected a development of projective
geometry without these methods. The first beginnings of this synthetic approach are to be found in
the work of PAPPUS (250-300? A. D.), who introduced the cross-ratio, referring to a lost work
of APOLLONIUS of Perga (265-180 B. C.?). The connection between projective and Euclidean
geometry was clarified by Felix KLEIN (1849-1925). He also introduced the idea of a geometry as
the invariant theory of a certain group of mappings.


25.1. The basic elements of projective geometry 


Improper elements. In general, a parallel projection between two lines of a plane maps all the
points of one line one to one onto all the points of the other. In a central projection, or perspectivity,
the correspondence between the points of two lines $l_1$ and $l_2$ is determined by means of lines $p$
through one point, the centre $C$; for example, if $P_1$ is a point of $l_1$, then the line $p_1 = (CP_1)$ cuts
the line $l_2$ in $P_2$, the image of $P_1$ (Fig.). This mapping does not cover all the points of $l_1$ and $l_2$:
to the point $P$ for which $p_P = (CP) \parallel l_2$ there corresponds no point on $l_2$, and to the point $Q$ for
which $p_Q = (CQ) \parallel l_1$ there corresponds no point on $l_1$. In order to preserve the one-to-one mapping,
one adjoins to the proper points of the plane all the directions of lines of the plane as improper
points; the image of the point $P$ is then the direction of $l_2$, and the inverse image of $Q$ is the
direction of $l_1$.

25.1-1 Introduction of improper points

One proper and one improper point determine exactly one proper line through the proper point in the
direction of the improper point, for example, the line $p_P$ through $P$ and its image. By contrast,
two improper points determine the improper line, which consists of all the improper points. Two
parallel lines cut in an improper point, which corresponds to the common direction of the two lines,
while a proper line and the improper line cut in the improper point that corresponds to the
direction of the proper line.


In contrast to the geometry of the Euclidean plane, two lines in the projective plane always cut 
in one point, and through two points there passes exactly one line. 


The projective plane consists of all proper and improper points. Its linear subspaces are the proper 
lines and the improper line. Since proper and improper points can be mapped into each other by a 
central projection, it makes no sense to distinguish between them. In the same way, the distinction 
between proper lines and the improper line loses its meaning, and it makes no sense to talk of parallel 
lines. 

Projective space can be obtained from Euclidean space in the same way, by adjoining to the proper 
points all the improper points, that is, directions of the lines in space. The set of these improper 
points forms the improper plane. The improper lines are cut on this plane by the proper planes, and 
therefore span the improper plane. 


In projective space two lines are either skew or cut in exactly one point. Two planes cut in exactly 
one line. A line and a point not lying on it span exactly one plane. 


If one takes the lines as the basic elements of the projective plane, instead of the points, then the 
points, as vertices of pencils of lines, are the linear subspaces. 


25.2. Projective coordinates 


If $S$ is a point outside the projective plane $\Pi$, then there is a one-to-one correspondence in which
each line through $S$ corresponds to its point of intersection $P$ with the plane $\Pi$ (Fig.). If the line is
represented by a non-zero vector $\mathbf{x}$, then $\varrho\mathbf{x}$ ($\varrho \neq 0$) gives the same point $P$ of $\Pi$, which is thus
characterized by a 1-dimensional vector space. Referred to a basis $\mathbf{e}_0, \mathbf{e}_1, \mathbf{e}_2$ of the vector space, $\mathbf{x}$ has
the representation $\mathbf{x} = \mathbf{e}_0 x_0 + \mathbf{e}_1 x_1 + \mathbf{e}_2 x_2 = \sum_{i=0}^{2} \mathbf{e}_i x_i$ or $\varrho\mathbf{x} = \sum_{i=0}^{2} \mathbf{e}_i \varrho x_i$.
The three numbers $(x_0, x_1, x_2)$ so determined by the point $P$ are its homogeneous projective
coordinates, homogeneous because $(\varrho x_0, \varrho x_1, \varrho x_2)$ ($\varrho \neq 0$) also represent the same point.
The base-points $E_i$ with $\mathbf{x} = \mathbf{e}_i$ ($i = 0, 1, 2$)
have the homogeneous coordinates (1, 0, 0), (0, 1, 0) and (0, 0, 1), and the unit point $E$ with
$\mathbf{x} = \mathbf{e} = \mathbf{e}_0 + \mathbf{e}_1 + \mathbf{e}_2$ has the homogeneous coordinates (1, 1, 1). The base-points and unit-point
form a basis for the projective coordinate system. If instead of the vectors $\mathbf{e}_i$ and $\mathbf{e}$ one chooses the
vectors $\mathbf{e}_i' = \varrho_i\mathbf{e}_i$ and $\mathbf{e}' = \varrho\mathbf{e}$, then $\mathbf{e}' = \varrho\mathbf{e} = \varrho \sum \mathbf{e}_i$ and also
$\mathbf{e}' = \sum \mathbf{e}_i' = \sum \mathbf{e}_i \varrho_i$. It follows that
$\varrho_i = \varrho$ and $x_i = \varrho x_i'$, that is, the homogeneous coordinates remain unaltered, since the ratios $x_0 : x_1 : x_2$
remain constant.

25.2-1 Introduction of projective coordinates
If S' is a point outside $\Pi$, distinct from S, with the vector $\sum_i s_i \mathbf{e}_i$ relative to S, then, relative to S', the points $E_i$ and E have vectors of the form $\mathbf{e}_i' = \sum_j a_{ij} \mathbf{e}_j$ and $\mathbf{e}' = \sum_i \mathbf{e}_i'$. The affine mapping which makes each point X, with the vector $\sum_i x_i \mathbf{e}_i$ relative to S, correspond uniquely to the point X', with the vector $\sum_j \bigl(s_j + \sum_i x_i a_{ij}\bigr) \mathbf{e}_j$ relative to S, and each vector $\mathbf{x} = \sum_i x_i \mathbf{e}_i$ to the vector $\mathbf{x}' = \sum_j \bigl(\sum_i x_i a_{ij}\bigr) \mathbf{e}_j$, is uniquely determined by the conditions that S goes into S', and the vectors $\mathbf{e}_i$ into $\mathbf{e}_i'$ and $\mathbf{e}$ into $\mathbf{e}'$. If then the vector $\mathbf{p} = \sum_i p_i \mathbf{e}_i$ represents the point P, its image P' is represented by $\mathbf{p}' = \sum_j \bigl(\sum_i p_i a_{ij}\bigr) \mathbf{e}_j = \sum_i p_i \mathbf{e}_i'$ and it therefore has, relative to the basis $\mathbf{e}_i'$, the same coordinates as P relative to the basis $\mathbf{e}_i$. The projective coordinates therefore depend only on the basis, and not on the points S, S' of the surrounding space, which are only used to derive them.

Conversely, four points of a plane, $E_0, E_1, E_2, E$, no three of which are collinear, determine a projective coordinate system, for which these points are the basis. One need only join one point S outside the plane by lines to the given points and so choose vectors $\mathbf{e}_i$ and $\mathbf{e}$ on $SE_i$ and $SE$ so that $\sum_{i=0}^{2} \mathbf{e}_i = \mathbf{e}$. The vectors $\mathbf{e}_i$ are linearly independent, since the points $E_i$ do not lie on one line, and therefore they form a basis of the vector space.
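
The construction just described can be sketched numerically. The following Python fragment is an illustration only, not part of the original text: the function names `det3`, `solve3` and `projective_coordinates` are hypothetical, and every point is assumed to be given by some representative vector (a triple of integers or fractions) drawn from S. The representatives of the base-points are first rescaled so that their sum is the vector of the unit point; the coordinates of P are then its components in that rescaled basis.

```python
from fractions import Fraction


def det3(a, b, c):
    """Determinant of the 3x3 matrix whose columns are a, b, c."""
    return (a[0] * (b[1] * c[2] - b[2] * c[1])
            - b[0] * (a[1] * c[2] - a[2] * c[1])
            + c[0] * (a[1] * b[2] - a[2] * b[1]))


def solve3(a, b, c, rhs):
    """Solve s0*a + s1*b + s2*c = rhs exactly by Cramer's rule."""
    d = det3(a, b, c)
    if d == 0:
        raise ValueError("the three vectors are linearly dependent")
    return (Fraction(det3(rhs, b, c)) / d,
            Fraction(det3(a, rhs, c)) / d,
            Fraction(det3(a, b, rhs)) / d)


def projective_coordinates(E0, E1, E2, E, P):
    """Coordinates (x0, x1, x2) of P relative to base-points E0, E1, E2 and unit point E."""
    # Rescale the base vectors so that e0 + e1 + e2 equals the unit vector.
    s = solve3(E0, E1, E2, E)
    if 0 in s:
        raise ValueError("three of the four basis points are collinear")
    e = [tuple(si * v for v in Ei) for si, Ei in zip(s, (E0, E1, E2))]
    # The coordinates of P are its components in the rescaled basis.
    return solve3(e[0], e[1], e[2], P)
```

Replacing a representative of a base-point by a non-zero multiple leaves the result unchanged, and replacing the representative of P by a multiple rescales all three coordinates by the same factor, so only the ratios $x_0 : x_1 : x_2$ carry meaning.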

In space a projective coordinate system is correspondingly determined by a basis of five points $E_0, E_1, E_2, E_3, E$, no four of which lie in one plane. From a point S outside the space, the base-points are represented by vectors $\mathbf{e}_i$ (i = 0, 1, 2, 3), which form a basis of the 4-dimensional vector space, and the unit point by the vector $\mathbf{e} = \sum_{i=0}^{3} \mathbf{e}_i$. The coordinate ratios $x_0 : x_1 : x_2 : x_3$ do not depend on the choice of the point S nor the choice of the vectors $\mathbf{e}_i$, $\mathbf{e}$ on the corresponding lines through S.
On a projective line, three distinct points suffice as a basis of a coordinate system, two as base-points and one as unit-point.

Given any coordinate system in the plane, determined by four points $E_0, E_1, E_2, E$, one can restrict it to the three coordinate axes $E_0E_1$, $E_0E_2$, and $E_1E_2$ by taking as unit-points $E_{01}$ on $E_0E_1$ with coordinates (1, 1, 0) corresponding to the vector $\mathbf{e}_0 + \mathbf{e}_1$, $E_{02}$ on $E_0E_2$ with coordinates (1, 0, 1) corresponding to $\mathbf{e}_0 + \mathbf{e}_2$, and $E_{12}$ on $E_1E_2$ with coordinates (0, 1, 1) corresponding to $\mathbf{e}_1 + \mathbf{e}_2$ (Fig.).

Conversely, any projective coordinate system on a line can be extended to a coordinate system of a plane containing the line by taking the base-points $E_0$ and $E_1$ and unit-point $E_{01}$ on the line, choosing a point $E_2$ not on the line as third base-point, and choosing as a new unit point E a point of the line $E_{01}E_2$ distinct from $E_{01}$ and $E_2$. The original coordinate system on the line is then a restriction of the coordinate system in the plane to the coordinate axis $E_0E_1$.

25.2-2 Restriction of the coordinate system of the plane onto any one of the three coordinate axes


25.3. Cross-ratios 


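Let A, B, C, D be four points of a projective line $l$ (Fig.) such that no three of them coincide and that A and B are distinct. The line $l$ and a point S not on $l$ determine a plane, and with respect to a basis $\mathbf{e}_0, \mathbf{e}_1$ with origin S the four points correspond to vectors $\mathbf{a} = a_0\mathbf{e}_0 + a_1\mathbf{e}_1$, $\mathbf{b} = b_0\mathbf{e}_0 + b_1\mathbf{e}_1$, $\mathbf{c} = c_0\mathbf{e}_0 + c_1\mathbf{e}_1$, $\mathbf{d} = d_0\mathbf{e}_0 + d_1\mathbf{e}_1$. Since $\mathbf{a}$ and $\mathbf{b}$ are linearly independent, $\mathbf{c}$ and $\mathbf{d}$ can be expressed in terms of $\mathbf{a}$ and $\mathbf{b}$; one then has $\mathbf{c} = \lambda_0\mathbf{a} + \mu_0\mathbf{b}$ ($c_i = \lambda_0 a_i + \mu_0 b_i$) and $\mathbf{d} = \lambda_1\mathbf{a} + \mu_1\mathbf{b}$ ($d_i = \lambda_1 a_i + \mu_1 b_i$).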




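One can now define the cross-ratio of the four points:

$$(A, B; C, D) = (\mu_0/\lambda_0) : (\mu_1/\lambda_1)$$

The cross-ratio does not depend on the choice of the vectors $\mathbf{a}, \mathbf{b}, \mathbf{c}, \mathbf{d}$ nor on the choice of S. The first part of this statement follows from a calculation of the cross-ratio for the new vectors $\mathbf{a}' = \alpha\mathbf{a}$, $\mathbf{b}' = \beta\mathbf{b}$, $\mathbf{c}' = \gamma\mathbf{c} = \gamma(\lambda_0\mathbf{a} + \mu_0\mathbf{b}) = \gamma[(\lambda_0/\alpha)\mathbf{a}' + (\mu_0/\beta)\mathbf{b}']$ and $\mathbf{d}' = \delta\mathbf{d} = \delta(\lambda_1\mathbf{a} + \mu_1\mathbf{b}) = \delta[(\lambda_1/\alpha)\mathbf{a}' + (\mu_1/\beta)\mathbf{b}']$. This gives $\lambda_0' = (\gamma/\alpha)\lambda_0$, $\mu_0' = (\gamma/\beta)\mu_0$, $\lambda_1' = (\delta/\alpha)\lambda_1$, $\mu_1' = (\delta/\beta)\mu_1$, or $(A, B; C, D) = (\mu_0'/\lambda_0') : (\mu_1'/\lambda_1') = (\mu_0/\lambda_0) : (\mu_1/\lambda_1)$. If, on the other hand, S' is a point not on $l$ and distinct from S, and if the vectors $\mathbf{a}', \mathbf{b}', \mathbf{c}', \mathbf{d}'$, with respect to S', correspond to the points A, B, C, D, then, as was shown for the projective plane, there is an affine mapping connecting the plane determined by $l$ and S with the plane determined by $l$ and S', which takes S into S' and the vectors $\mathbf{a}, \mathbf{b}$ into $\mathbf{a}', \mathbf{b}'$. Because of the linearity of the mapping $\mathbf{c}' = \lambda_0\mathbf{a}' + \mu_0\mathbf{b}'$ and $\mathbf{d}' = \lambda_1\mathbf{a}' + \mu_1\mathbf{b}'$, and so with respect to S' one has $(A, B; C, D) = (\mu_0/\lambda_0) : (\mu_1/\lambda_1)$.

25.3-1 Invariance of the cross-ratio under projection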


The cross-ratio remains invariant under central projection. 


The invariance of the cross-ratio follows from the fact that the vectors a, b, c, d, relative to S, 
represent not only the points A, B, C, D as points of intersection of the four lines with the line l, 
but also the points of intersection A', B', C', D' with the line l'. 


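Representation of the cross-ratio in projective coordinates. In an arbitrary coordinate system on the line $l$, let $a_i, b_i, c_i, d_i$ (i = 0, 1) be the coordinates of the points A, B, C, D, where $c_i = \lambda_0 a_i + \mu_0 b_i$ and $d_i = \lambda_1 a_i + \mu_1 b_i$. Since $A \neq B$, the determinant

$$\Delta = |a_i b_i| = \begin{vmatrix} a_0 & b_0 \\ a_1 & b_1 \end{vmatrix} = a_0 b_1 - a_1 b_0$$

is non-zero, so that from the system of equations for $c_i$ and $d_i$ the real numbers $\lambda_i$ and $\mu_i$ can be calculated:

$$\lambda_0 = -\frac{1}{\Delta}\begin{vmatrix} b_0 & c_0 \\ b_1 & c_1 \end{vmatrix} = -|b_i c_i| / |a_i b_i|, \qquad \mu_0 = \frac{1}{\Delta}\begin{vmatrix} a_0 & c_0 \\ a_1 & c_1 \end{vmatrix} = |a_i c_i| / |a_i b_i|,$$

$$\lambda_1 = -\frac{1}{\Delta}\begin{vmatrix} b_0 & d_0 \\ b_1 & d_1 \end{vmatrix} = -|b_i d_i| / |a_i b_i|, \qquad \mu_1 = \frac{1}{\Delta}\begin{vmatrix} a_0 & d_0 \\ a_1 & d_1 \end{vmatrix} = |a_i d_i| / |a_i b_i|.$$

Hence

$$(A, B; C, D) = \frac{\mu_0}{\lambda_0} : \frac{\mu_1}{\lambda_1} = \frac{|a_i c_i|}{|b_i c_i|} : \frac{|a_i d_i|}{|b_i d_i|} = \frac{F(\mathbf{a}, \mathbf{c})}{F(\mathbf{b}, \mathbf{c})} : \frac{F(\mathbf{a}, \mathbf{d})}{F(\mathbf{b}, \mathbf{d})}.$$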
Here $F(\mathbf{x}, \mathbf{y})$ denotes the area of the parallelogram determined by the vectors $\mathbf{x}$ and $\mathbf{y}$. This relation clarifies the connection with the usual definition of the cross-ratio of four proper points on an affine line $l$. By introducing a suitable factor $\rho$ one can arrange that A, B, C, D are the end-points of vectors $\mathbf{a}, \mathbf{b}, \mathbf{c}, \mathbf{d}$, drawn from S. The ratios of the areas of the parallelograms are those of triangles with the same height $h$ (Fig.), that is, the ratios of the lengths along the line $l$. Treating these as directed segments, one has

$$(A, B; C, D) = (\overrightarrow{AC}/\overrightarrow{BC}) : (\overrightarrow{AD}/\overrightarrow{BD}) = (\overrightarrow{AC}/\overrightarrow{CB}) : (\overrightarrow{AD}/\overrightarrow{DB}),$$

that is, the quotient of the two ratios $(\overrightarrow{AC}/\overrightarrow{CB})$ and $(\overrightarrow{AD}/\overrightarrow{DB})$.

25.3-2 The definition of cross-ratio on an affine line, (A, B; C, D) = (AC/CB) : (AD/DB)

Special positions of the four points. If two of the points coincide then the determinants, and therefore the cross-ratio, take special values; for example, $|a_i d_i| = 0$ if A = D, $|b_i d_i| = 0$ if B = D, $|b_i c_i| = 0$ if C = B and $|a_i c_i| = 0$ if A = C.


For harmonic points one has $(A, B; C, D) = -1$. The ratios of areas $F(\mathbf{a}, \mathbf{c}) : F(\mathbf{b}, \mathbf{c})$ and $F(\mathbf{a}, \mathbf{d}) : F(\mathbf{b}, \mathbf{d})$ then have opposite signs. From the definition of the vector product it follows that for either sense of rotation in the plane determined by $l$ and S only one of the vectors $\mathbf{c}$ or $\mathbf{d}$ can lie between $\mathbf{a}$ and $\mathbf{b}$. One says that the pairs of points A, B and C, D separate each other.




There are 24 permutations of four points. If the two pairs are interchanged, or if the two points in each pair are interchanged, then the cross-ratio is unaltered:

$$(A, B; C, D) = (C, D; A, B) = (B, A; D, C) = (D, C; B, A) = k.$$

Hence the 24 possible permutations give six values of the cross-ratio. In calculating these cross-ratios one must go back to the definition of a determinant; for example,

$$|a_i b_i| \cdot |c_i d_i| - |c_i b_i| \cdot |a_i d_i| = (a_0 b_1 - a_1 b_0)(c_0 d_1 - c_1 d_0) - (c_0 b_1 - c_1 b_0)(a_0 d_1 - a_1 d_0) = (a_0 c_1 - a_1 c_0)(b_0 d_1 - b_1 d_0) = |a_i c_i| \cdot |b_i d_i|.$$

It follows immediately from the definition that if the points of the second pair are interchanged, one obtains the reciprocal of the cross-ratio. If two points of different pairs are interchanged, one obtains, for example,

$$(A, C; B, D) = \frac{|a_i b_i| \cdot |c_i d_i|}{|c_i b_i| \cdot |a_i d_i|} = \frac{|a_i b_i| \cdot |c_i d_i| - |c_i b_i| \cdot |a_i d_i| + |c_i b_i| \cdot |a_i d_i|}{|c_i b_i| \cdot |a_i d_i|} = \frac{|a_i c_i| \cdot |b_i d_i|}{|c_i b_i| \cdot |a_i d_i|} + 1 = 1 - k, \quad \text{since } |c_i b_i| = -|b_i c_i|.$$

Similarly, $(A, D; B, C) = 1 - 1/k = (k - 1)/k$.
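
The count of six values can be checked by brute force. The following sketch is an illustration with hypothetical names (`det2`, `cross_ratio`, `cross_ratio_values`); it assumes four distinct points of a line given as pairs of homogeneous coordinates and simply evaluates the determinant formula for all 24 orderings.

```python
from fractions import Fraction
from itertools import permutations


def det2(p, q):
    """The determinant |p_i q_i| = p0*q1 - p1*q0."""
    return p[0] * q[1] - p[1] * q[0]


def cross_ratio(a, b, c, d):
    """(A, B; C, D) as a quotient of products of 2x2 determinants."""
    return Fraction(det2(a, c) * det2(b, d)) / (det2(b, c) * det2(a, d))


def cross_ratio_values(points):
    """Set of cross-ratios obtained from all 24 orderings of four distinct points."""
    return {cross_ratio(*order) for order in permutations(points)}
```

For the points (1, 0), (0, 1), (1, 1), (1, 3) one has k = 1/3, and the 24 orderings produce exactly the six values k, 1/k, 1 - k, 1/(1 - k), (k - 1)/k and k/(k - 1). For special values of k, such as the harmonic case k = -1, some of these six coincide.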

The cross-ratio of four lines of a pencil in a projective plane is defined as the cross-ratio of the four 
points of intersection of these lines with an arbitrary line /. The fact that this is independent of the 
choice of / follows from the invariance of the cross-ratio under central projection (Fig.). 


25.3-3 The cross-ratio of four lines is independent of the choice of the line l 

25.3-4 Introduction of projective coordinates by means of cross-ratio 


Introduction of projective coordinates by means of cross-ratios. If with respect to a basis $E_i$ (i = 0, 1, 2) and E a point P has the projective coordinates $(x_0, x_1, x_2)$, and if one projects the points E and P from $E_2$ onto the coordinate axis $E_0E_1$ (Fig.), then $E_{01}$ and $P_{01}$ have the coordinates (1, 1) and $(x_0, x_1)$ in the coordinate system restricted to the line $E_0E_1$. One then has $(E_0, E_1; E_{01}, P_{01}) = x_1 : x_0$. Similarly, $(E_0, E_2; E_{02}, P_{02}) = x_2 : x_0$ for the projections of E and P onto the line $E_0E_2$. One can therefore define the coordinate ratios $x_0 : x_1 : x_2$ by means of cross-ratios, that is, by projective quantities only. In the special case when $E_1$ and $E_2$ are improper points and the lines $E_1P$ and $E_2P$ are parallel to the coordinate axes $E_0E_1$ and $E_0E_2$, the coordinate ratios $x_1 : x_0$ and $x_2 : x_0$ represent affine parallel coordinates.

Given three points of a line, one can always choose a fourth point of the line so that the cross-ratio of these four points takes a given real value $\lambda$; for one can treat the three given points as the basis $E_0, E_1, E_{01}$ of a coordinate system on the line, and then the fourth point is determined by

$$(E_0, E_1; E_{01}, P_{01}) = x_1 : x_0 = \lambda.$$


25.4. Projective mappings 


Under a central projection A the projective coordinates of the image point $\bar{P}$ are the same as those of the original point, referred to the images $\bar{E}_i, \bar{E}$ of the basis $E_i, E$. With respect to an arbitrary basis $E_i', E'$, with corresponding basis vectors $\mathbf{e}_i', \mathbf{e}'$ connected to the basis vectors $\bar{\mathbf{e}}_j$ by equations $\bar{\mathbf{e}}_j = \sum_i a_{ij}\mathbf{e}_i'$ with $\det(a_{ij}) \neq 0$, the point $\bar{P}$ has the coordinates $x_i' = \sum_j a_{ij}x_j$. Hence a central projection is described with respect to an arbitrary basis by means of a linear coordinate transformation A with non-singular matrix $(a_{ij})$: $\rho x_i' = \sum_j a_{ij}x_j$. If A is followed by another central projection B with the coordinate transformation B: $\rho x_i'' = \sum_j b_{ij}x_j'$ with $\det(b_{ij}) \neq 0$, then the resultant BA of the two central projections is described by the coordinate transformation C: $\rho x_i'' = \sum_j \sum_k b_{ik}a_{kj}x_j = \sum_j c_{ij}x_j$ with $\det(c_{ij}) \neq 0$. As a generalization of the central projection one defines a projective mapping as a one-to-one mapping of a projective plane onto itself or another plane, described by a regular linear coordinate transformation $\rho x_i' = \sum_j a_{ij}x_j$ (i = 0, 1, 2) with $\det(a_{ij}) \neq 0$. It can be shown that any projective mapping arises from finitely many successive central projections.


Main theorem of projective geometry. There is exactly one projective mapping of a plane $\Pi$ onto a plane $\Pi'$ that takes four given points of $\Pi$ no three of which are collinear into four given points of $\Pi'$ no three of which are collinear.


To prove this one takes as bases in $\Pi$ and $\Pi'$ the four given points, so that in the corresponding coordinate systems the points of each of the two sets have the projective coordinates (1, 0, 0), (0, 1, 0), (0, 0, 1) and (1, 1, 1). If one substitutes these coordinates for the four points and their images in $\rho x_i' = \sum_j a_{ij}x_j$, it follows that $a_{ij} = 0$ for $i \neq j$ and $a_{ij} = 1$ for $i = j$, that is, $\rho x_i' = x_i$. Conversely, the mapping described by the coordinate transformation $\rho x_i' = x_i$ takes the four points of the basis in $\Pi$ into those in $\Pi'$. It is therefore the unique projective mapping of this kind.
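
In a fixed common coordinate system the mapping of the main theorem can also be written down explicitly as a 3x3 matrix. The following sketch is not the book's procedure but a standard way of carrying out the same idea, under the assumption that all points are given as integer triples of homogeneous coordinates; the helper names (`frame_columns`, `projective_map`, `apply_map`, `same_point`) are hypothetical. Each set of four points is used as a basis, and the matrix is the product of the matrix of the target basis with the adjugate of the matrix of the source basis. The adjugate replaces the inverse because a common factor is irrelevant.

```python
def cross(a, b):
    """Vector product of two triples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def frame_columns(p0, p1, p2, unit):
    """Rescale p0, p1, p2 so that their sum is a multiple of the unit point."""
    if dot(cross(p0, p1), p2) == 0:
        raise ValueError("the three base-points are collinear")
    s = (dot(cross(p1, p2), unit), dot(cross(p2, p0), unit), dot(cross(p0, p1), unit))
    if 0 in s:
        raise ValueError("the unit point is collinear with two base-points")
    return [tuple(si * v for v in p) for si, p in zip(s, (p0, p1, p2))]


def projective_map(src, dst):
    """Matrix (list of rows) taking the four points src to the four points dst."""
    sc = frame_columns(*src)
    dc = frame_columns(*dst)
    # Rows of the adjugate of the matrix whose columns are sc[0], sc[1], sc[2].
    adj = (cross(sc[1], sc[2]), cross(sc[2], sc[0]), cross(sc[0], sc[1]))
    return [[sum(dc[k][r] * adj[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def apply_map(h, x):
    return tuple(dot(row, x) for row in h)


def same_point(x, y):
    """True if two non-zero triples are proportional."""
    return cross(x, y) == (0, 0, 0)
```

As a usage example, the corners (1, 0, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1) of a unit square can be sent to (1, 0, 0), (1, 2, 0), (1, 0, 1), (1, 3, 3); applying the resulting matrix to each source point gives a triple proportional to the corresponding target point.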

A central projection of $\Pi$ onto $\Pi'$ maps each point on the line of intersection $s = (\Pi \cap \Pi')$ onto itself. If an arbitrary projective mapping has this property, then it must be a central projection. For if $l'$ in $\Pi'$ is the image of a line $l$ in $\Pi$, then $l$ and $l'$ meet in the same point of $s$, since by hypothesis the point of intersection of $l$ with $s$ remains fixed, and the lines $l$ and $l'$ therefore determine a plane. Then if P' and Q' on $l'$ are the images of the points P and Q on $l$, the point of intersection C of the lines PP' and QQ' is the centre of a central projection, which takes P into P', Q into Q' and each point of $s$ into itself. According to the main theorem, this central projection is identical to the given projective mapping.

According to Felix KLEIN’s Erlangen Programme (1872), projective geometry is the study of 
properties that remain invariant under projective mappings. 


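If the cross-ratio (A, B; C, D) of four points A, B, C, D on a line $l$ with $c_i = \lambda_0 a_i + \mu_0 b_i$ and $d_i = \lambda_1 a_i + \mu_1 b_i$ is determined by $(\mu_0/\lambda_0) : (\mu_1/\lambda_1)$, then under the projective mapping $\rho x_i' = \sum_{j=0}^{2} a_{ij}x_j$ (i = 0, 1, 2) with $\det(a_{ij}) \neq 0$ the images A', B', C', D' satisfy the relations $\rho a_i' = \sum_j a_{ij}a_j$, $\rho b_i' = \sum_j a_{ij}b_j$ and $\rho c_i' = \sum_j a_{ij}c_j = \rho(\lambda_0 a_i' + \mu_0 b_i')$, $\rho d_i' = \sum_j a_{ij}d_j = \rho(\lambda_1 a_i' + \mu_1 b_i')$, from which one obtains $(A', B'; C', D') = (\mu_0'/\lambda_0') : (\mu_1'/\lambda_1') = (\mu_0/\lambda_0) : (\mu_1/\lambda_1) = (A, B; C, D)$.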

All the projective mappings form a group, which is characterized by the invariance of the cross- 
ratio. Those projective mappings that leave a line fixed, but not necessarily pointwise, form a sub- 
group. The group of affine mappings is a subgroup for which this fixed line is the improper line of 
the plane. This, in turn, has a subgroup of similarity mappings, which take orthogonal lines into 
orthogonal lines. The subgroup of congruence mappings also leaves distances between two points 
invariant. 

On a projective line three basis points determine the projective coordinates, and a projective mapping is described by a linear transformation $\rho x_0' = ax_0 + bx_1$, $\rho x_1' = cx_0 + dx_1$ with $ad - bc \neq 0$.


Main theorem on projective mappings of a line. There is exactly one projective mapping between 
two lines that takes three distinct points of the original line into three distinct points of the image 
line. 


Central projections are then projective mappings of lines whose point of intersection is a fixed 
point. 


The equation of a line. A point P of a projective line is characterized relative to two points A and B of the line with corresponding vectors $\mathbf{a}$ and $\mathbf{b}$ by the ratio $t_1 : t_2$ of the parameters in the expression $\rho\mathbf{x} = t_1\mathbf{a} + t_2\mathbf{b}$. In homogeneous coordinates $(x_0, x_1, x_2)$ one therefore has the following system of homogeneous equations, from which the values of $t_1, t_2, \rho$ can be found, to within a non-zero factor $\lambda$:

$$t_1 a_0 + t_2 b_0 - \rho x_0 = 0$$
$$t_1 a_1 + t_2 b_1 - \rho x_1 = 0$$
$$t_1 a_2 + t_2 b_2 - \rho x_2 = 0$$

If one assumes, without loss of generality, that $x_2 \neq 0$, one obtains $t_1 = \lambda(x_0 b_1 - x_1 b_0)$, $t_2 = \lambda(x_1 a_0 - x_0 a_1)$ and $\rho = (\lambda/x_2)\{(x_0 b_1 - x_1 b_0)a_2 + (x_1 a_0 - x_0 a_1)b_2\}$. If one substitutes these values in the first equation, one obtains $x_0 u_0 + x_1 u_1 + x_2 u_2 = 0$, where $u_0 = a_2 b_1 - a_1 b_2$, $u_1 = a_0 b_2 - a_2 b_0$ and $u_2 = a_1 b_0 - a_0 b_1$.

25.4-1 The connection between two pencils


The Plücker line coordinates $(u_0, u_1, u_2)$, like the point coordinates $(x_0, x_1, x_2)$, are determined only to within a non-zero scalar $\lambda$. The formal equivalence of the two triples in the equation of a line makes it clear that for a given triple $(u_0, u_1, u_2)$ the equation determines a line $l$ as the locus of all points whose coordinate triple $(x_0, x_1, x_2)$ satisfies the equation, while for a given triple $(x_0, x_1, x_2)$ the equation determines a point P as the vertex of a pencil of lines whose triple $(u_0, u_1, u_2)$ satisfies the equation. The system of homogeneous linear equations $x_0u_0 + x_1u_1 + x_2u_2 = 0$, $x_0v_0 + x_1v_1 + x_2v_2 = 0$ determines the point $(x_0, x_1, x_2)$ common to the two lines $(u_0, u_1, u_2)$ and $(v_0, v_1, v_2)$, their point of intersection. Similarly, the equations $x_0u_0 + x_1u_1 + x_2u_2 = 0$ and $y_0u_0 + y_1u_1 + y_2u_2 = 0$ determine the line $(u_0, u_1, u_2)$ common to the two pencils with vertex at $(x_0, x_1, x_2)$ and at $(y_0, y_1, y_2)$, that is, the line joining their vertices (Fig.).
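
Because the same pair of linear equations is solved in both cases, one formula serves for joining two points and for intersecting two lines. The sketch below is illustrative; the function names are hypothetical, and the component formula is the one stated above for $u_0, u_1, u_2$.

```python
def plucker(a, b):
    """Triple (a2*b1 - a1*b2, a0*b2 - a2*b0, a1*b0 - a0*b1) determined by two triples."""
    return (a[2] * b[1] - a[1] * b[2],
            a[0] * b[2] - a[2] * b[0],
            a[1] * b[0] - a[0] * b[1])


def line_through(x, y):
    """Line coordinates (u0, u1, u2) of the line joining the points x and y."""
    return plucker(x, y)


def intersection(u, v):
    """Point coordinates (x0, x1, x2) common to the lines u and v (the dual operation)."""
    return plucker(u, v)


def incident(x, u):
    """True if the point x lies on the line u, i.e. x0*u0 + x1*u1 + x2*u2 = 0."""
    return x[0] * u[0] + x[1] * u[1] + x[2] * u[2] == 0
```

If the affine point (X, Y) is written as (1, X, Y), the parallel lines Y = 0 and Y = 1 have the line coordinates (0, 0, 1) and (-1, 0, 1), and `intersection` returns (0, 1, 0): an improper point, with $x_0 = 0$, in the common direction of the two lines.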


Principle of duality. The pairs of concepts ‘point of a line’ and ‘line of a pencil’, ‘join’ and 
‘intersect’, ‘lie on’ and ‘pass through’ can be interchanged, since they are represented by equivalent 
algebraic operations or equations. There is, in this sense, a principle of duality: true statements of 
projective geometry are transformed by the interchange into true statements; for example, the 
theorem ‘two distinct points lie on exactly one line’ goes over into the true theorem ‘two distinct 
lines pass through exactly one point’. To every theorem of projective geometry there is, therefore, 
a dual form, whose proof follows from that of the original theorem. 


Theorem of Desargues. If the lines joining corresponding vertices of two triangles pass through 
a point S, then the points of intersection of corresponding sides lie on a line s. 

Dual form: If the points of intersection of corresponding sides of two triangles lie on a line s, 
then the lines joining corresponding vertices pass through a point S. 


The dual form of the theorem of Desargues is also its converse. For a proof of the theorem itself one assumes that the triangles are $A_1B_1C_1$ and $A_2B_2C_2$, where the lines $A_1A_2$, $B_1B_2$ and $C_1C_2$ meet in the point S (Fig.). It is required to prove that the points $A = (B_1C_1 \cap B_2C_2)$, $B = (C_1A_1 \cap C_2A_2)$ and $C = (A_1B_1 \cap A_2B_2)$ lie on a line $s$. One can apply a projective mapping to the two triangles in such a way that A and B go over into two improper points. Then in the image the sides $B_1'C_1'$ and $B_2'C_2'$ are parallel, and so are $C_1'A_1'$ and $C_2'A_2'$. By the intercept theorem it follows that $A_1'B_1'$ and $A_2'B_2'$ are parallel, that is, C' also lies on the improper line, and so in the original figure C lies on the line $s = AB$.


25.4-2 The theorem of Desargues 




Theorem of Pappus. Given two lines $l_1$ and $l_2$, let $A_1, B_1, C_1$ be three points on $l_1$ and $A_2, B_2, C_2$ three points on $l_2$, distinct from the point of intersection O of the two lines. Then the points of intersection $A = (B_1C_2 \cap B_2C_1)$, $B = (C_1A_2 \cap C_2A_1)$, $C = (A_1B_2 \cap A_2B_1)$ of their cross-joins lie on a line $l$.


25.4-3 The theorem of Pappus

25.4-4 Dualizing Pappus’ theorem


Dual form: Given two points $L_1$ and $L_2$, let $a_1, b_1, c_1$ be three lines through $L_1$ and $a_2, b_2, c_2$ three lines through $L_2$, distinct from the line $o$ joining the two points. Then the lines $a = [(b_1 \cap c_2), (b_2 \cap c_1)]$, $b = [(c_1 \cap a_2), (c_2 \cap a_1)]$, $c = [(a_1 \cap b_2), (a_2 \cap b_1)]$ meet in a point L.


For a proof of the theorem of PAPPUS one introduces the points of intersection $D = A_1B_2 \cap A_2C_1$ and $E = B_2C_1 \cap A_1C_2$ (Fig.). If one projects the line $A_1B_2$ from $A_2$ onto the line $l_1$, the ordered triple $(C, D, B_2)$ goes into the triple $(B_1, C_1, O)$, while the point $A_1$ remains fixed. A similar central projection of $l_1$ onto the line $B_2C_1$ from $C_2$ takes the ordered triple $(A_1, B_1, O)$ into the triple $(E, A, B_2)$ and leaves the point $C_1$ fixed. The two central projections together take the ordered quadruple $(A_1, C, D, B_2)$ into the quadruple $(E, A, C_1, B_2)$, and, since the point $B_2$ is fixed, this represents a central projection, whose centre B is the point of intersection of the lines $C_1A_2$ and $C_2A_1$. But then CA must also pass through B, that is, A, B and C lie on a line.
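
The statement of the theorem can be checked numerically in exact integer arithmetic. The sketch below is only an illustration: the helper names are hypothetical, points and lines are given as homogeneous triples with the affine point (X, Y) written as (1, X, Y), and joins and intersections are both computed by the vector product, which yields the same line or point as the Plücker formula up to a factor.

```python
def cross(a, b):
    """Vector product: joins two points or intersects two lines (up to a factor)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def collinear(p, q, r):
    """True if the 3x3 determinant of the three points vanishes."""
    return sum(x * y for x, y in zip(cross(p, q), r)) == 0


def pappus_points(A1, B1, C1, A2, B2, C2):
    """The three points of intersection A, B, C of the cross-joins."""
    A = cross(cross(B1, C2), cross(B2, C1))
    B = cross(cross(C1, A2), cross(C2, A1))
    C = cross(cross(A1, B2), cross(A2, B1))
    return A, B, C


# Three points on the line Y = 0 and three points on the line Y = X.
A, B, C = pappus_points((1, 1, 0), (1, 2, 0), (1, 4, 0),
                        (1, 1, 1), (1, 2, 2), (1, 3, 3))
print(collinear(A, B, C))
```

A single numerical instance is of course no proof; it only shows how the configuration translates into coordinates.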


Complete quadrangle and complete quadrilateral. A complete quadrangle consists of four vertices A, B, C, D no three of which are collinear and the six lines joining them in pairs, the six sides AB, CD; AD, BC; AC, BD. The three points of intersection $P = (AB \cap CD)$, $Q = (AD \cap BC)$ and $R = (AC \cap BD)$ of these six lines are called the diagonal points and are the vertices of the diagonal triangle PQR (Fig.).


25.4-5 The complete quadrangle


25.4-6 The complete quadrilateral


A complete quadrilateral, as the dual of a complete quadrangle, consists of four lines $a, b, c, d$ no three of which are concurrent and their six points of intersection, the vertices $(a \cap b)$, $(c \cap d)$; $(a \cap d)$, $(b \cap c)$; $(a \cap c)$, $(b \cap d)$. The three lines $p = [(a \cap b), (c \cap d)]$, $q = [(a \cap d), (b \cap c)]$ and $r = [(a \cap c), (b \cap d)]$ joining these six points are called the diagonal lines and form the diagonal triangle $pqr$ (Fig.).


On any side of a complete quadrangle, the two vertices are harmonically separated by the diagonal 
point and the point of intersection with the line joining the other diagonal points. 

Dual form: At any vertex of a complete quadrilateral, the two sides are harmonically separated 
by the diagonal and the line joining the vertex to the point of intersection of the other two diagonals. 




If, for example, one picks out the side $AB = u$ of the complete quadrangle (see Fig. 25.4-5), the vertices A and B are harmonically separated by the diagonal point P and the point of intersection T of this side with the line QR joining the other diagonal points. Correspondingly, in the dual figure, the sides $a$ and $b$ that meet at the vertex U are harmonically separated by the diagonal $p$ and the line $t$ joining U to the point of intersection $(q \cap r)$ of the other two diagonals.

The proof of the theorem on the complete quadrangle depends on the fact that any point X of a projective plane can be characterized by the vector $\mathbf{x}$ drawn to it from a point S outside the plane. The vectors $\mathbf{a}, \mathbf{b}, \mathbf{c}, \mathbf{d}$ corresponding to the vertices A, B, C, D are linearly dependent, but any three of them are linearly independent, since no three of the points are collinear. Therefore there exist four real non-zero numbers $\alpha, \beta, \gamma, \delta$ such that $\alpha\mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c} + \delta\mathbf{d} = \mathbf{0}$. The diagonal point P on the sides AB and CD can be characterized by a vector $\mathbf{p} = \alpha\mathbf{a} + \beta\mathbf{b} = -(\gamma\mathbf{c} + \delta\mathbf{d})$, which is a linear combination of the vectors $\mathbf{a}$ and $\mathbf{b}$, and also of the vectors $\mathbf{c}$ and $\mathbf{d}$. Similarly, the other diagonal points Q and R are characterized by the vectors $\mathbf{q} = \alpha\mathbf{a} + \delta\mathbf{d} = -(\beta\mathbf{b} + \gamma\mathbf{c})$ and $\mathbf{r} = \alpha\mathbf{a} + \gamma\mathbf{c} = -(\beta\mathbf{b} + \delta\mathbf{d})$. If T is the point of intersection of QR with AB, then the corresponding vector $\mathbf{t}$ is given by $\mathbf{t} = \mathbf{q} + \mathbf{r} = \alpha\mathbf{a} - \beta\mathbf{b}$. The cross-ratio of the four points A, B, P, T is then given by $(A, B; P, T) = (\beta/\alpha) : (-\beta/\alpha) = -1$.

By means of the complete quadrangle one can construct, using a ruler only, the fourth harmonic point T of a point pair A, B and a third point P on a line $u$ (Fig.). If one chooses on a line through P the remaining vertices C and D of a complete quadrangle, distinct from each other and from P, then the diagonal points Q and R are determined by $Q = (AD \cap BC)$ and $R = (AC \cap BD)$. The diagonal QR then cuts the side AB in the required fourth harmonic point T.
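
The ruler construction translates directly into joins and intersections of homogeneous triples. The following sketch uses hypothetical names and makes one assumption beyond the text: the auxiliary vertex C is supplied by the caller, not on the line u, and D is taken as the point with vector p + c, which lies on the line PC and is distinct from P and C.

```python
def cross(a, b):
    """Vector product: joins two points or intersects two lines (up to a factor)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def same_point(x, y):
    """True if two non-zero triples are proportional."""
    return cross(x, y) == (0, 0, 0)


def fourth_harmonic(A, B, P, C):
    """Fourth harmonic point T of the pair A, B and the point P on the line AB."""
    D = tuple(p + c for p, c in zip(P, C))       # second vertex on the line PC
    Q = cross(cross(A, D), cross(B, C))          # Q = AD meets BC
    R = cross(cross(A, C), cross(B, D))          # R = AC meets BD
    return cross(cross(Q, R), cross(A, B))       # T = QR meets AB
```

With A, B, P at the affine positions 0, 3, 1 on the line Y = 0, that is (1, 0, 0), (1, 3, 0), (1, 1, 0), and with C = (1, 0, 1), the function returns (9, -27, 0), a multiple of (1, -3, 0). A different admissible choice of C leads to the same point T.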


25.4-7 Construction of the fourth harmonic point


Duality in space. To the equation of a line $\sum_{i=0}^{2} x_i u_i = 0$ with Plücker coordinates $u_i$ in the plane there corresponds in space the equation $\sum_{i=0}^{3} x_i u_i = 0$ of a plane. The pairs of dual concepts in space are therefore ‘point’ and ‘plane’, and a line through two points in space corresponds to the line common to two planes, so that the concept ‘line’ is self-dual. The theorem ‘A point and a line not through it determine exactly one plane’ has the dual form ‘A plane and a line not in it determine exactly one point’.

In n-dimensional space the dual of a point is a hyperplane, and generally the dual of an m-dimensional subspace is an (n - m - 1)-dimensional subspace. 


Collineations. Any projective mapping, as a one-to-one correspondence between the points of two projective planes, can be regarded as a one-to-one correspondence between the lines of the planes, if the lines, as duals of the points, are regarded as the basic elements of the plane. In this interpretation, a projective mapping is described relative to the coordinates $u_i$ and $u_i'$ by a linear transformation of the form $\rho u_i' = \sum_{j=0}^{2} \alpha_{ij}u_j$ (i = 0, 1, 2) with $\det(\alpha_{ij}) \neq 0$.


Dual form of the main theorem on projective mappings. There is exactly one projective mapping of the lines of a plane $\Pi$ onto the lines of a plane $\Pi'$ that takes four given lines of $\Pi$ no three of which are concurrent into four given lines of $\Pi'$ no three of which are concurrent.


The cross-ratio of four lines of a pencil is an invariant of a projective mapping. 
Since projective mappings take collinear points into collinear points and concurrent lines, that is, 
lines of a pencil, into concurrent lines, they are called collineations. 


Dual form of the main theorem on projective mappings of a line. There is exactly one collineation 
of a pencil with vertex P into a pencil with vertex P’ that maps three distinct lines of the first pencil 
into three distinct lines of the second. 


Under a central projection the correspondence between the points of two lines $l$ and $l'$ is by means of lines through the centre C, and the point of intersection $F = (l \cap l')$ is a fixed point of the mapping. Conversely, any collineation of two lines having a fixed point is a central projection or central collineation. Dually, the correspondence between the lines of two pencils L and L' under a central projection is by means of points on a fixed line $c$; for example, if $p$ is a line of the pencil L, its image $p'$ is the line joining the point $P = (p \cap c)$ to the vertex of the pencil L'. The line $f = (L, L')$ is then a fixed line of the mapping (Fig.). Conversely, any collineation between two pencils having a fixed line is a central collineation.

Since the cross-ratio of four lines $p_i$ (i = 1, 2, 3, 4) of a pencil L is equal to that of their points of intersection $P_i = (p_i \cap l)$ with an arbitrary line $l$ that does not pass through L, and since collineations are characterized by the invariance of the cross-ratio, for any collineation A between two pencils L and L' there is also a collineation A* between two lines $l$ and $l'$ that do not belong to the pencils L and L', respectively. Any point P of $l$ is the point of intersection of $l$ with some line $p$ of the pencil L, and then A* maps P into the point $P' = (p' \cap l')$ of the line $l'$. In the same way, for any collineation A* between two lines $l$ and $l'$ one can construct a collineation A between two pencils L and L' in which any line $p = PL$ corresponds to the line $p' = P'L'$.


25.4-8 Central projection 


25.4-9 Central collineation of two pencils with vertices L and L’ 


The collineation A so obtained is a central collineation if the vertices L and L' of the two pencils lie on the line $f$ joining two points that correspond in the collineation A*, because this line is then a fixed line of A, for example, if L and L' are corresponding points of $l$ and $l'$ with respect to A* (Fig.).

Similarly, the collineation A* between two lines $l$ and $l'$ that arises from a collineation A between two pencils L and L' is central if $l$ and $l'$ meet in the point of intersection $F = (p \cap p')$ of two lines $p$ and $p'$ that correspond under A, because this point F is then a fixed point of A* (Fig.).

If $p_1, p_2$ and $p_1', p_2'$ are two further pairs of lines that correspond under A, and if they cut the lines $l$ and $l'$, respectively, in the points $P_1, P_2$ and $P_1', P_2'$, then the vertex $C = (s_1 \cap s_2)$ of the central collineation is determined as the point of intersection of the two lines $s_1 = P_1P_1'$ and $s_2 = P_2P_2'$ (Fig.).

A correlation maps the points of a projective plane $\Pi$ one to one onto the lines of a plane $\Pi'$ and the lines of $\Pi$ onto the points of $\Pi'$. In point and line coordinates, correlations are given by linear transformations

$$\rho u_i' = \sum_{j=0}^{2} a_{ij}x_j \quad (i = 0, 1, 2) \quad \text{with } \det(a_{ij}) \neq 0,$$

$$\rho x_i' = \sum_{j=0}^{2} b_{ij}u_j \quad (i = 0, 1, 2) \quad \text{with } \det(b_{ij}) \neq 0.$$

A correlation takes the point of intersection of two lines $l_1, l_2$ into the line joining the points $L_1, L_2$, the images of the lines $l_1, l_2$, and the line joining two points $P_1, P_2$ into the point of intersection of the lines $p_1, p_2$, the images of the points $P_1, P_2$.


25.4-10 Central collineation of two lines l and l'


25.5. Conics 


The equation of a conic. An ellipse, hyperbola, or parabola can be regarded as the intersection 
of a cone with a suitable projective plane (see Chapter 13. — Conics). Any two of these conics are 
mapped into each other by projection from the vertex of the cone. Therefore, in projective geometry 
there is no longer any distinction between these conics. 




In homogeneous projective coordinates the equation of a conic is

$$a_{00}x_0^2 + a_{11}x_1^2 + a_{22}x_2^2 + 2a_{01}x_0x_1 + 2a_{02}x_0x_2 + 2a_{12}x_1x_2 = 0$$

or $\sum_{i,j=0}^{2} a_{ij}x_ix_j = 0$ with $a_{ij} = a_{ji}$ and $\det(a_{ij}) \neq 0$.


By a suitable linear transformation a non-degenerate conic can be reduced to the form $x_0^2 - x_1^2 - x_2^2 = 0$. A quadratic form with non-vanishing determinant does not necessarily represent a conic. The equation $x_0^2 + x_1^2 + x_2^2 = 0$ of the empty conic cannot represent a real conic under any transformation. A quadratic form $\sum_{i,j=0}^{2} a_{ij}x_ix_j$ with $\det(a_{ij}) = 0$ represents a degenerate or singular conic, which can be a pair of lines, a point, or a line counted twice.


Polarity. For any regular conic $\sum a_{ij}x_ix_j = 0$ with $\det(a_{ij}) \neq 0$ there is a special correlation $\rho u_i = \sum_j a_{ij}x_j$ (i = 0, 1, 2) of the plane onto itself, a so-called polarity. With any point P as pole it associates a line $p$ as polar. The coordinates $y_i$ of a point Q of the polar of P then satisfy the equation $\sum_i y_iu_i = \sum_{i,j} a_{ij}y_ix_j = 0$, where $x_j$ are the coordinates of the pole P. All points Q of the polar of P are conjugate to P. The equation of the polar of a point P of the conic is, however, that of the tangent at P.


Any point P of the conic is therefore conjugate to every point of the tangent $t$ at P, and therefore conjugate to itself. The points of the conic are the only points of the plane that are conjugate to themselves in this way.

If the tangents $t_1, t_2$ at two points $B_1, B_2$ of the conic meet at a point P, this point is conjugate to $B_1$ and to $B_2$, and therefore, since all the points conjugate to P lie on a line, it is conjugate to all points of the line $B_1B_2$, which is thus the polar $p$ of P (Fig.).
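
The relation between pole, polar and tangents can be tried out on a concrete conic. The sketch below is an illustration with hypothetical function names; it assumes the conic is given by a symmetric 3x3 integer matrix `a` and uses the circle $-x_0^2 + x_1^2 + x_2^2 = 0$ (the unit circle when the affine point (X, Y) is written as (1, X, Y)).

```python
def polar(a, x):
    """Line coordinates u_i = sum_j a_ij x_j of the polar of the point x."""
    return tuple(sum(a[i][j] * x[j] for j in range(3)) for i in range(3))


def conjugate(a, x, y):
    """True if sum_ij a_ij y_i x_j = 0, i.e. y lies on the polar of x."""
    return sum(y[i] * u for i, u in enumerate(polar(a, x))) == 0


def on_conic(a, x):
    """A point lies on the conic exactly when it is conjugate to itself."""
    return conjugate(a, x, x)


CIRCLE = [[-1, 0, 0], [0, 1, 0], [0, 0, 1]]
P = (3, 5, 0)                      # the affine point (5/3, 0), outside the circle
B1, B2 = (5, 3, 4), (5, 3, -4)     # the affine points (3/5, 4/5) and (3/5, -4/5)
```

Here `polar(CIRCLE, P)` is (-3, 5, 0), the line X = 3/5, which passes through $B_1$ and $B_2$; conversely the tangents at $B_1$ and $B_2$, obtained as `polar(CIRCLE, B1)` and `polar(CIRCLE, B2)`, both pass through P.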


25.5-1 Pole P and polar 
p when there are real 
tangents from P to the 
conic 


25.5-2 Pole P and polar p when there are no real tangents 
from P to the conic 


This conclusion makes it possible to construct the polar $p$ of any point P, even if no tangents to the conic pass through P. Let $p_1, p_2$ be two lines through P that cut the conic in $B_1, B_2$ and $B_3, B_4$, respectively. Then the point of intersection $P_1$ of the tangents at $B_1$ and $B_2$ and the point of intersection $P_2$ of the tangents at $B_3$ and $B_4$ are conjugate to any points of the lines $p_1$ and $p_2$, respectively, and therefore to their point of intersection P. Hence the line $p = P_1P_2$ is the polar of the point P (Fig.).

The line $l$ through the two points P and Q with coordinates $x_i$ and $y_i$ cuts the conic $\sum a_{ij}x_ix_j = 0$ at the points R, R' with the coordinates $r_i = x_it_1 + y_it_2$, which are determined by the equation

$$\sum_{i,j=0}^{2} a_{ij}r_ir_j = \sum_{i,j=0}^{2} a_{ij}(x_it_1 + y_it_2)(x_jt_1 + y_jt_2) = t_1^2\alpha + t_2^2\beta + 2t_1t_2\gamma = 0,$$

where $\alpha = \sum_{i,j=0}^{2} a_{ij}x_ix_j$, $\beta = \sum_{i,j=0}^{2} a_{ij}y_iy_j$ and $\gamma = \sum_{i,j=0}^{2} a_{ij}x_iy_j$. The two solutions $t_1 : t_2$ and $t_1' : t_2'$ of this quadratic equation differ only in sign if $\gamma = 0$, that is, if P and Q are conjugate. The four points P, Q, R, R' are then harmonic, since $(P, Q; R, R') = (t_2/t_1) : (t_2'/t_1') = -1$.


The polar p of a point P is the locus of points Q such that the points of intersection R and R’ of 
PQ with the conic separate P,Q harmonically (Fig.). 




Since conics, as sets of points, are curves of the second order, their duals are curves of the second
class, regarded as envelopes of their tangents. If P is a point of the conic $\sum a_{ij} x_i x_j = 0$, then the
tangent at P (as the polar) has the coordinates $u_i = \sum_j a_{ij} x_j$. Conversely, from the coordinates $u_i$
of the polar p one obtains the coordinates of P as $x_i = \sum_j b_{ij} u_j$ (i = 0, 1, 2); here the matrix $(b_{ij})$
is the inverse of the matrix $(a_{ij})$, because it describes the inverse mapping. If one puts these coordinates
into the equation of a curve of the second order, one obtains the equation $\sum b_{ij} u_i u_j = 0$ of a curve
of the second class.
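
As a small worked check of the passage from point coordinates to line coordinates, the sketch below takes an assumed conic matrix A and an assumed point P on it, computes the tangent u = A·P, and confirms that u satisfies the dual equation with the inverse matrix. The matrix, the point and all helper names are illustrative choices only.

```python
from fractions import Fraction

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def quad(m, v):
    """Quadratic form sum m[i][j] * v[i] * v[j]."""
    return sum(m[i][j] * v[i] * v[j] for i in range(3) for j in range(3))

def inverse3(m):
    """Exact inverse of a 3x3 matrix via cofactors."""
    def cofactor(i, j):
        r = [k for k in range(3) if k != i]
        c = [k for k in range(3) if k != j]
        minor = m[r[0]][c[0]] * m[r[1]][c[1]] - m[r[0]][c[1]] * m[r[1]][c[0]]
        return (-1) ** (i + j) * minor
    det = sum(m[0][j] * cofactor(0, j) for j in range(3))
    if det == 0:
        raise ValueError("degenerate conic: the matrix is singular")
    return [[Fraction(cofactor(j, i), det) for j in range(3)] for i in range(3)]

# Assumed example conic x0^2 + 2*x0*x1 + 2*x1^2 - x2^2 = 0 and a point on it.
A = [[1, 1, 0], [1, 2, 0], [0, 0, -1]]
P = [-1, 4, 5]

B = inverse3(A)          # matrix (b_ij) of the curve of the second class
u = mat_vec(A, P)        # line coordinates of the tangent at P
P_back = mat_vec(B, u)   # recovers the point of contact from the tangent
print(u, quad(B, u), P_back)
```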


25.5-3 Harmonic relation of two conjugate points and the points of intersection of the line joining them with the conic

25.5-4 Projective generation of the conic


Projective generation of a conic. The equation $\sum_{i,j=0}^{2} a_{ij} x_i x_j = 0$ of a conic takes a simple form if
two points A and B of the conic and the point of intersection C of the tangents at A and B are taken
as base-points of a system of coordinates, and therefore have the coordinates A(1, 0, 0), B(0, 1, 0)
and C(0, 0, 1). If one puts these coordinates $a_i$ of A and $b_i$ of B into the equation of the conic, one
obtains $a_{00} = 0$ and $a_{11} = 0$. Since C is conjugate to A and B, one has $\sum a_{ij} a_i c_j = 0$ and $\sum a_{ij} b_i c_j = 0$.
Putting in the values of $a_i$, $b_i$ and $c_i$, one obtains $a_{02} = a_{12} = 0$. In this coordinate system the equation
of the conic takes the form $2a_{01} x_0 x_1 + a_{22} x_2^2 = 0$, that is, $(2x_0/a_{22}) \cdot a_{01} x_1 + x_2^2 = 0$, or
$\xi_0 \xi_1 + \xi_2^2 = 0$, where $2x_0/a_{22} = \xi_0$, $a_{01} x_1 = \xi_1$ and $x_2 = \xi_2$, and thereby the unit point is
prescribed. This equation, written as $\xi_2/\xi_0 + \xi_1/\xi_2 = 0$, can be split into two equations (A) $\xi_1 v + \xi_2 u = 0$ and
(B) $\xi_0 u - \xi_2 v = 0$ by putting $\xi_2/\xi_0 = u/v$ and $\xi_1/\xi_2 = -u/v$. For various values of (u, v), each of
these equations represents a pencil of lines, which are therefore related by a projective mapping,
so that for the same values of (u, v) a point of the conic is determined as their point of intersection
(Fig.). The vertex of the pencil (A) is A, since $\xi_1 = 0$, $\xi_2 = 0$ for $a_1 = 0$, $a_2 = 0$, and the vertex
of the pencil (B) is B because $b_0 = 0$, $b_2 = 0$.

The line l = AB is then characterized by $\xi_2 = 0$, the line $l_1 = AC$ by $\xi_1 = 0$ and the line $l_2 = BC$
by $\xi_0 = 0$. The projective mapping of the pencil A onto the pencil B takes the line l = AB of A
(v = 0) into the line $l_2 = BC$ of B and the line $l_1 = AC$ of A (u = 0) into the line l = AB of B,
and therefore has no fixed line, that is, it is not perspective.

Conversely, two pencils with vertices A and B that are projectively, but not perspectively, related
generate a conic as the set of points of intersection of corresponding lines. If the line l = AB of the
pencil A goes into the line $l_2$ of the pencil B, and if l as a line of the pencil B is the image of the
line $l_1$ of the pencil A, then one can take the points A, B and $C = (l_1 \cap l_2)$ as base-points of a
coordinate system, whose unit point U is the point of intersection of any two corresponding lines p
and p'. In this coordinate system the line l is characterized by the equation $\xi_2 = 0$, the line $l_1$ by
$\xi_1 = 0$, the line $l_2$ by $\xi_0 = 0$ and the lines p and p' by $\xi_1 + \xi_2 = 0$ and $\xi_0 - \xi_2 = 0$. However,
the collineation of the pencil A into the pencil B that is determined by the correspondence of lines
$\xi_1 v + \xi_2 u = 0$ and $\xi_0 u - \xi_2 v = 0$ with the same ratio u : v, has the property that the three lines
l, $l_1$ and p of the first pencil go over into the three lines $l_2$, l and p' of the second pencil. According
to the main theorem, this coincides with the given collineation. By eliminating u and v from the
equations of the pencils one sees that the point of intersection of corresponding lines describes the
conic $\xi_0 \xi_1 + \xi_2^2 = 0$.
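
The elimination of u and v can be mirrored in a few lines of code. This is a minimal sketch, assuming homogeneous line coordinates stored as triples and using the cross product to intersect two lines; the helper names are illustrative.

```python
def cross(a, b):
    """Cross product: the point common to two lines (or the line joining two points)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_of_pencil_A(u, v):
    return (0, v, u)        # coefficients of xi_1*v + xi_2*u = 0

def line_of_pencil_B(u, v):
    return (u, 0, -v)       # coefficients of xi_0*u - xi_2*v = 0

def conic_value(xi):
    return xi[0] * xi[1] + xi[2] ** 2

# Intersect corresponding lines for a grid of ratios u : v.
points = [cross(line_of_pencil_A(u, v), line_of_pencil_B(u, v))
          for u in range(-3, 4) for v in range(-3, 4) if (u, v) != (0, 0)]
print(all(conic_value(p) == 0 for p in points))
```

Each intersection point has the coordinates (−v², u², −uv), so the conic equation is satisfied identically.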

In perspectively related pencils, corresponding lines meet on the line from which the projection
takes place, that is, a degenerate conic is formed.

To construct a conic that passes through three given non-collinear points A, B, P and has two lines
$l_1$ and $l_2$ as tangents at A and B, it is sufficient to form a projective mapping of the pencil A onto
the pencil B. According to the main theorem, such a collineation is uniquely determined by the
condition that the lines $l_1$, l = AB and p = AP of the pencil A correspond to the lines l, $l_2$ and
p' = BP of the pencil B. The points of intersection of corresponding lines in this collineation uniquely
generate a conic, which passes through the points A, B, P and has $l_1$ and $l_2$ as tangents at A and B
(Fig.).


25.5-5 Construction of a conic from three points and two tangents

25.5-6 Construction of a conic from five points no three of which are collinear


This collineation can be constructed by means of a central collineation between two lines a and b
intersecting at P, which cut the lines $l_1$ and l in points $L_1$ and L and the lines l and $l_2$ in points
L' and $L_2$, with centre $S = (L_1L' \cap LL_2)$.

If a conic is to be constructed through five given points S, T, U, V, W, no three of which are
collinear, one can pick out two of them as vertices of pencils that are projectively, but not
perspectively related (Fig.), for example, S with $s_1 = SU$, $s_2 = SV$, $s_3 = SW$ and T with $t_1 = TU$,
$t_2 = TV$, $t_3 = TW$. The collineation of the pencil S onto the pencil T that is uniquely determined
by the condition that the lines $s_i$ go into the lines $t_i$, then defines a conic, which consists of the
points of intersection of corresponding lines. This collineation can be represented as a central
collineation between two lines $l_1$ and $l_2$ intersecting at the point V. For example, if $l_1 = VW$ and
$l_2 = UV$ are chosen, then the centre $C = (s_1 \cap t_3)$ of this perspectivity is the point of intersection
of the lines $s_1$ and $t_3$. To a line $s_4$ cutting $l_1$ in a point $X_1$, there corresponds the line $t_4 = TX_2$,
where $X_2 = (l_2 \cap X_1C)$, and the two lines cut in a point X of the conic.

Since the collineation is uniquely determined by the five given points, and the conic is in turn 
uniquely determined by this collineation, this construction gives a unique conic through these five 
points. 
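
The five-point construction translates directly into joins and intersections in homogeneous coordinates. The sketch below is an illustration under stated assumptions: the five points are taken on the unit circle (so the result can be checked against $x^2 + y^2 = z^2$), the extra point Y that fixes the line $s_4 = SY$ is arbitrary, and all function names are hypothetical.

```python
from math import gcd

def cross(a, b):
    """Join of two points or intersection of two lines (homogeneous triples)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(p):
    """Lowest integer terms, with the last non-zero coordinate made positive."""
    g = gcd(gcd(abs(p[0]), abs(p[1])), abs(p[2]))
    q = tuple(c // g for c in p)
    last = next(c for c in reversed(q) if c != 0)
    return q if last > 0 else tuple(-c for c in q)

def conic_point(S, T, U, V, W, Y):
    """Point X of the conic through S, T, U, V, W that lies on the line s4 = SY."""
    l1, l2 = cross(V, W), cross(U, V)          # l1 = VW, l2 = UV
    C = cross(cross(S, U), cross(T, W))        # centre C = s1 ∩ t3
    s4 = cross(S, Y)
    X1 = cross(s4, l1)                         # s4 cuts l1 in X1
    X2 = cross(l2, cross(X1, C))               # X2 = l2 ∩ X1C
    t4 = cross(T, X2)                          # corresponding line of pencil T
    return normalize(cross(s4, t4))

# Assumed data: five points of the unit circle, written as (x, y, z).
S, T, U, V, W = (1, 0, 1), (0, 1, 1), (-1, 0, 1), (0, -1, 1), (3, 4, 5)
X = conic_point(S, T, U, V, W, Y=(0, 3, 1))
print(X)
```

Integer coordinates keep every step exact, so membership of X in the conic can be tested without rounding.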


There is exactly one conic that passes through five given points no three of which are collinear.


As the dual of the generation by projective pencils, a conic can be generated by means of
projectively related lines. The lines joining pairs of corresponding points of two lines l and l' envelop a
non-degenerate conic if the two lines are not perspectively related (Fig.). If the two lines are
perspectively related, the degenerate conic envelope consists of only two points, the centre of projection
and the point of intersection of the two lines.

The projective correspondence of the points of the lines l and l' is determined by three pairs of
points. If A, B, C on l and their images A', B', C' on l' are given, then this correspondence can be
similarly realized by means of a perspectivity between two pencils, whose vertices are, for example,
the points $S_1 = (AA' \cap BB')$ and $S_2 = (AA' \cap CC')$, with the line c = CB' as 'centre'. The point
P' corresponding to any further point P is determined by the fact that the point $(S_1P \cap S_2P')$ lies
on c. The lines AA', BB', CC', PP', ... are tangents of the conic.

25.5-7 The lines joining projectively corresponding points of the lines l and l' envelop a conic

Just as one can construct a conic through five points, so one can construct exactly one conic 
that touches five given lines no three of which are concurrent. 


Pascal's Theorem. If six points of a conic are regarded as vertices of a hexagon, then the points
of intersection of opposite sides are collinear (Fig.).


Let $A_1$, $B_2$, $C_1$, $A_2$, $B_1$, $C_2$ be the vertices of the hexagon, $A_3 = (B_1C_2 \cap B_2C_1)$,
$B_3 = (C_1A_2 \cap C_2A_1)$, $C_3 = (A_1B_2 \cap A_2B_1)$ the points of intersection of opposite sides and
$D = (A_1B_2 \cap A_2C_1)$ and $E = (A_1C_2 \cap B_2C_1)$ two further points of intersection. Then the pencils
with vertices $A_2$ and $C_2$ are projectively related by the fact that corresponding lines meet on the
conic. To this collineation there corresponds a collineation of the line $A_1B_2$ onto the line $C_1B_2$
under which the points $A_1$, $C_3$, D, $B_2$ go into the points E, $A_3$, $C_1$, $B_2$. Since the point $B_2$ is fixed,
the collineation is a central projection. The centre of this projection is the point of intersection of
the lines $A_1E$ and $DC_1$, that is, the point $B_3$. Hence the line $A_3C_3$ also passes through the point $B_3$.
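
Pascal's theorem lends itself to an exact numerical check. The sketch below assumes six rational points of the unit circle as hexagon vertices (an arbitrary illustrative choice) and tests the collinearity of $A_3$, $B_3$, $C_3$ by a vanishing determinant.

```python
def cross(a, b):
    """Join of two points or intersection of two lines (homogeneous triples)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def det3(a, b, c):
    """Determinant of the 3x3 matrix with rows a, b, c; zero means collinear."""
    w = cross(b, c)
    return a[0] * w[0] + a[1] * w[1] + a[2] * w[2]

def pascal_points(A1, B2, C1, A2, B1, C2):
    """Intersections of opposite sides of the hexagon A1 B2 C1 A2 B1 C2."""
    A3 = cross(cross(B1, C2), cross(B2, C1))
    B3 = cross(cross(C1, A2), cross(C2, A1))
    C3 = cross(cross(A1, B2), cross(A2, B1))
    return A3, B3, C3

# Assumed data: six points on the circle x^2 + y^2 = z^2.
hexagon = [(1, 0, 1), (0, 1, 1), (-1, 0, 1), (0, -1, 1), (3, 4, 5), (-4, 3, 5)]
A3, B3, C3 = pascal_points(*hexagon)
print(det3(A3, B3, C3))
```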


25.5-8 Pascal’s theorem 


25.5-9 Construction of a conic from five points by Pascal’s theorem 


Pascal’s theorem gives another possible construction for a conic through five points. Let $A_1$,
$B_1$, $C_1$, $A_2$, $B_2$ be these five points no three of which are collinear. Then the lines $A_1B_2$ and $A_2B_1$
meet in the point $C_3$. An arbitrary line l through $C_3$ meets the lines $C_1A_2$ and $C_1B_2$ in the points $B_3$
and $A_3$. Then the lines $A_1B_3$ and $B_1A_3$ meet in a point $C_2$, which describes a conic through $A_1$, $B_1$,
$C_1$, $A_2$, $B_2$ when l runs through all the lines through $C_3$ (Fig.).


Dual form of Pascal's theorem: Brianchon's theorem. If six tangents of a conic are regarded as 
the sides of a hexagon, then the lines joining opposite vertices are concurrent (Fig.). 




Dual to the points of intersection $A_3$, $B_3$, $C_3$ of opposite sides of the Pascal hexagon, one has now
the lines $a_3$, $b_3$, $c_3$ joining opposite vertices of the Brianchon hexagon. Dual to the Pascal line p
through $A_3$, $B_3$ and $C_3$ one has the Brianchon point P as the point of intersection of $a_3$, $b_3$, $c_3$.

Brianchon’s theorem gives a method, suitable for a draughtsman, of constructing a conic with
five given lines as tangents. No three of the five given lines $a_1$, $a_2$, $b_1$, $b_2$, $c_1$ should be concurrent
(Fig.). Of the three lines joining the points of intersection of opposite tangents, only one,
$c_3 = [(a_1 \cap b_2), (b_1 \cap a_2)]$, is fixed. The other two lines $a_3$ and $b_3$ are determined by any position of the
Brianchon point P on $c_3$, and by means of their points of intersection with the tangents $b_1$ and
$a_1$ the sixth tangent $c_2$ is fixed, as the line through these points. This construction gives the conic as
the envelope of all lines which, together with the five given lines, make up a Brianchon hexagon.


25.5-10 Brianchon’s theorem 


25.5-11 Construction of a conic from five tangents by Brianchon’s theorem 


26. Differential geometry, convex bodies, integral geometry 


26.1. Differential geometry
      Theory of curves in Euclidean space
      Theory of surfaces in Euclidean space
      Klein’s Erlangen programme
      Riemannian geometry
26.2. Convex bodies
26.3. Integral geometry


26.1. Differential geometry 


In differential geometry the concepts and methods of analysis, particularly of differential calculus 
and the theory of differential equations, are applied to the study of geometric figures. The under- 
lying geometrical spaces or manifolds must, as in analytical geometry, be referred to coordinates. 
Other geometrical figures are embedded in these spaces, for example, general curves or curved 
surfaces, which are characterized by sufficiently differentiable equations or functions. To understand 
the more advanced parts of differential geometry, one must be fully conversant with the tensor 
calculus; furthermore, a knowledge of topology and some other branches of mathematics is essential. 


Theory of curves in Euclidean space 


Definition of a curve. Let $\mathbf{e}_i$ (i = 1, 2, 3) be three pairwise orthogonal unit vectors, which form
an orthonormal trihedron of 3-dimensional Euclidean space $E_3$, and let $x_i$ (i = 1, 2, 3) be Cartesian
coordinates with respect to this trihedron. By a parametric representation of a portion of a curve one
understands the specification of the coordinates $x_i$ of a point of the curve as functions of a real
parameter t on an interval $a \le t \le b$: $x_i = f_i(t)$ (i = 1, 2, 3). These three equations are usually
summed up in one vector equation

$$\mathbf{x} = \mathbf{x}(t) = \sum_{i=1}^{3} f_i(t)\,\mathbf{e}_i.$$




The functions $f_i(t)$ must be continuously differentiable sufficiently often; it is usually sufficient to
postulate the existence and continuity of the first three derivatives. By a curve one understands a
connected set of points C such that for any point P of C there is a neighbourhood U such that the
points of C lying in U can be represented as a portion of a curve. The parameter of a portion of a
curve can be chosen almost arbitrarily; if t is a parameter, one obtains any other parameter t' by
means of a parameter transformation $t' = \varphi(t)$, where the function $\varphi(t)$ is continuously differentiable
sufficiently often and its derivative $d\varphi/dt$ is never zero. Of importance are only the geometrical
properties of the curve that are independent of the special choice of the parameter, and not the more
incidental analytical form of the representation. Often the coordinate system in Euclidean
3-dimensional space $E_3$ and the parameter t on the curve C can be chosen so that the functions representing C
are as simple as possible and the calculations become easier.

A curve can also be given by an implicit representation, by means of two independent equations
of the form $g(x_1, x_2, x_3) = 0$, $h(x_1, x_2, x_3) = 0$, that is, geometrically as the intersection of two
surfaces g = 0, h = 0. One of the simplest space curves is the circular helix, which can be represented
in the form

$$\mathbf{x}(t) = a(\mathbf{e}_1 \cos t + \mathbf{e}_2 \sin t) + b t\,\mathbf{e}_3.$$

The thread of a screw is a circular helix, where 2a is the diameter and $2\pi b$ the pitch of the screw (Fig.).

26.1-1 Circular helix

26.1-2 Normal plane N at a point $P_0$


Tangents. If one draws a line (secant) through two points $P_1$, $P_2$ of the curve C and then lets
$P_1$ and $P_2$ tend to a fixed point $P_0$ of C with position vector $\mathbf{x}_0 = \mathbf{x}(t_0)$, the secant tends to a line
through $P_0$, which is called the tangent to C at $P_0$. The existence of the tangent is ensured by the
differentiability condition imposed above if the point $P_0$ is regular, that is, if at least one of the
derivatives $df_i(t_0)/dt$ is non-zero; in vector form,

$$\dot{\mathbf{x}}_0 = \frac{d\mathbf{x}(t_0)}{dt} = \sum_{i=1}^{3} \frac{df_i(t_0)}{dt}\,\mathbf{e}_i \ne \mathbf{o}.$$

Non-regular points are called singular, and their properties must always be treated separately.

At a regular point $P_0$, $\dot{\mathbf{x}}_0$ is a direction vector of the tangent at $P_0$. For the position vector y of a
point on the tangent one obtains the given equation by introducing a parameter $\lambda$ ($-\infty < \lambda < \infty$).

Equation of the tangent: $\mathbf{y} = \mathbf{x}_0 + \lambda \dot{\mathbf{x}}_0$

The plane through $P_0$ perpendicular to the tangent is called the normal plane N of C at $P_0$ (Fig.).
If z is the position vector of a point of it, and $\mathbf{a} \cdot \mathbf{b}$ denotes the scalar product of the vectors a and b,
then the equation of the normal plane N is:

Equation of the normal plane: $(\mathbf{z} - \mathbf{x}_0) \cdot \dot{\mathbf{x}}_0 = 0$


Osculating plane. Suppose that the curve C is not a straight line. Then, in general, three arbitrary
points $P_1$, $P_2$, $P_3$ of it do not lie on one line. Any three such points therefore determine a plane.
If $P_1$, $P_2$, $P_3$ tend to the same point $P_0$ of C, then in the limit their plane converges to a plane
through $P_0$, which is called the osculating plane T of C at $P_0$ (Fig.). Its existence is ensured if the
first two derivatives of the position vector x(t) are linearly independent for $t = t_0$, that is, if $\dot{\mathbf{x}}_0 \times \ddot{\mathbf{x}}_0 \ne \mathbf{o}$.
Here

$$\ddot{\mathbf{x}}_0 = \frac{d^2\mathbf{x}(t_0)}{dt^2} = \sum_{i=1}^{3} \frac{d^2 f_i(t_0)}{dt^2}\,\mathbf{e}_i,$$

and $\mathbf{a} \times \mathbf{b}$ denotes the vector product of the vectors a and b.


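If z is the position vector of a point of the osculating plane T and (a, b, c) denotes the scalar triple
product of the vectors a, b and c, that is, $(\mathbf{a}, \mathbf{b}, \mathbf{c}) = (\mathbf{a} \times \mathbf{b}) \cdot \mathbf{c}$, then the equation of the osculating
plane is

Equation of the osculating plane: $(\mathbf{z} - \mathbf{x}_0, \dot{\mathbf{x}}_0, \ddot{\mathbf{x}}_0) = 0$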


The curve has contact of the first order with its tangents, that is, for a suitable choice of the parameter
the first derivatives of the curve and the tangent coincide at the point of contact. The osculating plane
can be defined as the plane that has contact of the second order with the curve at the point $P_0$,
that is, the first two derivatives $\dot{\mathbf{x}}_0$, $\ddot{\mathbf{x}}_0$ must lie in it. If $\dot{\mathbf{x}}_0 \times \ddot{\mathbf{x}}_0 \ne \mathbf{o}$, the osculating plane is uniquely
determined. It is the plane that is spanned by the vectors $\dot{\mathbf{x}}(t_0)$ and $\ddot{\mathbf{x}}(t_0)$ at $P_0$. The plane
perpendicular to the osculating plane and normal plane is called the rectifying plane R at $P_0$. If z is the
position vector of a point of it, then one obtains:

Equation of the rectifying plane: $(\mathbf{z} - \mathbf{x}_0, \dot{\mathbf{x}}_0, \dot{\mathbf{x}}_0 \times \ddot{\mathbf{x}}_0) = 0$

26.1-3 Osculating plane T at a point $P_0$
26.1-4 Moving trihedron of a curve


Normals. Any line that lies in the normal plane and passes through $P_0$ is called a normal of C
at $P_0$. The normal lying in the osculating plane is called the principal normal to C at $P_0$, and the
normal lying in the rectifying plane is called the binormal. A direction vector of the principal normal
is $(\dot{\mathbf{x}}_0 \times \ddot{\mathbf{x}}_0) \times \dot{\mathbf{x}}_0$, and a direction vector of the binormal is $\dot{\mathbf{x}}_0 \times \ddot{\mathbf{x}}_0$ (if $\dot{\mathbf{x}}_0 \times \ddot{\mathbf{x}}_0 \ne \mathbf{o}$). If at each
point P of the curve one draws three vectors t, n, b of length 1 in the directions of the tangent,
principal normal and binormal to C, then one has an orthonormal trihedron, which is called the (uniquely
determined) moving trihedron of the curve. The idea of a moving trihedron (or n-hedron) has proved
very fruitful not only in the theory of curves, but in differential geometry generally (Fig.).
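
The direction vectors just given are enough to compute the moving trihedron from the first two derivatives. The following is a minimal sketch for the circular helix x(t) = (a cos t, a sin t, bt); the values a = 2, b = 1 and the function names are assumptions made for illustration.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def unit(a):
    length = math.sqrt(dot(a, a))
    return tuple(c / length for c in a)

def helix_derivatives(a, b, t):
    """First and second derivatives of x(t) = (a cos t, a sin t, b t)."""
    d1 = (-a * math.sin(t), a * math.cos(t), b)
    d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
    return d1, d2

def moving_trihedron(d1, d2):
    """Unit tangent, principal normal and binormal from the first two derivatives."""
    w = cross(d1, d2)                      # direction of the binormal
    if dot(w, w) == 0:
        raise ValueError("derivatives are linearly dependent")
    return unit(d1), unit(cross(w, d1)), unit(w)

t_vec, n_vec, b_vec = moving_trihedron(*helix_derivatives(2.0, 1.0, 0.0))
print(t_vec, n_vec, b_vec)
```

For the helix the principal normal comes out pointing from the curve towards the axis, as one expects for a screw thread.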


Arc length. The length of a polygon in $E_3$ can be defined as the sum of the lengths of its segments
$\Delta\mathbf{x}$. The curves considered in differential geometry can be approximated arbitrarily closely by
polygons. Then the length $\sum |\Delta\mathbf{x}|$ of the approximating polygon tends to a limiting value l, which is called
the length of the curve. For a portion of a curve C with the parametric representation $\mathbf{x} = \mathbf{x}(t)$, $0 \le t \le a$,
one has $\Delta\mathbf{x} = \mathbf{x}(t + \Delta t) - \mathbf{x}(t)$, and it can be proved, under the usual differentiability conditions,
that the length of a portion of a curve, as described above, is equal to the integral

$$l = \int_0^a |\dot{\mathbf{x}}|\,dt = \int_0^a \Big(\sum_{i=1}^{3} \dot{f}_i^2\Big)^{1/2} dt;$$

here $|\mathbf{x}| = \sqrt{\mathbf{x} \cdot \mathbf{x}}$ denotes the length of the vector x. If one considers only the arc $C_t$ from the
point with the parameter 0 to the point with the parameter t, then the length s of this arc is a function of t:

$$s = s(t) = \int_0^t |\dot{\mathbf{x}}(\tau)|\,d\tau.$$

If C is regular, it follows that $ds/dt = |\dot{\mathbf{x}}(t)| > 0$, and s can be introduced as a new parameter,
which is characterized geometrically. This parameter s is called the arc length of C or the natural
parameter (Fig.). The derivative of the position vector with respect to arc length is the unit tangent vector

$$\mathbf{t} = \mathbf{x}' = d\mathbf{x}/ds, \quad |\mathbf{x}'| = 1.$$

26.1-5 Definition of arc length


It follows that $\mathbf{x}' \cdot \mathbf{x}'' = 0$, that is, $\mathbf{x}'' = d^2\mathbf{x}/ds^2$ is perpendicular to $\mathbf{x}'$ and is therefore a direction
vector of the principal normal, provided that $\mathbf{x}'' \ne \mathbf{o}$.
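
The arc-length integral is easy to evaluate numerically. The sketch below uses the composite Simpson rule (an assumed choice of quadrature, not taken from the text) on one turn of a circular helix with a = 2, b = 1, for which the speed is constant and the length is $2\pi\sqrt{a^2 + b^2}$.

```python
import math

def speed(dx, t):
    """Length of the derivative vector dx(t)."""
    return math.sqrt(sum(c * c for c in dx(t)))

def arc_length(dx, t0, t1, n=1000):
    """Composite Simpson rule for the integral of |dx/dt| over [t0, t1]; n must be even."""
    h = (t1 - t0) / n
    total = speed(dx, t0) + speed(dx, t1)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * speed(dx, t0 + k * h)
    return total * h / 3

a, b = 2.0, 1.0   # assumed radius and pitch parameter of the helix

def helix_dx(t):
    return (-a * math.sin(t), a * math.cos(t), b)

one_turn = arc_length(helix_dx, 0.0, 2 * math.pi)
print(one_turn, 2 * math.pi * math.hypot(a, b))
```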




Curvature. Let $\mathbf{x} = \mathbf{x}(s)$, $0 \le s \le l$, be a portion of a curve C in terms of the arc length. The tangent
vectors t(s) and $\mathbf{t}(s + \Delta s)$ at the points P(s) and $P(s + \Delta s)$ with parameters s and $s + \Delta s$ form an
angle $\Delta\alpha$ with one another. If now $\Delta s$ tends to zero, then there exists the limiting value

$$\lim_{\Delta s \to 0} \frac{\Delta\alpha}{\Delta s} = \kappa(s),$$

which is called the curvature of C at the point P(s) (Fig.). For a straight line one always has $\Delta\alpha = 0$, that
is, the curvature $\kappa(s)$ is identically zero.

The curvature is therefore a measure of the deviation of the form of the curve from a straight line. If
s is the arc length of C, then $\kappa(s) = |\mathbf{x}''(s)|$.

If n denotes the suitably oriented unit vector in the direction of the principal normal, then
$\mathbf{x}'' = \kappa(s)\,\mathbf{n}$. Hence $\mathbf{x}''$ is called the curvature vector.

26.1-6 Definition of curvature


Torsion. The osculating plane of a plane curve at any point is the same as the plane in which
the curve lies. The unit binormal vector $\mathbf{b} = \dfrac{\dot{\mathbf{x}} \times \ddot{\mathbf{x}}}{|\dot{\mathbf{x}} \times \ddot{\mathbf{x}}|}$ of a plane curve is therefore constant (and
vice versa). The variation of the binormal vector b, which is perpendicular to the osculating plane,
is therefore a measure of the variation of the osculating plane, and also a measure of the deviation
of C from its projection onto the osculating plane at the point of C under discussion. If $\Delta\beta$
denotes the angle between the binormal vectors b(s) and $\mathbf{b}(s + \Delta s)$ at the points P(s) and
$P(s + \Delta s)$, then there exists, in general, the limiting value

$$\lim_{\Delta s \to 0} \frac{\Delta\beta}{\Delta s} = \tau(s),$$

which is called the torsion of C at the point P(s).
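
Both limits can be approximated directly from their definitions. The sketch below does this for a circular helix with assumed parameters a = 2, b = 1; since an angle between two vectors is non-negative, the second quotient yields only the absolute value of the torsion. For this helix the known values are $\kappa = a/(a^2 + b^2)$ and $|\tau| = b/(a^2 + b^2)$.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def unit(a):
    n = norm(a)
    return tuple(c / n for c in a)

def angle(p, q):
    """Angle between two vectors, computed stably for small angles."""
    return math.atan2(norm(cross(p, q)), dot(p, q))

A, B = 2.0, 1.0            # assumed helix parameters
SPEED = math.hypot(A, B)   # ds/dt for the helix, constant

def tangent(t):
    return unit((-A * math.sin(t), A * math.cos(t), B))

def binormal(t):
    d1 = (-A * math.sin(t), A * math.cos(t), B)
    d2 = (-A * math.cos(t), -A * math.sin(t), 0.0)
    return unit(cross(d1, d2))

def curvature_and_abs_torsion(t, dt=1e-4):
    """Difference quotients (angle)/(arc length) for tangent and binormal."""
    ds = SPEED * dt
    kappa = angle(tangent(t), tangent(t + dt)) / ds
    tau_abs = angle(binormal(t), binormal(t + dt)) / ds
    return kappa, tau_abs

kappa, tau_abs = curvature_and_abs_torsion(0.7)
print(kappa, tau_abs)
```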


Natural equations. Curvature, torsion, and arc length are invariants for a curve under Euclidean
motions, that is, if a curve (made of wire, for example) is moved as a rigid body in space, then the
curvature, torsion, and arc length do not change. In addition, these quantities do not depend on
the arbitrary choice of parametric representation $\mathbf{x} = \mathbf{x}(t)$, hence are also invariant under parameter
transformations. These two invariance properties follow immediately from the definitions given
above. The three quantities s, $\kappa$, $\tau$ are connected by the two equations

$$\kappa = \kappa(s) > 0, \quad \tau = \tau(s),$$

which are called the natural equations of the curve. The following theorem can be regarded as the
main result of curve theory; it states that $\kappa(s)$ and $\tau(s)$ form a complete system of invariants for C:

For any given continuous functions $\kappa = \kappa(s) > 0$ and $\tau = \tau(s)$, there is one, and, to within a Euclidean
motion, only one curve C such that $\kappa(s)$ is the curvature and $\tau(s)$ the torsion of C.

The Frenet formulae. The proof of this theorem is conducted by the method of the moving
trihedron (Fig.). For the variation of the vectors t(s), n(s), b(s) of the moving trihedron the following
Frenet formulae hold:

$$\mathbf{t}' = \kappa\,\mathbf{n}, \quad \mathbf{n}' = -\kappa\,\mathbf{t} + \tau\,\mathbf{b}, \quad \mathbf{b}' = -\tau\,\mathbf{n}.$$

26.1-7 Resolution of $d\mathbf{n}/ds$ with respect to the moving trihedron t, n, b


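The first of these equations has already been proved, because $d\mathbf{t}/ds = \mathbf{x}''$ is the curvature vector.
Since n, t, b are pairwise perpendicular unit vectors, one has $\mathbf{n} \cdot \mathbf{n} = \mathbf{b} \cdot \mathbf{b} = \mathbf{t} \cdot \mathbf{t} = 1$ and
$\mathbf{n} \cdot \mathbf{b} = \mathbf{n} \cdot \mathbf{t} = \mathbf{b} \cdot \mathbf{t} = 0$; by differentiation $\mathbf{b}' \cdot \mathbf{b} = 0$, $\mathbf{n}' \cdot \mathbf{n} = 0$, $\mathbf{t}' \cdot \mathbf{b} = -\mathbf{t} \cdot \mathbf{b}'$, $\mathbf{n}' \cdot \mathbf{t} = -\mathbf{t}' \cdot \mathbf{n}$,
$\mathbf{n}' \cdot \mathbf{b} = -\mathbf{b}' \cdot \mathbf{n}$. By scalar multiplication of $\mathbf{b}' = \alpha_3 \mathbf{t} + \beta_3 \mathbf{n} + \gamma_3 \mathbf{b}$ and $\mathbf{n}' = \alpha_2 \mathbf{t} + \beta_2 \mathbf{n} + \gamma_2 \mathbf{b}$ by
suitable vectors n, t or b, one can determine the components $\alpha_i$, $\beta_i$, $\gamma_i$ (i = 3, 2). One obtains:
$\mathbf{b}' \cdot \mathbf{b} = \gamma_3 = 0$; $\mathbf{b}' \cdot \mathbf{t} = \alpha_3 = -\mathbf{t}' \cdot \mathbf{b} = -\kappa\,\mathbf{n} \cdot \mathbf{b} = 0$; hence $\mathbf{b}' = \beta_3 \mathbf{n}$. By the definition of torsion it
follows that $|\mathbf{b}'| = |\beta_3| = |\tau|$. For consistency with $\tau(s) = \lim_{\Delta s \to 0} \Delta\beta/\Delta s$, one must put $\mathbf{b}' = -\tau(s)\,\mathbf{n}$.

For the second equation: $\beta_2 = \mathbf{n}' \cdot \mathbf{n} = 0$; $\alpha_2 = \mathbf{n}' \cdot \mathbf{t} = -\mathbf{t}' \cdot \mathbf{n} = -\kappa(s)$ and $\gamma_2 = \mathbf{n}' \cdot \mathbf{b} = -\mathbf{b}' \cdot \mathbf{n}
= \tau(s)$, that is, $\mathbf{n}' = -\kappa\,\mathbf{t} + \tau\,\mathbf{b}$.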


For given continuous functions $\kappa(s) > 0$ and $\tau(s)$ the Frenet formulae are a system of linear
differential equations to determine t, n, b. Once t(s) has been obtained, the curve can be found by
integration of $d\mathbf{x}/ds = \mathbf{t}(s)$; for example, the circular helices are characterized as the curves whose
curvature and torsion are constant.
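
The remark that a curve can be recovered from its natural equations can be illustrated by integrating the Frenet system numerically. The sketch below assumes the classical fourth-order Runge-Kutta method and a standard starting trihedron at the origin, neither of which is prescribed by the text. For constant $\kappa$ and $\tau$ the vector $\tau\mathbf{t} + \kappa\mathbf{b}$ has zero derivative by the Frenet formulae, so it should stay fixed along the curve; it points along the axis of the helix.

```python
import math

def frenet_rhs(y, kappa, tau):
    """Right-hand side of x' = t, t' = kappa n, n' = -kappa t + tau b, b' = -tau n."""
    t, n, b = y[3:6], y[6:9], y[9:12]
    dx = list(t)
    dt = [kappa * c for c in n]
    dn = [-kappa * tc + tau * bc for tc, bc in zip(t, b)]
    db = [-tau * c for c in n]
    return dx + dt + dn + db

def rk4_step(y, s, h, kappa_fn, tau_fn):
    """One classical Runge-Kutta step of length h."""
    def f(s_, y_):
        return frenet_rhs(y_, kappa_fn(s_), tau_fn(s_))
    def shifted(k, c):
        return [yi + c * ki for yi, ki in zip(y, k)]
    k1 = f(s, y)
    k2 = f(s + h / 2, shifted(k1, h / 2))
    k3 = f(s + h / 2, shifted(k2, h / 2))
    k4 = f(s + h, shifted(k3, h))
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def integrate_frenet(kappa_fn, tau_fn, length, steps):
    """State [x, t, n, b] (12 numbers) at arc length `length`."""
    y = [0.0, 0.0, 0.0,  1.0, 0.0, 0.0,  0.0, 1.0, 0.0,  0.0, 0.0, 1.0]
    h = length / steps
    for i in range(steps):
        y = rk4_step(y, i * h, h, kappa_fn, tau_fn)
    return y

KAPPA, TAU = 0.4, 0.2   # assumed constant curvature and torsion
state = integrate_frenet(lambda s: KAPPA, lambda s: TAU, 10.0, 1000)
t_end, b_end = state[3:6], state[9:12]
axis = [TAU * tc + KAPPA * bc for tc, bc in zip(t_end, b_end)]
t_length = math.sqrt(sum(c * c for c in t_end))
print(state[0:3], axis, t_length)
```

Passing non-constant functions for `kappa_fn` and `tau_fn` gives the curve belonging to other natural equations, up to a Euclidean motion fixed by the starting trihedron.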


Theory of surfaces in Euclidean space

Definition of a surface. If the three coordinates $x_i$ (i = 1, 2, 3) of a point of $E_3$ are given as
functions of two parameters u and v, $x_i = f_i(u, v)$ (i = 1, 2, 3), or in vector form,

$$\mathbf{x} = \mathbf{x}(u, v) = \sum_{i=1}^{3} f_i(u, v)\,\mathbf{e}_i,$$

where u and v vary in a certain domain D of a plane, then this is called a parametric representation
of a portion of a surface.

A connected point set S of $E_3$ is called a surface if for each point P of S there is a neighbourhood U
such that the points of S that lie in U have a parametric representation as a portion of a surface.
If the values of the parameters u and v (also known as coordinates on the surface) are given, then the
position of the point on the portion of the surface is uniquely determined; for example, a point on
the earth’s surface is fixed by its latitude and longitude. Within wide limits, the parameters of a portion
of a surface can be chosen arbitrarily; instead of u and v one can take as parameters u' and v' in a
domain D' of the plane if there is a one-to-one transformation of the form

$$u' = u'(u, v), \quad v' = v'(u, v) \quad \text{with the determinant} \quad \begin{vmatrix} \partial u'/\partial u & \partial u'/\partial v \\ \partial v'/\partial u & \partial v'/\partial v \end{vmatrix} \ne 0;$$

this is called a parameter transformation. The geometrical concepts of the theory of surfaces must
be invariant under Euclidean motions and under parameter transformations. A surface can also be
given implicitly by an equation $g(x_1, x_2, x_3) = 0$.


Tangent plane. In order to investigate properties of a surface in the neighbourhood of a point $P_0$
with the parameters $u_0$ and $v_0$, one considers curves that lie on the surface and pass through $P_0$.
An arbitrary curve of this kind can be given in a parametric representation of the form

$$\mathbf{x}(t) = \mathbf{x}(u_0 + u(t), v_0 + v(t)),$$

where u(0) = v(0) = 0. In the special case $u = u_0 + t$, $v = v_0$, that is, v(t) = 0 for all t, one obtains
the parameter curve through $P_0$, along which $v = v_0$ is constant. Similarly, if $u = u_0$, $v = v_0 + t$,
one has the other parameter curve, along which $u = u_0$ is constant. The point $P_0$ is the point of
intersection of these parameter curves. In geographical coordinates on the earth’s surface, the
meridians and circles of latitude are the corresponding parameter curves.

The tangent vector of a curve through $P_0$ at the point t = 0, that is, at $P_0$, is obtained by
differentiation of the parametric representation

$$\frac{d\mathbf{x}}{dt}\Big|_{t=0} = \frac{\partial \mathbf{x}_0}{\partial u} \frac{du(0)}{dt} + \frac{\partial \mathbf{x}_0}{\partial v} \frac{dv(0)}{dt},$$

where $\dfrac{\partial \mathbf{x}_0}{\partial u} = \dfrac{\partial \mathbf{x}(u_0, v_0)}{\partial u}$ and $\dfrac{\partial \mathbf{x}_0}{\partial v} = \dfrac{\partial \mathbf{x}(u_0, v_0)}{\partial v}$. From these formulae one sees that the vectors
$\partial \mathbf{x}_0/\partial u$ and $\partial \mathbf{x}_0/\partial v$ are tangent vectors to the parameter curves. If these two vectors are linearly
independent, then all the tangent vectors to curves that lie on the surface and pass through $P_0$ lie
in the plane through $P_0$ spanned by $\partial \mathbf{x}_0/\partial u$ and $\partial \mathbf{x}_0/\partial v$. This plane is called the tangent plane to the
surface at $P_0$. If a and b are parameters of points of the tangent plane and z is its position vector,
then one obtains a parametric representation of the tangent plane:

$$\mathbf{z} = \mathbf{x}_0 + a \frac{\partial \mathbf{x}_0}{\partial u} + b \frac{\partial \mathbf{x}_0}{\partial v}.$$


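These formulae make sense only if $\partial \mathbf{x}_0/\partial u$ and $\partial \mathbf{x}_0/\partial v$ are linearly independent; points at which this
is so are called regular; if it is not so, the point is called singular. Accordingly a point $P_0$ of S is regular
if and only if $\dfrac{\partial \mathbf{x}_0}{\partial u} \times \dfrac{\partial \mathbf{x}_0}{\partial v} \ne \mathbf{o}$.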


In geographical coordinates on the earth’s surface, the poles are singular. The vertex of a circular
cone is singular for any parametric representation, because no tangent plane to the cone exists at
this point. In differential geometry one deals almost exclusively with regular surfaces; singular
points must always be treated separately.

The line through Pp perpendicular to the tangent plane 
is called the normal (Fig.). 
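
The regularity criterion and the surface normal can be tried out on the sphere in geographical coordinates. The sketch below approximates the partial derivatives by central differences (an assumed numerical shortcut) and reports a point as singular when the vector product of the two tangent vectors is numerically zero; the step size, the tolerance and all names are illustrative.

```python
import math

def sphere(u, v, R=1.0):
    """Geographical parametrisation: u = longitude, v = latitude."""
    return (R * math.cos(v) * math.cos(u),
            R * math.cos(v) * math.sin(u),
            R * math.sin(v))

def partials(x, u, v, h=1e-6):
    """Central-difference approximations of dx/du and dx/dv at (u, v)."""
    xu = tuple((p - m) / (2 * h) for p, m in zip(x(u + h, v), x(u - h, v)))
    xv = tuple((p - m) / (2 * h) for p, m in zip(x(u, v + h), x(u, v - h)))
    return xu, xv

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit_normal(x, u, v, tol=1e-8):
    """Unit normal at a regular point, or None at a singular point."""
    xu, xv = partials(x, u, v)
    w = cross(xu, xv)
    length = math.sqrt(sum(c * c for c in w))
    if length < tol:
        return None
    return tuple(c / length for c in w)

normal = unit_normal(sphere, 0.3, 0.5)
at_pole = unit_normal(sphere, 0.3, math.pi / 2)
print(normal, at_pole)
```

On the unit sphere the normal at a regular point coincides with the position vector, while at the pole the two tangent vectors of this parametrisation become linearly dependent.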


26.1-8 Tangent plane and surface normal 


Intrinsic geometry. In the early years of differential geometry, surfaces were regarded as the 
outer boundaries of solid bodies, or as ‘infinitely thin’ solid bodies, embedded in 3-dimensional 
Euclidean space. The geometer Gaspard MONGE (1746-1818) can be regarded as a founder of this
way of thinking; he wrote the first textbook on differential geometry (Application de l’Analyse à la
Géométrie, Paris 1809). In connection with practical questions of geodesy, GAUSS posed the question:
how can one draw conclusions about the spatial form of a surface from measurements on the sur- 
face itself? — This problem is of great importance for the determination of the form of the earth, 
which was regarded originally as a sphere, then as a flattened ellipsoid of rotation and today as 
a surface that cannot be represented in an elementary way, a so-called geoid. The investigation 
of these questions led Gauss to the intrinsic geometry of the surface; he described this in his 
treatise Disquisitiones generales circa superficies curvas (1827). In this part of surface theory, sur- 
faces are regarded not as solid bodies, but as skins, which can be bent, but not stretched. By a 
bending of a surface one understands a continuous deformation of the surface under which the 
lengths of all curves on the surface remain fixed. More generally, two surfaces S and S’ are called
isometric if there is a one-to-one correspondence $P' = \varphi(P)$ between the points P of S and the
points P’ of S’ such that curves that are transformed into one another by the correspondence
have the same length. The correspondence $\varphi$ is then called an isometric mapping or isometry; for
example, a cone can be isometrically mapped onto a region of the plane if it is cut along one of its
generators. A bending of a surface S into a surface S’ is also an isometric mapping of S onto S’; 
however, two isometric surfaces need not have a continuous bending from one to the other. Properties 
of surfaces that do not change under isometric mappings can be established by measurements on 
the surface; they form the content of the intrinsic geometry of the surface. In this sense, plane geometry 
is the intrinsic geometry of the plane and spherical trigonometry is the intrinsic geometry of the 
sphere. 


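Element of arc of a surface. The intrinsic geometry is completely governed by the element of arc
of the surface. Let $\mathbf{x} = \mathbf{x}(t) = \mathbf{x}(u(t), v(t))$, $t_0 \le t \le t_1$, be a curve C lying on the surface S. By
differentiating with respect to t one obtains the tangent vector

$$\dot{\mathbf{x}} = \frac{d\mathbf{x}}{dt} = \frac{\partial \mathbf{x}}{\partial u} \frac{du}{dt} + \frac{\partial \mathbf{x}}{\partial v} \frac{dv}{dt}.$$

From the definition of arc length s(t) of C one has $(ds/dt)^2 = |\dot{\mathbf{x}}|^2 = \dot{\mathbf{x}} \cdot \dot{\mathbf{x}}$. Substituting the above
expression for $\dot{\mathbf{x}}$ one obtains

$$\Big(\frac{ds}{dt}\Big)^2 = \frac{\partial \mathbf{x}}{\partial u} \cdot \frac{\partial \mathbf{x}}{\partial u} \Big(\frac{du}{dt}\Big)^2 + 2\,\frac{\partial \mathbf{x}}{\partial u} \cdot \frac{\partial \mathbf{x}}{\partial v}\,\frac{du}{dt} \frac{dv}{dt} + \frac{\partial \mathbf{x}}{\partial v} \cdot \frac{\partial \mathbf{x}}{\partial v} \Big(\frac{dv}{dt}\Big)^2.$$

If, following GAUSS, one introduces the notation

$$E = \frac{\partial \mathbf{x}}{\partial u} \cdot \frac{\partial \mathbf{x}}{\partial u}, \quad F = \frac{\partial \mathbf{x}}{\partial u} \cdot \frac{\partial \mathbf{x}}{\partial v}, \quad G = \frac{\partial \mathbf{x}}{\partial v} \cdot \frac{\partial \mathbf{x}}{\partial v},$$

and instead of writing the derivatives with respect to t one writes only the differentials, then one
obtains the element of arc or the first fundamental form of the surface in the form

$$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2.$$

The length l of the curve C is expressed in terms of the element of arc as follows:

$$l = \int_{t_0}^{t_1} \sqrt{\dot{\mathbf{x}} \cdot \dot{\mathbf{x}}}\,dt = \int_{t_0}^{t_1} \sqrt{E \Big(\frac{du}{dt}\Big)^2 + 2F\,\frac{du}{dt} \frac{dv}{dt} + G \Big(\frac{dv}{dt}\Big)^2}\,dt;$$

in the integration naturally one has to substitute for the arguments u and v of E, F, G the equations
u = u(t), v = v(t) of the curve.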

By means of the first fundamental form one can not only calculate arc lengths, but also define 
and determine all the quantities that can be found by measurements on the surface; for example, 
the angle between two curves that lie on the surface S and meet at a point Pp of S, and also the 
area of a point set lying on S can be defined in this way. 

The area A(U) of a domain U of S is

$$A(U) = \iint (EG - F^2)^{1/2}\,du\,dv.$$

The integrand $dA = (EG - F^2)^{1/2}\,du\,dv$ is called the element of area of S, and can be thought of
intuitively as the area of an infinitely small mesh of parameter curves (Fig.); $(EG - F^2)^{1/2}\,\Delta u\,\Delta v$ is the
area of the parallelogram spanned by the vectors $\dfrac{\partial \mathbf{x}}{\partial u} \Delta u$ and $\dfrac{\partial \mathbf{x}}{\partial v} \Delta v$. In the calculation of A(U),
the integration is to be taken over all parameters u and v for which x(u, v) lies in U.

26.1-9 Definition of an element of area

The element of arc determines the intrinsic geometry of the surface completely: two surfaces S and S’
are isometric if and only if one can find parametric representations of them for which the elements
of arc coincide.
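
As a worked example of the first fundamental form, the sketch below computes E, F, G for a sphere of radius R in geographical coordinates and integrates the element of area with a simple midpoint rule. The radius, the grid size and the function names are assumptions for illustration; the result should be close to the known value $4\pi R^2$.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def sphere_partials(u, v, R):
    """Analytic dx/du and dx/dv for x = R (cos v cos u, cos v sin u, sin v)."""
    xu = (-R * math.cos(v) * math.sin(u), R * math.cos(v) * math.cos(u), 0.0)
    xv = (-R * math.sin(v) * math.cos(u), -R * math.sin(v) * math.sin(u), R * math.cos(v))
    return xu, xv

def first_fundamental_form(xu, xv):
    """Coefficients E, F, G of ds^2 = E du^2 + 2F du dv + G dv^2."""
    return dot(xu, xu), dot(xu, xv), dot(xv, xv)

def area(partials, u_range, v_range, n=60):
    """Midpoint-rule approximation of the integral of sqrt(EG - F^2) du dv."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            E, F, G = first_fundamental_form(
                *partials(u0 + (i + 0.5) * du, v0 + (j + 0.5) * dv))
            total += math.sqrt(max(E * G - F * F, 0.0)) * du * dv
    return total

R = 2.0
E, F, G = first_fundamental_form(*sphere_partials(0.3, 0.5, R))
sphere_area = area(lambda u, v: sphere_partials(u, v, R),
                   (0.0, 2 * math.pi), (-math.pi / 2, math.pi / 2))
print(E, F, G, sphere_area, 4 * math.pi * R ** 2)
```

For this parametrisation $E = R^2 \cos^2 v$, F = 0 and $G = R^2$, so the parameter curves (meridians and circles of latitude) are orthogonal.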


Geodesics. If among all curves that lie on the surface and pass through two points $P_1$ and $P_2$
there is one of least length, it is called a shortest curve. The determination of shortest curves of a 
surface is one of the oldest problems of differential geometry and the calculus of variations. Given 
two points in a plane, there is only one shortest curve through them, namely the line segment joining 
them. There may be pairs of points on a surface that cannot be joined by a shortest curve. On the 
other hand, it can happen that through two points there is more than one shortest curve, even 
infinitely many; for example, for two diametrically opposite points of a sphere, any great semicircle 
through the points is a shortest curve. However: 


If U is a sufficiently small neighbourhood of a point $P_1$ of the surface, and if $P_2$ is another point
of U, then there is one shortest curve connecting $P_1$ and $P_2$.


A curve C lying on a surface S is called a geodesic if it is a shortest curve between any two of its 
points that are sufficiently close. On a sphere the great circles are geodesics but obviously need not 
be shortest curves; a great circle is divided into two arcs by two of its points, which in general have 
different lengths, so that only the smaller one is a shortest curve. An arc of a great circle whose 
length is greater than $\pi R$, where R is the radius of the sphere, is therefore not a shortest curve, but
it is a geodesic. The differential equation of geodesics of an arbitrary surface is one of the second 
order that depends only on the first fundamental form. 
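The distinction between geodesics and shortest curves on the sphere can be made concrete. The sketch below (the function name `great_circle_arcs` and the use of unit position vectors are assumptions for illustration) computes the lengths of the two arcs into which two points divide their great circle; the longer arc exceeds $\pi R$ and is therefore a geodesic that is not a shortest curve.

```python
import math

def great_circle_arcs(p, q, radius):
    """Return (shorter, longer) arc lengths of the great circle through p and q.

    p and q are unit vectors from the centre of the sphere to the two points.
    """
    cos_angle = max(-1.0, min(1.0, sum(a * b for a, b in zip(p, q))))
    angle = math.acos(cos_angle)              # central angle, between 0 and pi
    shorter = radius * angle
    longer = radius * (2 * math.pi - angle)
    return shorter, longer

shorter, longer = great_circle_arcs((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), 1.0)
print(shorter, longer, longer > math.pi * 1.0)
```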


Through any point of a regular surface there is exactly one geodesic in any given direction. Two 
points of a complete (intuitively, ‘rimless’) surface can be joined by a shortest curve, and so also 
by a geodesic. 


Parallel displacement. The idea of parallel displacement can also be carried over to an arbitrary 
curved surface. At a point $P_0$ of a geodesic g, if a tangent vector $a_0 = a(P_0)$ to the surface S forms
with the tangent vector $t_0 = t(P_0)$ to the geodesic an angle $\alpha$, then at a point P of the geodesic one
obtains the vector a(P) parallel to $a(P_0)$ along g by constructing from P in the tangent plane the
vector a of length $|a_0|$ which forms the same angle $\alpha$ with the tangent vector t = t(P) of g. It follows
from this definition that tangent vectors of constant length of a geodesic (as for a straight line)
are given a parallel displacement along it; here $\alpha = 0$. If this definition is also applied to curved
polygons, whose components consist of geodesics, and if one thinks of an arbitrary curve on S 
as approximated by such geodesic polygons, then one obtains an intuitive idea of parallel displacement
of a tangent vector along an arbitrary curve of the surface. The most important difference between
parallel displacement on a curved surface and that in affine (or Euclidean) space is that in the first 
case the parallel displacement depends on the curve along which it takes place. If one displaces a vector 
around a closed path on a surface, then, in general, one does not get back to the original position 
(Fig.). In the figure, there is an angle of $\pi/2$ between the original vector $a_0$ and the vector $a_3$ displaced
around a spherical triangle with three right angles. 
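The dependence of parallel displacement on the path can be reproduced numerically for the spherical triangle with three right angles. The sketch below rests on one assumption that should be stated plainly: along a great-circle arc of the unit sphere, the displacement defined above (constant length, constant angle with the tangent of the geodesic) is realised by the rigid rotation about the axis perpendicular to the plane of the great circle. The names `rotate`, `transport` and `around_octant` are hypothetical helpers.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate(vec, axis, angle):
    """Rodrigues' formula: rotate vec about the unit vector axis by angle."""
    c, s = math.cos(angle), math.sin(angle)
    k_cross_v = cross(axis, vec)
    k_dot_v = dot(axis, vec)
    return tuple(v * c + kv * s + k * k_dot_v * (1 - c)
                 for v, kv, k in zip(vec, k_cross_v, axis))

def transport(vec, p, q):
    """Displace the tangent vector vec from p to q along the great-circle arc pq."""
    axis = cross(p, q)
    norm = math.sqrt(dot(axis, axis))
    axis = tuple(c / norm for c in axis)
    angle = math.atan2(norm, dot(p, q))       # central angle between p and q
    return rotate(vec, axis, angle)

def around_octant(a0):
    """Carry a0 around the triangle with vertices on the three coordinate axes."""
    A, B, C = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
    a1 = transport(a0, A, B)
    a2 = transport(a1, B, C)
    return transport(a2, C, A)

def angle_between(a, b):
    cos_angle = dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))
    return math.acos(max(-1.0, min(1.0, cos_angle)))

a0 = (0.0, 1.0, 0.0)                          # tangent vector at the vertex (1, 0, 0)
a3 = around_octant(a0)
print(angle_between(a0, a3), math.pi / 2)
```

The vector returns to its starting point turned through a right angle, in agreement with the figure.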


26.1-10 Parallel displacement along a spherical triangle 


26.1-11 Normal curvature and geodesic curvature


Curvature of a surface. In order to investigate curvature properties in the neighbourhood of a
point $P_0$ of S, one considers the curvature of curves that lie on S and pass through $P_0$ (Fig.). If
$x_0''$ is the curvature vector of a curve C at $P_0$, one projects it onto the normal to the surface and
obtains $x_0'' = \kappa_n n_0 + k_0$, where $k_0 \cdot n_0 = 0$, so that $k_0$ is also a tangent vector; the curvature
vector $x_0''$ of the curve is decomposed into its tangential component $k_0$ and the normal component
$\kappa_n n_0$ perpendicular to it. The length $\kappa_n$ of the projection onto the normal, taken with the appropriate
sign, is called the normal curvature of C at $P_0$. The length $\kappa_g = |k_0|$ of $k_0$ is called the geodesic
curvature. It follows immediately from the resolution of the curvature vector $x_0''$ into the normal
component $\kappa_n n_0$ and the tangential component $k_0$ that the (total) curvature $\kappa(s) = |x''(s)|$, the
normal curvature $\kappa_n = x''(s) \cdot n_0$, and the geodesic curvature $\kappa_g = |k_0|$ are linked by the relation
$\kappa^2 = \kappa_n^2 + \kappa_g^2$. The geodesic curvature is invariant under bending, and is therefore a concept of the
intrinsic geometry, while the normal curvature depends on the embedding of the surface in space;
for example, in a plane the normal curvature of every curve is obviously equal to zero. If one now
bends a strip of the plane into part of a circular cylinder of radius r, then every generating circle
of the cylinder has the normal curvature 1/r.
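The relation $\kappa^2 = \kappa_n^2 + \kappa_g^2$ can be verified on a simple example. The sketch below takes a circle of latitude on the unit sphere; the choice of example, the outward-pointing normal and the function name `curvature_split` are assumptions for illustration. With the outward normal the normal curvature comes out as -1; the sign depends only on the orientation chosen for the normal.

```python
import math

def curvature_split(colatitude):
    """Return (kappa, kappa_n, kappa_g) for a circle of latitude on the unit sphere."""
    rho = math.sin(colatitude)                   # radius of the circle of latitude
    n0 = (rho, 0.0, math.cos(colatitude))        # point of the circle = outward unit normal
    curv_vec = (-1.0 / rho, 0.0, 0.0)            # x'' points to the centre of the circle
    kappa_n = sum(c * n for c, n in zip(curv_vec, n0))
    k0 = tuple(c - kappa_n * n for c, n in zip(curv_vec, n0))   # tangential component
    kappa_g = math.sqrt(sum(t * t for t in k0))
    kappa = math.sqrt(sum(c * c for c in curv_vec))
    return kappa, kappa_n, kappa_g

kappa, kappa_n, kappa_g = curvature_split(math.pi / 3)
print(kappa ** 2, kappa_n ** 2 + kappa_g ** 2)
```

For the equator (colatitude $\pi/2$) the geodesic curvature returned is zero up to rounding, as expected for a great circle.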

The geodesic curvature of a curve on a surface can be defined by means of parallel displacement, 
just like the curvature of a space curve. Geodesics can be characterized as curves on a surface whose 
geodesic curvature vanishes. On a circular cylinder, the circular helices, the generating lines, and 
the circles perpendicular to the generators are the geodesics; when the cylinder is developed into a 
plane, by the isometric mapping obtained by cutting along a generator, the geodesics go into seg- 
ments or straight lines. 

The normal curvature is given by the second fundamental form of surface theory; if u = u(s), 
v = v(s) are the equations of the curve C, where s is its arc length, then the second fundamental 
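form of the surface is:

$$\kappa_n = L \left(\frac{du}{ds}\right)^2 + 2M \frac{du}{ds} \frac{dv}{ds} + N \left(\frac{dv}{ds}\right)^2,$$

where $L = n \cdot \frac{\partial^2 x}{\partial u^2}$, $M = n \cdot \frac{\partial^2 x}{\partial u \, \partial v}$, $N = n \cdot \frac{\partial^2 x}{\partial v^2}$.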


It follows that the normal curvature depends only on the direction of the curve at the point $P_0$.
All curves on S having the same tangent at $P_0$ also have the same normal curvature there.
A more precise investigation of the normal curvature leads to a classification of points of the
surface. Firstly, if all the quantities L, M, N vanish at $P_0$, as is the case for every point of a plane,
then $P_0$ is called a flat point. If this is not the case, one distinguishes three types of point.


If the Gaussian curvature $K = (LN - M^2)/(EG - F^2)$ of the surface at a point P with the coordinates (u, v)
has the value K(P) > 0, then P is called elliptic; if K(P) < 0, then P is called hyperbolic; and if
K(P) = 0, P is called parabolic (Fig.).

This purely formal division has a close connection with the shape of the surface. On a bicycle
tube (torus), for example, the points towards the inside are hyperbolic, and the points toward the 
outside are elliptic; these two point sets are separated from one another by two circles, which 
consist of parabolic points. An ellipsoid has only elliptic points, a hyperbolic paraboloid (saddle 
surface) has only hyperbolic points, and a circular cylinder has only parabolic points (Fig.). 
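The classification of the points of a torus can be reproduced from the two fundamental forms. The following is a sketch under stated assumptions: the torus parametrisation (tube radius r, centre-circle radius R, u the angle around the tube), the finite-difference step, the tolerance and the helper names are all illustrative choices, and the function `classify` does not separate flat points from parabolic ones.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def gaussian_curvature(x, u, v, h=1e-3):
    """K = (LN - M^2)/(EG - F^2), with all derivatives taken by central differences."""
    def at(i, j):
        return x(u + i * h, v + j * h)
    centre = at(0, 0)
    xu = [(p - m) / (2 * h) for p, m in zip(at(1, 0), at(-1, 0))]
    xv = [(p - m) / (2 * h) for p, m in zip(at(0, 1), at(0, -1))]
    xuu = [(p - 2 * c + m) / h ** 2 for p, c, m in zip(at(1, 0), centre, at(-1, 0))]
    xvv = [(p - 2 * c + m) / h ** 2 for p, c, m in zip(at(0, 1), centre, at(0, -1))]
    xuv = [(pp - pm - mp + mm) / (4 * h ** 2)
           for pp, pm, mp, mm in zip(at(1, 1), at(1, -1), at(-1, 1), at(-1, -1))]
    normal = cross(xu, xv)
    length = math.sqrt(dot(normal, normal))
    n = [c / length for c in normal]
    E, F, G = dot(xu, xu), dot(xu, xv), dot(xv, xv)
    L, M, N = dot(n, xuu), dot(n, xuv), dot(n, xvv)
    return (L * N - M * M) / (E * G - F * F)

def classify(K, tol=1e-4):
    """Sign of K with a tolerance that absorbs the discretisation error."""
    if K > tol:
        return "elliptic"
    if K < -tol:
        return "hyperbolic"
    return "parabolic"

def torus(u, v, R=2.0, r=1.0):
    w = R + r * math.cos(u)
    return (w * math.cos(v), w * math.sin(v), r * math.sin(u))

for u in (0.0, math.pi / 2, math.pi):            # outside, top circle, inside
    print(u, classify(gaussian_curvature(torus, u, 0.3)))
```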


26.1-12 Classification of the points of a surface of revolution (bell): K > 0 elliptic, K = 0 parabolic,
K < 0 hyperbolic

26.1-13 Elliptic ($P_e$), parabolic ($P_p$) and hyperbolic ($P_h$) point


Theorema egregium. The first and second fundamental forms assigned to a surface are invariant 
under motions, that is, if the surface is moved in space as a rigid body (without altering its shape), 
then the fundamental forms do not change. If the surface is bent, that is, deformed isometrically, 
then the first fundamental form remains unchanged, while the second fundamental form, which 
determines the normal curvature, changes. The first fundamental form is a bending invariant. 

Gauss showed that the Gaussian curvature is invariant not only under motions and parameter
transformations, but also under bending. He called this striking and unexpected result the Theorema
egregium (remarkable theorem). 


Theorema egregium. The Gaussian curvature K remains invariant under isometric mappings. 


For a proof of this theorem one derives a formula for K in which only the coefficients of the first 
fundamental form and their derivatives appear. Since these are bending invariants, K must also 
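be a bending invariant. By a suitable choice of parameters u and v it can be arranged that the parameter
curves of the surface cut at right angles, that is, $\frac{\partial x}{\partial u} \cdot \frac{\partial x}{\partial v} = F = 0$. If this is assumed, then the
Theorema egregium is expressed by the formula

$$K = -\frac{1}{\sqrt{EG}} \left[ \frac{\partial}{\partial u} \left( \frac{1}{\sqrt{E}} \frac{\partial \sqrt{G}}{\partial u} \right) + \frac{\partial}{\partial v} \left( \frac{1}{\sqrt{G}} \frac{\partial \sqrt{E}}{\partial v} \right) \right].$$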


It follows, for example, that a sphere of radius r, for which the Gaussian curvature at every point 
is equal to 1/r?, cannot be mapped isometrically onto a plane, for which K = 0. 

It is therefore impossible to draw a faithful map of part of the earth’s surface; only by restriction 
to a sufficiently small domain can one obtain an approximately accurate representation. 
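That K is determined by the first fundamental form alone can be illustrated by evaluating the formula of the Theorema egregium (for F = 0) numerically. In the sketch below the function name `curvature_from_first_form`, the finite-difference step and the two test metrics are assumptions: the sphere of radius r with $E = r^2$, $G = r^2 \sin^2 u$, and the plane in polar coordinates with E = 1, $G = u^2$.

```python
import math

def curvature_from_first_form(sqrt_E, sqrt_G, u, v, h=1e-3):
    """Gaussian curvature for F = 0, computed from sqrt(E) and sqrt(G) only."""
    def inner_u(uu):
        d_sqrt_G = (sqrt_G(uu + h, v) - sqrt_G(uu - h, v)) / (2 * h)
        return d_sqrt_G / sqrt_E(uu, v)
    def inner_v(vv):
        d_sqrt_E = (sqrt_E(u, vv + h) - sqrt_E(u, vv - h)) / (2 * h)
        return d_sqrt_E / sqrt_G(u, vv)
    term_u = (inner_u(u + h) - inner_u(u - h)) / (2 * h)
    term_v = (inner_v(v + h) - inner_v(v - h)) / (2 * h)
    return -(term_u + term_v) / (sqrt_E(u, v) * sqrt_G(u, v))

r = 3.0
K_sphere = curvature_from_first_form(lambda u, v: r,
                                     lambda u, v: r * math.sin(u), 1.0, 0.5)
K_plane = curvature_from_first_form(lambda u, v: 1.0,
                                    lambda u, v: u, 1.0, 0.5)
print(K_sphere, 1 / r ** 2, K_plane)
```

The two results differ (approximately $1/r^2$ against 0), which is the numerical counterpart of the statement that the sphere cannot be mapped isometrically onto the plane.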


Determination of a surface from the fundamental forms. The Theorema egregium is directly
connected with the following question: let

$$\varphi_1 = E\xi^2 + 2F\xi\eta + G\eta^2, \qquad \varphi_2 = L\xi^2 + 2M\xi\eta + N\eta^2,$$

be two quadratic forms whose coefficients are functions of the two variables u and v; furthermore,
suppose that $\varphi_1$ is positive definite. Is there always a surface S for which $\varphi_1$ is the first and $\varphi_2$ the
second fundamental form? This problem is analogous to the determination of a curve with given
curvature and torsion. But in contrast to this simpler problem, where curvature and torsion
can be given independently of one another, in the case of a surface the two fundamental forms
cannot be chosen independently of one another; their coefficients are connected by three conditions,
the so-called integrability conditions, which hold on every surface. One of these conditions is given
in the previous section, the equation that expresses the Theorema egregium (under the assumption
that F = 0); the other two integrability conditions are called the Codazzi-Mainardi formulae. If
these three conditions are satisfied for $\varphi_1$ and $\varphi_2$, then, at least for a sufficiently small region U
of the variables u and v, there is always a portion of a surface that has $\varphi_1$ and $\varphi_2$ as fundamental
forms; two distinct such portions of surfaces defined on U are congruent.



The Gauss-Bonnet theorem. By integrating the Gaussian curvature K, multiplied by the element 
of area dA, over a domain U of the surface S, one obtains the so-called integral curvature K(U) 
of the region 


$$K(U) = \iint_U K \, dA = \iint_U K (EG - F^2)^{1/2} \, du \, dv,$$

which is naturally also a bending invariant.

An intuitive interpretation of the integral curvature, and therefore of the Gaussian curvature, 
is obtained by investigating the spherical image of a domain U of the surface S. This spherical image 
is obtained by drawing the normal unit vector n of a point P of U from a fixed point, say the origin O.
The ends of these vectors then describe a domain V of the unit sphere, which is the spherical image
of U. The area of the spherical image is then (apart from the sign) equal to the integral curvature of U.
It is intuitively obvious that this area is the larger the more sharply S curves.

If U is bounded by a simple closed curve C, then the integral curvature K(U) can be expressed as
an integral around C. Here the Gauss-Bonnet theorem holds, in the form

$$K(U) = 2\pi - \oint_C \kappa_g \, ds,$$

valid for a simply connected domain U with smooth boundary C, where $\kappa_g$ is the geodesic curvature
of C and s is arc length along C.

A particularly interesting result is obtained by applying this theorem to closed surfaces. One can
think of a closed surface intuitively as the boundary of a finite smooth body pierced by g holes;
the number g is called the genus of the surface; examples of closed surfaces are the sphere (g = 0),
the torus (g = 1), and the 'pretzel' (g = 2) (Fig.).

The integral curvature of a closed surface S of genus g does not depend on the shape of the surface
and is equal to

$$K(S) = \iint_S K \, dA = 4\pi(1 - g).$$


This result is of great importance, since it makes it possible to express topological properties of 
the surface, in this case the genus g, which remain invariant even under arbitrary continuous defor- 
mations, in terms of quantities of differential geometry (here the integral curvature). Its generaliza- 
tions and similar questions have led in recent decades to the development of one of the most inter- 
esting and difficult branches of modern geometry in which connections between properties of 
geometrical forms in differential geometry and in topology are also investigated in higher dimensions. 
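The dependence of the integral curvature on the genus alone can be checked for the sphere (g = 0) and the torus (g = 1). The sketch below is only an illustration: it uses the standard closed forms for K and for the element of area of these two surfaces, which are not derived in the text, and the function name `integral_curvature` and the midpoint rule are assumptions.

```python
import math

def integral_curvature(K, area_density, u_range, v_range, n=200):
    """Midpoint-rule approximation of the double integral of K dA."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            total += K(u, v) * area_density(u, v) * du * dv
    return total

r, R = 1.5, 4.0   # illustrative radii

# Sphere of radius r: K = 1/r^2, dA = r^2 sin(u) du dv.
sphere_total = integral_curvature(lambda u, v: 1.0 / r ** 2,
                                  lambda u, v: r ** 2 * math.sin(u),
                                  (0.0, math.pi), (0.0, 2 * math.pi))

# Torus: K = cos(u) / (r (R + r cos u)), dA = r (R + r cos u) du dv.
torus_total = integral_curvature(lambda u, v: math.cos(u) / (r * (R + r * math.cos(u))),
                                 lambda u, v: r * (R + r * math.cos(u)),
                                 (0.0, 2 * math.pi), (0.0, 2 * math.pi))

print(sphere_total, 4 * math.pi * (1 - 0))
print(torus_total, 4 * math.pi * (1 - 1))
```

Changing r or R alters the shape of the surfaces but should leave both totals unchanged up to the discretisation error.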


26.1-14 Surfaces of different genus g

26.1-15 Definition of a surface of revolution


Surfaces of revolution. By a surface of revolution one understands a surface that arises by rotation 
of a plane curve about an axis lying in the plane of the curve. Such surfaces with rotational symmetry 
occur frequently in practice. To obtain a parametric representation of a surface of revolution one 
takes the axis of rotation as the $e_3$-axis of a rectangular coordinate system in space (Fig.). In the
$e_1, e_2$-plane perpendicular to it, a unit vector e(v) is defined by

$$e(v) = e_1 \cos v + e_2 \sin v, \quad \text{whose derivative is} \quad e^*(v) = \frac{de}{dv} = -e_1 \sin v + e_2 \cos v = e(v + \pi/2).$$



It follows immediately that for all values of v the vectors e(v), $e^*(v)$, $e_3$ form a right-handed
orthonormal trihedron. A plane H(v) passing through the origin and spanned by e(v) and $e_3$ rotates about
the $e_3$-axis as v varies. Any fixed curve $x(u) = \xi(u) e + \eta(u) e_3$ in H(v), for which u is the arc length,
so that $\frac{dx}{du} \cdot \frac{dx}{du} = 1$, generates a surface of revolution as the plane H(v) rotates about the $e_3$-axis.

Hence one obtains a parametric representation of the surface of revolution generated by x(u) in
the form

$$x(u, v) = \xi(u) e(v) + \eta(u) e_3.$$

The parameters of the surface are u and v. The parameter curves are the meridians, which are the
generating curves $x(u, v_0)$, $v_0$ = const, and the circles of latitude $x(u_0, v)$, $u_0$ = const, which are
the circles of intersection of the surface with planes perpendicular to the axis of rotation. To find
the singular points of the surface (the generating curve is assumed to be regular) one calculates

$$\frac{\partial x}{\partial u} \times \frac{\partial x}{\partial v} = (\xi' e + \eta' e_3) \times \xi e^* = \xi (-\eta' e + \xi' e_3),$$

where the dash denotes differentiation with respect to the arc length u of the generating curve.
The vector $n = -\eta' e + \xi' e_3$ is the normal unit vector of this curve and therefore a normal vector
of the surface. A point of the surface is singular if and only if $\xi = 0$, that is, if the point lies on the
axis of rotation. For the coefficients of the first fundamental form one obtains immediately
$E = \frac{\partial x}{\partial u} \cdot \frac{\partial x}{\partial u} = |x'|^2 = 1$, because u is the arc length on the meridian,
$F = \frac{\partial x}{\partial u} \cdot \frac{\partial x}{\partial v} = 0$ (the meridians and circles of latitude cut orthogonally), and finally
$G = \frac{\partial x}{\partial v} \cdot \frac{\partial x}{\partial v} = \xi^2(u)$; the first fundamental form is therefore

$$ds^2 = du^2 + \xi^2(u) \, dv^2.$$
To calculate the second fundamental form one needs the second derivatives of the parametric
representation

$$\frac{\partial^2 x}{\partial u^2} = x'' = \kappa_r n, \quad \frac{\partial^2 x}{\partial u \, \partial v} = \xi' e^*, \quad \frac{\partial^2 x}{\partial v^2} = -\xi e.$$

The quantity $\kappa_r$ is called the relative curvature of the meridian. Obviously,
$|\kappa_r| = (x'' \cdot x'')^{1/2}$, because $n \cdot n = 1$, that is, the absolute value of $\kappa_r$ is equal to the
curvature of the generating curve; $\kappa_r$ is positive when the curve curves in the direction of the
normal vector n, and negative when it curves in the opposite direction; in the figure, $\kappa_r < 0$. By
scalar multiplication with the normal vector it follows that

$$L = \kappa_r(u), \quad M = 0, \quad N = \xi \eta',$$

so that the second fundamental form is

$$\kappa_n = \kappa_r(u) \left(\frac{du}{ds}\right)^2 + \xi(u) \eta'(u) \left(\frac{dv}{ds}\right)^2.$$


From this one can easily calculate the normal curvature of the meridians and circles of latitude.
For a meridian $v = v_0$, dv = 0 and du = ds from the first fundamental form. It follows that
$\kappa_n(\mathrm{Mer}) = \kappa_r$; the normal curvature of a meridian is equal to its relative curvature. For a circle of
latitude $u = u_0$, du = 0, that is, $ds^2 = \xi^2 \, dv^2$ from the first fundamental form. It follows that
$\kappa_n(\mathrm{Lat}) = \eta'/\xi$. These formulae can be interpreted geometrically. Since $x' = \frac{dx}{du}$ is a unit vector,
$x' = \xi' e + \eta' e_3 = e \cos\varphi + e_3 \sin\varphi$, where $\varphi$ is the angle that x' makes with e. From the figure
one sees immediately that $\xi/a = \sin\varphi = \eta'$, so that $\eta'/\xi = 1/a$ is the reciprocal of the length a of
the segment PD of the normal from the point P of the surface to the axis of rotation. It can be shown
that the calculated values are just the maximum and minimum of the normal curvature for arbitrary
curves of the surface through P. The extreme values of the normal curvature are called the principal
curvatures of the surface at P. Curves which at every point have one of the principal curvatures as
their normal curvature are called lines of curvature. In general, through each regular point of a
surface there pass two lines of curvature, which cut at right angles; in the case of a surface of revolution,
these are just the meridians and the circles of latitude. For the Gaussian curvature one obtains
immediately $K = (LN - M^2)/(EG - F^2) = \kappa_r \xi \eta'/\xi^2 = \kappa_r/a$.

26.1-16 Curvature of a surface of revolution


The Gaussian curvature is the product of the principal curvatures. 


It follows from this that for a surface of revolution K < 0 when the curve arches towards the axis
($\kappa_r < 0$), and K > 0 when the curve arches away from the axis ($\kappa_r > 0$).
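The formulae for a surface of revolution can be tried out on the sphere of radius r, generated by the meridian $\xi = r \cos(u/r)$, $\eta = r \sin(u/r)$ with arc length u. This is a sketch only; the function names and the choice of meridian are assumptions. The relative curvature is evaluated as $\kappa_r = x'' \cdot n$ with $n = -\eta' e + \xi' e_3$.

```python
import math

def revolution_curvatures(xi, d_xi, d_eta, dd_xi, dd_eta):
    """Return (kappa_r, kappa_lat, K) for a meridian given by arc length."""
    kappa_r = -dd_xi * d_eta + dd_eta * d_xi    # x'' . n with n = -eta' e + xi' e_3
    kappa_lat = d_eta / xi                      # normal curvature of the circle of latitude
    return kappa_r, kappa_lat, kappa_r * kappa_lat

def sphere_meridian(u, r):
    """xi, xi', eta', xi'', eta'' for xi = r cos(u/r), eta = r sin(u/r)."""
    c, s = math.cos(u / r), math.sin(u / r)
    return r * c, -s, c, -c / r, -s / r

kappa_r, kappa_lat, K = revolution_curvatures(*sphere_meridian(0.4, 2.0))
print(kappa_r, kappa_lat, K)
```

Both principal curvatures come out equal to 1/r and K equal to $1/r^2$, in agreement with the value of the Gaussian curvature of a sphere quoted earlier.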



Mean curvature and minimal surfaces. If by a suitable choice of an orthonormal basis in the tangent
plane at P the second fundamental form of the surface is brought to the normal form
$\kappa_n = \lambda_1 (du/ds)^2 + \lambda_2 (dv/ds)^2$, then one obtains the principal curvatures $\lambda_1(u, v)$ and
$\lambda_2(u, v)$ as important invariants of the surface. A point at which the two principal curvatures coincide
and are non-zero is called an umbilic of the surface. On a sphere every point is an umbilic; conversely,
a surface on which every point is an umbilic is part of a sphere. If both principal curvatures are equal
to zero, one speaks of a flat point; a surface on which every point is a flat point is part of a plane. At an
umbilic or a flat point the normal curvature does not depend on the direction of the curve. The
elementary symmetric functions of the principal curvatures are the Gaussian curvature $K = \lambda_1 \lambda_2$
and the mean curvature $H = (\lambda_1 + \lambda_2)/2$ of the surface. The problem of drawing a surface through
a simple closed curve in space, which can be regarded as a continuously deformed circle, such that
the surface has the least possible area, has as a necessary condition the equation H = 0, which was
discovered as early as 1760 by LAGRANGE (1736-1813) and is a special case of the Euler-Ostrogradskii
differential equation (see Chapter 38.). The non-planar solutions of this equation are
called minimal surfaces. Since it follows from H = 0 and $K \ne 0$ that K < 0, minimal surfaces have
negative Gaussian curvature. Further global results can only be mentioned.
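A classical minimal surface of revolution is the catenoid, which is not named in the text and is introduced here only as an illustration. In the sketch below its meridian is taken in the arc-length form $\xi(u) = (1 + u^2)^{1/2}$, $\eta(u) = \operatorname{arsinh} u$ (an assumption of this example, for which $\xi'^2 + \eta'^2 = 1$), and the principal curvatures are evaluated with the formulae for surfaces of revolution.

```python
import math

def catenoid_principal_curvatures(u):
    """Principal curvatures of the catenoid at arc length u along the meridian."""
    w = 1.0 + u * u
    xi = math.sqrt(w)
    d_xi, d_eta = u / math.sqrt(w), 1.0 / math.sqrt(w)    # xi', eta'
    dd_xi, dd_eta = w ** -1.5, -u * w ** -1.5             # xi'', eta''
    lam1 = -dd_xi * d_eta + dd_eta * d_xi                 # curvature of the meridian
    lam2 = d_eta / xi                                     # curvature of the circle of latitude
    return lam1, lam2

lam1, lam2 = catenoid_principal_curvatures(0.7)
H = (lam1 + lam2) / 2
K = lam1 * lam2
print(H, K)
```

The mean curvature vanishes up to rounding while K is negative, as stated above for minimal surfaces.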


The only closed surfaces with constant Gaussian curvature are spheres. 

If K > 0 always holds for a closed regular surface, then it is an ovoid, that is, it bounds a finite 
convex body. 

The only regular closed surfaces of genus zero with constant mean curvature are spheres. 


Klein’s Erlangen programme 


According to Felix KLEIN (1849-1925), the various geometries are to be regarded as invariant 
theories of the corresponding groups of transformations. Thus, the Euclidean differential geometry 
considered so far — as a branch of Euclidean geometry — is the theory of invariants of curved surfaces 
and curves under the group of Euclidean motions (or transformations), which can be imagined as 
motions of rigid bodies. In a similar way, affine geometry is the theory of the invariants under 
affine transformations (parallel projections), and projective geometry studies properties that remain 
invariant under general projections (central projections). For example, the correspondence between 
the points of a curve and the osculating planes is not only a Euclidean, but a projective invariant,
while arc-length, curvature and torsion are only Euclidean invariants, and not even affine. Indeed, 
a circle, whose curvature is constant, can be transformed by an affine transformation into an arbitrary 
ellipse, whose curvature is no longer constant. 
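That curvature is not an affine invariant is easy to see numerically. The sketch below applies the affine map (x, y) -> (a x, b y) to the unit circle and evaluates the curvature of the image curve (a cos t, b sin t) with the usual formula for a plane parametric curve; the function name `curvature` and the chosen semi-axes are assumptions for illustration.

```python
import math

def curvature(a, b, t):
    """Curvature of the curve (a cos t, b sin t), the affine image of the unit circle."""
    dx, dy = -a * math.sin(t), b * math.cos(t)
    ddx, ddy = -a * math.cos(t), -b * math.sin(t)
    return abs(dx * ddy - dy * ddx) / (dx * dx + dy * dy) ** 1.5

for t in (0.0, math.pi / 4, math.pi / 2):
    print(t, curvature(1.0, 1.0, t), curvature(2.0, 1.0, t))
```

For a = b the curvature is the same at every point; for a = 2, b = 1 it varies along the ellipse.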

To every geometrical space having a Lie group of transformations there belongs, as a branch of 
the geometry of this space, the corresponding differential geometry. Today, apart from Euclidean 
differential geometry, also affine, projective, elliptic, hyperbolic differential geometry, etc. have 
been developed. The properties that are invariant under a group G of transformations are a fortiori 
invariant under a subgroup of G; for example, it turns out that the classification of the points of a 
surface into elliptic, hyperbolic and parabolic is not only a Euclidean, but also an affine and even a 
projective invariant. 

In addition, in differential geometry the interesting properties must be invariant under differentiable 
parameter transformations. Quite generally, one can ask what properties of geometrical forms remain 
invariant under sufficiently differentiable mappings of the space onto itself. The property of being 
a straight line or a plane is obviously invariant under projective mappings, but a plane can be
transformed into a fairly arbitrarily curved surface by a suitable differentiable mapping. Orders 
of contact of curves or surfaces are invariant under sufficiently differentiable mappings. Also, the 
geometry of webs is invariant under these mappings. Questions about properties invariant under 
differentiable mappings can be very fruitful, even though the set of these mappings no longer forms 
a group, in general. 

These considerations lead to a classification of the properties of differential geometry according 
to the principles of Klein’s Erlangen programme. One can consider, for example, all twice continuously 
differentiable one-to-one mappings of surfaces of $E_3$ onto surfaces of $E_3$. The intrinsic geometry
was defined above as the theory of properties of surfaces that remain invariant under isometric 
mappings. Here the set of isometric mappings is a proper subset of the set of differentiable mappings 
of surfaces onto one another. 

A mapping is called conformal if angles between curves remain invariant; for example, stereographic 
projection of a sphere onto a plane is a conformal mapping. Every isometric mapping is conformal, 
but not conversely. The properties that remain invariant under conformal mappings form the subject 
matter of the conformal geometry of the surface. Similarly one can consider area-preserving and 
other classes of mappings and their corresponding geometries. 
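The conformality of stereographic projection can be tested numerically. The sketch below projects the unit sphere from the north pole onto the equatorial plane (one common convention, assumed here), takes two curves through a point of the sphere, and compares the angle between their tangents with the angle between the tangents of the image curves. The helper names and the two test curves are hypothetical.

```python
import math

def sphere_point(theta, phi):
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def stereographic(p):
    """Projection from the north pole (0, 0, 1) onto the plane z = 0."""
    x, y, z = p
    return (x / (1.0 - z), y / (1.0 - z))

def tangent(curve, t=0.0, h=1e-6):
    """Central-difference tangent vector of curve at parameter t."""
    return tuple((p - q) / (2 * h) for p, q in zip(curve(t + h), curve(t - h)))

def angle(a, b):
    d = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return math.acos(max(-1.0, min(1.0, d / (na * nb))))

def angles(theta0, phi0):
    """Angle between two curves on the sphere and between their images."""
    c1 = lambda t: sphere_point(theta0 + t, phi0)
    c2 = lambda t: sphere_point(theta0 + t, phi0 + 2 * t)
    on_sphere = angle(tangent(c1), tangent(c2))
    in_plane = angle(tangent(lambda t: stereographic(c1(t))),
                     tangent(lambda t: stereographic(c2(t))))
    return on_sphere, in_plane

print(angles(1.0, 0.5))
```

The two angles agree up to the error of the finite differences, although lengths are clearly not preserved by the projection.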



Riemannian geometry 


Manifolds. All the geometrical figures studied in differential geometry are regarded as point sets,
referred to parameters or coordinates. The dimension of a figure is defined as the number of
coordinates that are necessary to fix a point of it. Thus, a curve is one-dimensional; for its points can
be characterized by the values of one parameter t. Correspondingly, a surface is two-dimensional,
and the space surrounding us is three-dimensional. Spaces of higher dimension occur in physical
and technical applications. For example, if one wishes to describe the path of an aeroplane, that is,
not only its route, but also its progress in time, one must know at any time t its longitude u, its
latitude v and its height h. One thus obtains a curve in the four-dimensional space of the variables
t, u, v, h. If one wishes to follow the aeroplane more precisely, one must add the components of
instantaneous velocity $\dot{u} = \frac{du}{dt}$, $\dot{v} = \frac{dv}{dt}$, $\dot{h} = \frac{dh}{dt}$, and thus obtains a curve in the
seven-dimensional space of the variables $t, u, v, h, \dot{u}, \dot{v}, \dot{h}$. If one now considers N aeroplanes
simultaneously, one must specify, apart from the time t, for each aeroplane the 6 position and velocity
coordinates $u_i, v_i, h_i, \dot{u}_i, \dot{v}_i, \dot{h}_i$ (i = 1, ..., N), in order to describe the 'system', here the aggregate
of N aeroplanes; one obtains therefore a (6N + 1)-dimensional space. In statistical mechanics an ideal
gas is represented as a system of N molecules, which move in space independently of one another
(like our aeroplanes); to describe the gas, again 6N + 1 coordinates are necessary. Here N is very large,
of the order of magnitude of the Loschmidt number: $N = 6.02 \cdot 10^{23}$. All these examples have one thing
in common: each of the spaces is a point set whose points are in one-to-one correspondence with n-tuples
$(x_1, \ldots, x_n)$ of real numbers, the coordinates of the point. Such a point set is called an n-dimensional
differentiable manifold. These are manifolds in which the coordinates can, like the parameters of
curves and surfaces, again be subjected to sufficiently differentiable coordinate transformations.
The only properties that have a geometrical meaning are those that do not depend on the choice
of coordinate system. Since the real numbers are a mathematical model of our intuitive geometrical
idea of the continuum (number line), it is not surprising that important geometrical considerations
can be carried out in a differentiable manifold; for example, the idea of tangent vector and tangent
space can be introduced, and a theory of contact of submanifolds (curves, surfaces, m-dimensional
submanifolds of the given n-dimensional manifold) can be developed. On these foundations one can
then build a geometrical theory of partial differential equations. Also, a geometrical theory of the
calculus of variations, the so-called Finsler geometry, can be created.


Riemannian geometry. Although a geometry can be developed for differentiable manifolds, it is,
in comparison with Euclidean geometry, very meagre, because concepts such as length, angle, area,
parallel displacement and curvature are completely missing. In 1854, Bernhard RIEMANN (1826-1866),
in his inaugural lecture 'Über die Hypothesen, welche der Geometrie zugrunde liegen' (On the
hypotheses which underlie geometry), developed the basic ideas of a geometry, which much later
found an important physical application as the mathematical foundation of Einstein's general
theory of relativity. An n-dimensional manifold is called a Riemannian space if a quadratic differential
form is given in it as element of arc:

$$ds^2 = \sum_{i,k=1}^{n} g_{ik}(x_1, \ldots, x_n) \, dx_i \, dx_k.$$


The simplest non-trivial special case of a Riemannian geometry is the intrinsic geometry of a surface, 
which is determined by its element of arc (first fundamental form) alone and does not depend on 
the actual embedding in Euclidean space. The first fundamental form goes over into the form given 
above for the element of arc of a 2-dimensional Riemannian space if one introduces the new notation 
$u = x_1$, $v = x_2$, $E = g_{11}$, $F = g_{12} = g_{21}$, $G = g_{22}$. Here $x_1$ and $x_2$ must not be confused with the
space coordinates $x_1, x_2, x_3$ in $E_3$. It does not matter which of the mutually isometric surfaces in
$E_3$ one considers.

Riemannian geometry is precisely a generalization of the intrinsic geometry of a surface into n 
dimensions. All the concepts mentioned above, which are missing from the theory of differentiable 
manifolds, can be defined by means of the element of arc as a meaningful analogy to intrinsic 
geometry. For example, if $x_i = x_i(t)$, $0 \le t \le 1$, is a representation of a curve in Riemannian
space, then along it $dx_i = \dot{x}_i \, dt$, and one obtains again as invariant parameter s = s(t) the arc
length

$$s(t) = \int_0^t \Big( \sum_{i,k=1}^{n} g_{ik}(x_1(\tau), \ldots, x_n(\tau)) \, \dot{x}_i(\tau) \, \dot{x}_k(\tau) \Big)^{1/2} d\tau.$$

While in intrinsic geometry the form $\sum_{i,k=1}^{n} g_{ik} \dot{x}_i \dot{x}_k$ is always positive definite, in Riemannian
geometry indefinite forms are also admitted, so that the arc length can occasionally be zero or imaginary.
It is just such Riemannian spaces that are applied in the theory of relativity.
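The arc length in a Riemannian space can be evaluated directly from the coefficients $g_{ik}$. The following sketch assumes a positive definite form (for an indefinite form the square root could fail) and uses as an example the two-dimensional space with $g_{11} = 1$, $g_{12} = g_{21} = 0$, $g_{22} = \sin^2 x_1$, that is, the intrinsic geometry of the unit sphere. The function names and the midpoint rule are assumptions.

```python
import math

def arc_length(metric, curve, velocity, t0=0.0, t1=1.0, n=1000):
    """Midpoint-rule value of the integral of sqrt(sum g_ik x_i' x_k') dt."""
    dt = (t1 - t0) / n
    total = 0.0
    for j in range(n):
        t = t0 + (j + 0.5) * dt
        x, xd = curve(t), velocity(t)
        g = metric(x)
        q = sum(g[i][k] * xd[i] * xd[k]
                for i in range(len(x)) for k in range(len(x)))
        total += math.sqrt(q) * dt
    return total

def sphere_metric(x):
    """g_ik of the unit sphere in the coordinates x_1 = polar angle, x_2 = longitude."""
    return [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]

u0 = math.pi / 3
latitude = lambda t: (u0, 2 * math.pi * t)       # circle of latitude, 0 <= t <= 1
latitude_dot = lambda t: (0.0, 2 * math.pi)      # its derivative with respect to t

length = arc_length(sphere_metric, latitude, latitude_dot)
print(length, 2 * math.pi * math.sin(u0))
```

The computation never refers to the embedding of the sphere in space; only the element of arc is used.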



Also, Euclidean space is a special case of a Riemannian space; its element of arc in orthonormal
Cartesian coordinates has $g_{ii} = 1$ and $g_{ik} = 0$ for $i \ne k$. One can say that a portion of a Riemannian
space arises by distorting a portion of a Euclidean space of the same dimension, in the same way as, 
for example, a piece of car bodywork is formed from a flat piece of metal by cutting out and pressing. 
In a similar way, manifolds with affine connection arise by ‘distorting’ affine spaces: in these manifolds, 
length, angle and area are no longer defined, but only a parallel displacement, depending on the path, 
determines the geometry of the manifold. The curvature of a Riemannian space (and also of a manifold 
with affine connection) indicates the deviation of the geometry of the space from that of the Euclidean 
(or affine) space of the same dimension: it is measured by means of the Riemann-Christoffel curvature 
tensor. 


26.2. Convex bodies 


A body B in Euclidean space is called convex, or an ovoid, if the line-segment joining any two 
points of B lies in B. Convex bodies have been investigated in geometry for a long time. A proper 
theory of convex bodies arose towards the end of the 19th century in the works of BRUNN 
and MINKOWSKI (1864-1909); it has been generalized to n-dimensional Euclidean space and 
to non-Euclidean spaces. Examples of convex bodies are the sphere, ellipsoid, cylinder, cone,
cube, tetrahedron and rectangular parallelepiped. The last three bodies are convex polyhedra, that is,
bodies whose boundary consists of finitely many convex polygons. In the theory of convex polyhedra
the following questions, for example, have been considered: By how many conditions (on the edges,
vertices, faces, area, and so on) is a convex polyhedron uniquely determined (to within motions)?
When does there exist a convex polyhedron with certain prescribed conditions? 

In the theory of convex bodies extremal problems are treated frequently. The oldest of these is 
the isoperimetric problem (see Chapter 38.). 

By a convex surface one understands the boundary of a convex body. A convex surface can have 
faces and vertices, as in the example of a convex polyhedron. Nevertheless, the most important 
results of the differential geometry of regular surfaces of Euclidean space, especially their intrinsic 
geometry, can be generalized to arbitrary convex surfaces. In this way more far-reaching results 
than in classical differential geometry have been obtained. An essential feature of the theory of 
convex bodies is that one operates directly with geometrical objects, points, lines, and so on, and 
avoids to a large extent the dependence on analytical methods such as coordinates or parametric 
representations. Recently a modern branch of geometry, the very general geometry of sets, has 
been built up on the foundation of these methods. 

The theory of convex bodies finds applications in many other branches of mathematics. For 
example, the geometry of numbers, whose foundations were also laid by MINKOWSKI, was developed 
in conjunction with this theory; certain results of the theory of convex bodies were applied to prob- 
lems of number theory. Connected with this is the very attractive and intuitive theory of packing. 
A typical problem in this theory is the following: How should one place pennies on a very large
table so as to accommodate as many pennies as possible? The pennies must not overlap. The solution
is that each penny must touch six others. The analogous problem about the densest packing of
spheres in space is still unsolved. The theory of convex bodies also has many connections with integral 
geometry. 
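The advantage of the arrangement in which each penny touches six others can be quantified by the fraction of the table that is covered. The sketch below compares it with the arrangement on a square grid, where each penny touches four others; it ignores the edge of the table (the table is 'very large'), and the function names and the choice of pennies of diameter 1 are assumptions for illustration.

```python
import math

def square_packing_density():
    """One penny of radius 1/2 per unit square of the grid."""
    return math.pi * 0.25 / 1.0

def hexagonal_packing_density():
    """One penny of radius 1/2 per rhombic cell of side 1 and area sqrt(3)/2."""
    return math.pi * 0.25 / (math.sqrt(3) / 2)

print(square_packing_density(), hexagonal_packing_density())
```

The hexagonal arrangement covers roughly nine tenths of the plane, the square arrangement under eight tenths.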


26.3. Integral geometry 


Integral geometry developed from problems in geometrical probability. The first such problem
was posed in 1777 by Count Georges DE BUFFON (1707-1788) (Buffon's needle problem): Parallel
lines are drawn on a plane at equal distances a apart. A needle of length l < a is thrown at random
onto the plane. What is the probability p that it should meet one of the lines? The answer is that
$p = 2l/(\pi a)$. Since l and a are known, and p can be estimated by statistical methods, this gives a
possibility of (approximately) determining $\pi$ experimentally.
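The experiment can be imitated on a computer. The sketch below is one possible simulation, not a prescribed procedure: the function name `buffon_estimate`, the default values of l and a, and the fixed seed are assumptions. The direction of the needle is drawn by rejection sampling in the unit disc, so that the value of $\pi$ is not used anywhere in the simulation itself.

```python
import random

def buffon_estimate(n_throws, l=0.8, a=1.0, seed=0):
    """Estimate pi from the observed frequency of needles meeting a line."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_throws):
        # distance of the needle's midpoint from the nearest line
        d = rng.uniform(0.0, a / 2)
        # random direction: a uniformly distributed point of the unit disc
        while True:
            x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
            r2 = x * x + y * y
            if 0.0 < r2 <= 1.0:
                break
        sin_abs = abs(y) / r2 ** 0.5      # |sin| of the angle between needle and lines
        if d <= 0.5 * l * sin_abs:
            hits += 1
    p = hits / n_throws                   # statistical estimate of p
    return 2 * l / (a * p)                # solve p = 2l/(pi a) for pi

print(buffon_estimate(200000))
```

The estimate converges slowly; its error decreases only like the reciprocal of the square root of the number of throws.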

Subsequently many similar examples were considered, and important partial results obtained. 
Wilhelm BLASCHKE (1885-1962) and his school were the first to found integral geometry as a proper 
geometrical discipline. The foundations of integral geometry of a geometrical space are certain 
measures, which are assigned invariantly to sets of geometrical objects; in the Euclidean plane, for 
example, with every elementary plane figure one associates its area. This measure is invariant, 
that is, congruent figures have the same area. Every figure can be regarded as a set of geometrical
objects, namely as the set of points belonging to it. The dual geometrical objects to points in a plane
are the lines. There now arises the obvious question whether one can define in an invariant way 
a measure for the set of lines. In this case it is possible. For example, one considers the set L of all 
lines l that meet a circle C of radius r. The position of such a line is fixed if its direction angle φ 
(the angle it makes with a fixed line, say the x-axis) and its distance p from the centre of the circle 
are given. Here φ must run through all directions in the interval 0 ≤ φ < 2π and p must run through 
all distances 0 ≤ p ≤ r. As a measure for the set of lines L one therefore chooses the product of the 
lengths of the two intervals. It can be shown that this measure is invariant and can be defined in a 
similar way for much more general sets of lines. In the given example, the measure for the set L 
is 2πr, which is the circumference of the circle. More generally, the measure of all lines that meet a 
convex plane figure is equal to the circumference of the figure. 
If one considers two convex figures that do not overlap, one 
can ask for the measure of all the lines that simultaneously 
meet both figures. M. W. CROFTON deduced that this measure 
is equal to the length of the crossed belt that encircles the two 
figures, minus the length of the uncrossed belt around the two 
figures (Fig.). 

Apart from the measures of sets of lines one can define a 
kinematic measure for sets of mutually congruent figures; for 
example, one can calculate the measure of all equilateral triangles of side 1 that meet the given 
figure. The kinematic measure was introduced by Henri POINCARÉ (1854-1912). 

Just as in the plane, so one can develop a more or less meaningful integral geometry in other 
geometries determined by Lie groups of transformation (in the sense of Klein’s Erlangen programme). 
In particular, the integral geometry of n-dimensional Euclidean space has been studied extensively. 
Also, for curved surfaces and for the Finsler geometry of the calculus of variations, attempts have 
been made to set up an integral geometry. 

Integral geometry finds applications in other branches of geometry, particularly in the theory 
of convex bodies. Also, in practical problems its methods have been applied in connection with 
mathematical statistics; for example, a method has been worked out for the statistical determination 
of the surface of a lung. 


27. Probability theory and statistics 


27.1. Combinatorial analysis 
      Permutations 
      Combinations 

27.2. Probability theory 
      Probability of random events 
      Random variables and distribution 
      Mean value or expectation and variance 
      Chebyshev’s inequality 
      The law of large numbers 
      Some important distributions 
      Limit theorems for sums of independent random variables 
      Stochastic processes 

27.3. Statistics 
      Design of experiments 
      The collection and evaluation of material 
      Regression and correlation 
      Methods of statistical estimation 
      Statistical testing procedures 
      Fields of application of statistics 


27.1. Combinatorial analysis 


Combinatorial analysis investigates the different possibilities for the arrangement of objects, 
for example, the questions: ‘How many ways are there of arranging four letters in a row?’, or ‘In 
how many different ways can five different numbers be selected from 90 numbers?’ — The objects 
of the investigation can be, for instance, numbers, letters, persons, tests. They are called elements 
and are denoted by numerals or letters. If two arrangements do not contain the same elements, 
for example, ab, cd, or if they do contain the same elements but not the same number of times for 
each, for example, aab, abb, then they are regarded as different. Arrangements such as aabb, abab 
are regarded as different only if the ordering is taken into account. 




Permutations 


Every arrangement of finitely many elements in any order in which all the elements are used is 
called a permutation of the given elements. For example, the arrangements acdbe, dbcae are per- 
mutations of the elements a, b, c, d, e. 


Number of permutations. The number of permutations of distinct elements can be obtained by 
inductive reasoning as follows: From two elements a and b the two permutations ab, ba can be 
formed. Of the three elements a, b, c, each can stand in the first place, and at the same time the other 
two can be ordered in two different ways: 

abc acb bac bca cab cba. 
It follows that there are 3 · 2 = 6 permutations. For four elements the corresponding argument 
leads to 4 · 3 · 2 = 24 permutations. In general, n elements can be arranged in 
1 · 2 · 3 · ... · (n − 1) · n = n! (read n factorial) ways. 


There are n! permutations of n distinct elements 


  Example: From the nine digits 1, 2, 3, ..., 9 one can form 9! = 362880 nine-figure numbers 
such that in every one each of the nine digits occurs only once. 
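
The counting argument can be checked by listing the arrangements directly. This is a small sketch using Python's standard library; the variable names are illustrative only.

```python
from itertools import permutations
from math import factorial

# All orderings of three distinct elements, as listed in the text.
three = ["".join(p) for p in permutations("abc")]

# Counting by enumeration agrees with n! for small n.
counts = {n: sum(1 for _ in permutations(range(n))) for n in range(1, 8)}

# Number of nine-figure numbers using each of the digits 1..9 exactly once.
nine_digit_numbers = factorial(9)
```

Enumeration is feasible only for small n, since n! grows very quickly; for larger n one computes the factorial directly.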


Table of factorials from 1! to 20! 

 n  n!          n  n!              n  n!                      n  n! 
 1  1           6  720            11  39 916 800             16  20 922 789 888 000 
 2  2           7  5 040          12  479 001 600            17  355 687 428 096 000 
 3  6           8  40 320         13  6 227 020 800          18  6 402 373 705 728 000 
 4  24          9  362 880        14  87 178 291 200         19  121 645 100 408 832 000 
 5  120        10  3 628 800      15  1 307 674 368 000      20  2 432 902 008 176 640 000 


If groups of identical elements occur in a number of elements, then the number of permutations 
is smaller than if all the elements are distinct. For example, in permuting the five elements $e_1 = a$, 
$e_2 = a$, $e_3 = b$, $e_4 = b$, $e_5 = b$, all orderings of the elements $e_1$ and $e_2$, and similarly all those of 
the elements $e_3$, $e_4$ and $e_5$, must be regarded as identical. Since the numbers of permutations of these 
elements are 2! and 3! respectively, the total number of distinct permutations is only 
$\frac{5!}{2!\,3!} = 10$. In general, if n elements consist of m groups containing $p_1, p_2, \ldots, p_m$ identical 
elements, respectively, and if the $p_i!$ permutations of the elements within each group 
($i = 1, 2, \ldots, m$) are regarded as being the same, then the total number of permutations 
of the n elements is given by: 

$$\frac{n!}{p_1!\, p_2! \cdots p_m!}$$ 

  Example: Could a passionate bridge player play all possible hands in the course of his life? 
The number of possible hands is $\frac{52!}{13!\,13!\,13!\,13!}$, because permutations of the 13 cards of each 
of the players do not lead to a different hand. This number exceeds $5.36 \cdot 10^{28}$. If he played 200 
hands every day for 100 years, that is, 7 300 000 hands, the player could play only a tiny fraction 
of the total. 
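
The formula for groups of identical elements, and the bridge example, can be evaluated exactly with integer arithmetic. The sketch below is illustrative; the function name `multiset_permutations` is an assumption and not taken from the text.

```python
from itertools import permutations
from math import factorial

def multiset_permutations(*group_sizes):
    """n! / (p_1! p_2! ... p_m!) for groups of p_i identical elements."""
    n = sum(group_sizes)
    result = factorial(n)
    for p in group_sizes:
        result //= factorial(p)   # each intermediate quotient is an integer
    return result

# Brute-force check for the five elements a, a, b, b, b.
distinct = len(set(permutations("aabbb")))

# Bridge: 52 cards dealt into four hands of 13.
bridge_deals = multiset_permutations(13, 13, 13, 13)
hands_in_a_lifetime = 200 * 365 * 100
fraction_played = hands_in_a_lifetime / bridge_deals
```

The brute-force count is only practical for a handful of elements; the formula handles the bridge case, where enumeration is out of the question.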

Lexicographical ordering. The search for all n! permutations of n elements is greatly simplified 
by lexicographical ordering. In this one first fixes some natural order, for example, for numbers 
their order of magnitude, or for letters their alphabetical order. Permutations are then said to be 
in lexicographical order if of two different permutations the one whose first element comes earlier 
in the natural order comes first; if they have the same first element, then the one whose second 
element comes earlier comes first, and if the first two elements are the same for both, then one distinguishes them 
according to the third element, and so on. The first two of the following pairs of permutations are 
in lexicographical order, as are also the six permutations of the three elements given above, but the 
third pair is not. 
l.abcfe ek tere a ere 

abcgf acbih abdef 
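
The comparison rule described above can be written out directly. This is a minimal sketch with an illustrative function name; it assumes that alphabetical order is the natural order and uses the pair abcfe, abcgf as a sample input.

```python
from itertools import permutations

# For sorted input, itertools.permutations yields the arrangements
# in lexicographical order.
perms = ["".join(p) for p in permutations("abc")]

def in_lexicographic_order(first, second):
    """True if `first` precedes `second`: compare position by position
    and decide at the first place where the two arrangements differ."""
    for x, y in zip(first, second):
        if x != y:
            return x < y
    return False  # identical arrangements: neither precedes the other

pair_in_order = in_lexicographic_order("abcfe", "abcgf")
pair_out_of_order = in_lexicographic_order("abdfe", "abdef")
```

For strings of letters this rule coincides with Python's built-in string comparison, so `sorted` applied to a list of permutations produces the lexicographical order as well.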


Inversion. Two elements in a permutation are said to form an inversion if their ordering is opposite 
to their natural order. In the permutation cdbea formed from the elements a, b, c, d, e, the element 
c precedes b, c precedes a, d precedes b, d precedes a, b precedes a and e precedes a, and each of these 
pairs forms an inversion. The given permutation thus contains six inversions. If the number of inversions 
of a permutation is even, the permutation is said to be even; otherwise it is called odd. 
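
Counting inversions is a simple double loop over pairs of positions. The sketch below uses illustrative names and takes the natural order as an explicit argument.

```python
def inversions(perm, natural_order):
    """List the pairs (x, y) where x stands before y in perm
    although y comes before x in the natural order."""
    rank = {element: i for i, element in enumerate(natural_order)}
    return [(perm[i], perm[j])
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
            if rank[perm[i]] > rank[perm[j]]]

pairs = inversions("cdbea", "abcde")
parity = "even" if len(pairs) % 2 == 0 else "odd"
```

For the permutation cdbea this reproduces the six pairs named in the text, so the permutation is even.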




Combinations 


Every collection of k elements taken from n elements is called a combination of the kth class or 
kth order. 


  Example: ab, ac, ad, bc, bd, cd, bb, dd, ... are combinations of the second class from the four 
elements a, b, c, d. 

If only distinct elements are chosen for each collection, one speaks of combinations without 
repetition; otherwise they are called combinations with repetition. If two combinations containing 
the same elements, but in a different order, are regarded as being distinct, one speaks of permutations 
in an extended sense. 


The number of permutations. The number of permutations of k elements selected from n elements 
will be denoted by $^nP_k$. The first element can be chosen in n different ways, the second can then be 
chosen in (n − 1) ways, the third in (n − 2) ways, and the kth in (n − k + 1) ways. Hence 

$$^nP_k = n(n-1)\cdots(n-k+1) = \frac{n!}{(n-k)!}.$$ 


Thus, for n = 4, the number of permutations of the 4 elements a, b, c, d taken 3 at a time is 
$^4P_3 = 4 \cdot 3 \cdot 2 = 24$. These are 
abc, abd, acb, acd, adb, adc, bac, bad, bca, bcd, 
bda, bdc, cab, cad, cba, cbd, cda, cdb, dab, dac, 
dba, dbc, dca, dcb. 
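
The 24 arrangements can be generated and counted mechanically. A brief sketch using the standard library (`math.perm` requires Python 3.8 or later); the variable names are illustrative.

```python
from itertools import permutations
from math import perm

# Ordered selections of 3 out of the 4 elements a, b, c, d.
arrangements = ["".join(p) for p in permutations("abcd", 3)]

# n!/(n-k)! with n = 4, k = 3.
count = perm(4, 3)
```

The enumeration starts with abc and ends with dcb, matching the order of the list in the text.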

If repetitions are allowed, then the second element can also be chosen in n ways, and so can the 
third, and so on. Thus, for n = 3 elements, the number of permutations taken 2 at a time with 
repetition is $^3P_2^{(r)} = 3^2 = 9$. These are 

aa, ab, ac, ba, bb, bc, ca, cb, cc. 
More generally, $^nP_k^{(r)} = n^k$. 


  Example 1: From the digits 1, 2, ..., 9, 0 one can form $^{10}P_3^{(r)} = 10^3 = 1000$ permutations of 
the third class with repetition. These are precisely the numbers 000, 001, 002, up to 999. 
  Example 2: By means of Braille the blind are able to feel 
letters, numbers and punctuation, which are printed on paper 
as arrangements of six points that appear either raised or 
non-raised. Point and no point are the two variable elements, 
and the number of possible signs represented by them 
is the number of permutations with repetition of these two 
elements taken 6 at a time. This gives $2^6 = 64$ signs. These 
possibilities are enough to represent the blind alphabet, 
together with numbers and punctuation (Fig.). 

27.1-1 Braille for ‘problem’ 
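
Permutations with repetition correspond to a Cartesian product of the element set with itself. The following sketch checks the three counts quoted above; encoding a Braille point as 1 (raised) or 0 (not raised) is an illustrative assumption.

```python
from itertools import product

# Two places filled from three elements, repetition allowed.
pairs = ["".join(p) for p in product("abc", repeat=2)]

# Three places filled from ten digits: the numbers 000 to 999.
three_digit = ["".join(p) for p in product("0123456789", repeat=3)]

# Six points, each either raised (1) or not raised (0).
braille_signs = list(product((0, 1), repeat=6))
```

The sign with no raised point is included among the 64 patterns counted here.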


The number of combinations. In combinations the order in which the elements occur is not taken 
into account. The number of combinations of k elements taken from n elements is denoted by $^nC_k$. 
From the four elements a, b, c, d the following combinations of the second class can be formed: 
ab, ac, ad, bc, bd, cd. If one permutes the elements of each combination, one 
obtains the number of permutations $^4P_2$ of two elements taken from four, which is 2! times as large. 
Similarly, the number of permutations $^nP_k$ of k elements taken from n elements is k! times the number 
of combinations $^nC_k$. Thus, 

$$k! \cdot {}^nC_k = {}^nP_k = \frac{n!}{(n-k)!}, \quad\text{or}\quad {}^nC_k = \binom{n}{k} = \frac{n!}{(n-k)!\,k!}.$$ 

The $\binom{n}{k}$ are the binomial coefficients. 

  Example: In the game of Bingo k = 5 different numbers can be chosen from n = 90 numbers 
in $\binom{90}{5} = 43\,949\,268$ ways. Only if one has cards with all these possibilities is one certain to have 
a line of five. The number of fours and threes can be calculated in a similar way. From the five 
(correct) numbers one is always missing; thus, there are $\binom{5}{4} = 5$ combinations of four. For three 
correct numbers, with two always missing, there are $\binom{5}{3} = 10$ combinations. Each of the five 
combinations of four occurs $\binom{85}{1} = 85$ times among the possible incorrect lines of five, 
since this is the number of possible ways of adding a fifth number from the remaining 85 incorrect 
ones. Thus, the number of lines with four correct numbers is $\binom{5}{4}\binom{90-5}{1} = \binom{5}{4}\binom{85}{1}$ 
= 5 · 85 = 425. Each combination of three numbers can be combined with two of the remaining 
numbers in $\binom{85}{2} = 85 \cdot 42$ ways so that neither a line of four correct nor one of five correct 
results. Thus, the number of lines with three correct numbers is $\binom{5}{3}\binom{85}{2} = 35\,700$. 
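
The binomial coefficients in the Bingo example can be recomputed with the standard library (`math.comb`, Python 3.8 or later). This is a short sketch; the variable names are illustrative.

```python
from itertools import combinations
from math import comb

# Combinations of the second class from a, b, c, d (order ignored).
pairs = ["".join(c) for c in combinations("abcd", 2)]

lines_of_five = comb(90, 5)               # all possible lines
four_correct = comb(5, 4) * comb(85, 1)   # four right, one wrong
three_correct = comb(5, 3) * comb(85, 2)  # three right, two wrong
```

The same pattern, a product of two binomial coefficients, counts the lines with any given number of correct entries.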
If repetitions are allowed, the number of combinations of n elements of the kth class is denoted 
by $^nC_k^{(r)}$. For the three elements a, b, c and for the n elements $a_1, a_2, \ldots, a_n$ one obtains the 
following combinations of the second class: 

aa ab ac      $a_1a_1$  $a_1a_2$  $a_1a_3$  ...  $a_1a_n$ 
   bb bc                $a_2a_2$  $a_2a_3$  ...  $a_2a_n$ 
      cc                          $a_3a_3$  ...  $a_3a_n$ 
                                            ... 
                                                 $a_na_n$ 

Thus, $^3C_2^{(r)} = 3 + 2 + 1 = 6$ and $^nC_2^{(r)} = n + (n-1) + (n-2) + \cdots + 1 = \binom{n+1}{2}$. In 
general, $^nC_k^{(r)} = \binom{n+k-1}{k}$; this statement is true for k = 2 as has been shown above, and can 
be established for every natural number k > 2 by the method of mathematical induction. 
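
The general formula can be spot-checked by enumeration for small n and k. The sketch below is not a proof, only a numerical check; the function name is illustrative.

```python
from itertools import combinations_with_replacement
from math import comb

# Combinations of the second class from a, b, c with repetition.
second_class = ["".join(c) for c in combinations_with_replacement("abc", 2)]

def count_with_repetition(n, k):
    """Binomial coefficient (n + k - 1 over k)."""
    return comb(n + k - 1, k)

# Compare the formula with direct enumeration for small cases.
formula_agrees = all(
    sum(1 for _ in combinations_with_replacement(range(n), k))
    == count_with_repetition(n, k)
    for n in range(1, 6)
    for k in range(1, 5)
)
```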


27.2. Probability theory 


Historical background. The beginning of probability theory goes back to the middle of the 17th 
century. An enthusiastic gambler, the Chevalier DE MERE, asked Blaise PASCAL (1623-1662) to solve 
a problem that was important for him: to describe the distribution of the wins of two gamblers, 
given that at an intermediate point in the game one had won n < m rounds and the other p < m, 
and that it had originally been decided that the first one to win m rounds should win the whole 
game. PASCAL communicated his solution to Pierre FERMAT (1601-1665), who also found a method 
of solution. A third one came from Christian HUYGENS (1629-1695). These learned men recognized 
the significance of the question for the investigation of the laws governing random events. The 
concepts and the first methods of the new science developed from problems of games of chance. 
Much later, in the 19th century, the rapidly increasing interest in natural sciences made it necessary 
to extend the theory of probability beyond the framework of games of chance. This development 
is closely linked with the names of Jacob BERNOULLI (1654-1705), Abraham de MOIVRE (1667-1754), 
Pierre-Simon de LAPLACE (1749-1827), Carl Friedrich GAUSS (1777-1855), Siméon Denis POISSON 
(1781-1840), Pafnuti Lvovich CHEBYSHEV (1821-1894), Andrei Andreevich MARKOV (1856-1922), 
and most recently with those of Alexander Yakovlevich KHINCHINE (1894-1959) and Andrei Nikolaevich 
KOLMOGOROV (b. 1903). Connected with the investigation of the laws governing random 
events is that of mass events. For example, the production of an article in everyday use is a mass 
event and the occurrence of a faulty article among them is a random event. Probability theory 
today is connected with many other branches of mathematics and with many fields of natural 
science, technology, and economics. 


Probability of random events 


Event. An event E, in the sense of a random event, is the result of a trial that can, but need not 
occur. A trial can be an observation or an experiment and is characterized by a set of conditions 
to be satisfied and by its repeatability. The limiting cases are also regarded as events: the certain 
event S, which definitely occurs, and the impossible event ∅, which never occurs. In the trial ‘throw 
of a die’, for example, $E_3$ is the event ‘3 turns up’, S is ‘1 or 2 or 3 or 4 or 5 or 6 turns up’, and 
‘7 turns up’ is ∅. 

Events are mutually exclusive or incompatible if as the result of a trial only one of them can occur; 
for example, in the trial ‘throw of a die’ the events $E_i$ (i turns up), i = 1, 2, 3, 4, 5, 6, are mutually 
exclusive, since only one of them can occur. In drawing a ball out of an urn that contains red and 
black balls, $E_1$ ‘drawing a red ball’ and $E_2$ ‘drawing a black ball’ are incompatible, since they cannot 
occur simultaneously. All events that are mutually exclusive in pairs form a complete system of 
events if, as the result of a trial, one of them must necessarily occur, for example, the events 
$E_i$ (i = 1, 2, 3, 4, 5, 6) for the trial ‘throw of a die’. 

If two events $E_1$ and $E_2$ form a complete system of events, each of them is complementary to the 
other. In tossing a coin, for example, ‘heads’ and ‘tails’ are complementary. 

One speaks of the sum C of the two events A and B, denoted by C = A ∪ B or C = A + B, if 
in a trial at least one of the events A or B occurs. This idea can be extended to more than two events. 
For example, for the trial ‘throw of a die’ the event ‘an even number turns up’ is equivalent to the 
sum of the events $E_2 + E_4 + E_6$. 

One speaks of the product C of the events A and B, denoted by C = A ∩ B or C = A · B (or 
simply C = AB) if in a trial the events A and B both occur. For the trial ‘throw of two dice’, for 
example, the event C, ‘throw of 12’, occurs when a 6 is thrown with each die. The idea of a product 
can also be extended to more than two events. 


The classical definition of probability. Although there exists an axiomatic theory of probability, 
the important laws can be derived from the classical definition: if a trial has n equally likely 
possible outcomes, of which m are favourable to the event E, then P(E) = m/n. 


Thus, P(E) is always a number between 0 and 1; 0 ≤ P(E) ≤ 1. For the sure event S, P(S) = 1. 
The probability of an event is often expressed as a percentage. 

For the trial ‘throw of a die’ the events $E_i$ (i turns up) with i = 1, 2, 3, 4, 5, 6 are all possible. 
If the throw of a 3 is regarded as favourable, then the probability of its occurrence is $P(E_3) = 1/6$. 
This assumes an ideal die that is geometrically and mechanically homogeneous, so that no side is 
favoured more than another on account of the form and mass distribution. The events $E_i$ are then 
equally probable. They form a complete system, and it follows that their sum is the certain event 
of throwing one of the numbers 1 to 6. But the probability of this is 1, and therefore the probability 
of each $E_i$ is 1/6. For the event $E = E_2 + E_4 + E_6$ (an even number turns up) one obtains 
P(E) = 3/6 = 1/2. 
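
For an ideal die the classical probability is a ratio of counts, favourable cases over possible cases. A minimal sketch with exact fractions follows; the helper `probability` and the use of predicates to describe events are illustrative choices.

```python
from fractions import Fraction

outcomes = range(1, 7)   # the six equally likely faces of an ideal die

def probability(event):
    """Classical probability: favourable cases / possible cases."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

p_three = probability(lambda o: o == 3)            # '3 turns up'
p_even = probability(lambda o: o % 2 == 0)         # 'an even number turns up'
p_certain = probability(lambda o: 1 <= o <= 6)     # the certain event S
p_impossible = probability(lambda o: o == 7)       # the impossible event
```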

The addition law in probability theory. If as the result of a trial n events are possible, and if $m_i$ 
of these are favourable to the occurrence of the event $E_i$ for i = 1, 2, ..., k, then $m = \sum_{i=1}^{k} m_i$ are 
favourable for the occurrence of the event $E = \sum_{i=1}^{k} E_i$, provided that the events $E_i$ (i = 1, 2, ..., k) 
are mutually exclusive. It then follows from $P(E_i) = m_i/n$ for i = 1, 2, ..., k that 

$$P(E) = \frac{m}{n} = \frac{m_1 + m_2 + \cdots + m_k}{n} = \sum_{i=1}^{k} \frac{m_i}{n} = \sum_{i=1}^{k} P(E_i).$$ 


The probability of the sum of a finite number of mutually exclusive events is equal to the sum of the 
probabilities of these events: $P(E_1 + E_2 + \cdots + E_k) = P(E_1) + P(E_2) + \cdots + P(E_k)$. 


  Example: As a result of throwing an ideal die, let $E_4$ 
and $E_5$ be the events of 4 and 5 turning up. Then for 
$E = E_4 + E_5$ (a 4 or a 5 turns up) it follows that 
$P(E) = P(E_4 + E_5) = P(E_4) + P(E_5) = 2/6 = 1/3$ (Fig.). 


If in a trial only the k events $E_i$ with i = 1, 2, ..., k are 
possible, then they form a complete system of events; since 
$n = m = m_1 + m_2 + \cdots + m_k$ in this case, it follows 
that P(E) = 1. For two mutually exclusive events $E_1$ and 
$E_2$ that form a complete system, $P(E_1 + E_2) = P(E_1) + P(E_2) = 1$, or $P(E_2) = 1 - P(E_1)$; 
for example, if $E_1$ is the event ‘birth of a boy’ and $E_2$ the 
event ‘birth of a girl’. 

27.2-1 The addition law for an ideal die 

If in a trial with n possible outcomes the event $E_1$ occurs $m_1$ times and the event $E_2$ occurs $m_2$ 
times, but these two events are not mutually exclusive, the addition theorem can still be used if 
one considers the l cases when the event $E_1E_2$ occurs. There are three groups of mutually exclusive 
events: $(m_1 - l)$ cases in which only the event $E_1$ occurs, $(m_2 - l)$ cases with $E_2$ alone, 
and l cases in which both occur. By the addition law it therefore follows that 

$$P(E_1 + E_2) = \frac{m_1 - l}{n} + \frac{m_2 - l}{n} + \frac{l}{n} = \frac{m_1}{n} + \frac{m_2}{n} - \frac{l}{n} = P(E_1) + P(E_2) - P(E_1E_2).$$ 

If $P(E_1E_2)$ is not known, the estimate $P(E_1 + E_2) \le P(E_1) + P(E_2)$ holds. This addition theorem 
can be extended to more than two events. 
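
The addition theorem for events that are not mutually exclusive can be illustrated with one throw of an ideal die. The two events chosen below, ‘even number’ and ‘number greater than 3’, are hypothetical examples that do not appear in the text.

```python
from fractions import Fraction

S = set(range(1, 7))          # possible outcomes of one throw

def P(event):
    """Classical probability of a set of outcomes."""
    return Fraction(len(event), len(S))

E1 = {2, 4, 6}                # an even number turns up
E2 = {4, 5, 6}                # a number greater than 3 turns up

lhs = P(E1 | E2)                          # P(E1 + E2)
rhs = P(E1) + P(E2) - P(E1 & E2)          # P(E1) + P(E2) - P(E1 E2)
upper_bound = P(E1) + P(E2)               # estimate when P(E1 E2) is unknown
```

Here the outcomes 4 and 6 belong to both events, so simply adding the two probabilities would count them twice.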


Conditional probabilities. An unconditional probability depends only upon the set of conditions 
initially fixed for the trial; for example, that every die used is ideal, so that each number is equally 
likely at each throw. A conditional probability depends, in addition, on at least one further condition. 
The probability of the occurrence of the event E on the assumption that the event F has already 
occurred is then denoted by P(E/F). 


  Example 1: If an urn contains n balls, of which m are black and (n − m) white, then for the 
trial ‘drawing with replacement’ two events are possible, $F_1$ ‘drawing a black ball’ and $F_2$ ‘drawing 
a white ball’. For these two one obtains the unconditional probabilities $P(F_1)$ and $P(F_2)$. For 
the trial ‘drawing without replacement’ the number of balls available for the second withdrawal 
depends upon the outcome of the first one. If the event $F_1$ occurred, then there are (m − 1) black 
and (n − m) white balls in the urn; if $F_2$ occurred, there are m black and (n − m − 1) white balls. 
For the events $E_1$ ‘a black ball at the second withdrawal’ and $E_2$ ‘a white ball at the second withdrawal’ 
there are therefore four conditional probabilities $P(E_1/F_1)$, $P(E_2/F_1)$, $P(E_1/F_2)$ and 
$P(E_2/F_2)$. 


              first withdrawal      second withdrawal 
Event         $F_1$    $F_2$        $(E_1/F_1)$    $(E_2/F_1)$    $(E_1/F_2)$    $(E_2/F_2)$ 
Probability   m/n      (n−m)/n      (m−1)/(n−1)    (n−m)/(n−1)    m/(n−1)        (n−m−1)/(n−1) 
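
The four conditional probabilities for drawing without replacement can be confirmed by enumerating all ordered pairs of distinct balls. The urn size n = 10 with m = 4 black balls is a hypothetical choice made only for this sketch.

```python
from fractions import Fraction
from itertools import permutations

n, m = 10, 4                                   # assumed urn: 4 black, 6 white
balls = ["black"] * m + ["white"] * (n - m)

# Every ordered pair of distinct balls is one equally likely double draw.
draws = list(permutations(range(n), 2))

def conditional(second, first):
    """P(second ball has colour `second` / first ball had colour `first`)."""
    pool = [d for d in draws if balls[d[0]] == first]
    hits = [d for d in pool if balls[d[1]] == second]
    return Fraction(len(hits), len(pool))

p_black_after_black = conditional("black", "black")
p_white_after_black = conditional("white", "black")
p_black_after_white = conditional("black", "white")
p_white_after_white = conditional("white", "white")
```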


  Example 2: In throwing two ideal dice 36 events $E_{a,b} = (a, b)$ with a = 1, 2, ..., 6 and 
b = 1, 2, ..., 6 occur, and they are mutually exclusive in pairs. In this notation, as with ordered 
number pairs in general, (a, b) is to be regarded as different from (b, a). One therefore obtains the 
unconditional probability $P(E_{a,b}) = 1/36$. For the event ‘total score 8 at one throw’ the addition 
theorem gives P(a + b = 8) = 5/36, because there are five favourable events corresponding to 
a + b = 2 + 6 = 3 + 5 = 4 + 4 = 5 + 3 = 6 + 2. Under the additional condition $S_1$, ‘total 
score is even’, one obtains on the other hand $P(E_{a,b}/S_1) = 1/18$ and $P(a + b = 8/S_1) = 5/18$. 
The condition ‘score of 4 with the second 
die’ gives another conditional probability. 
For b = 4, (a, b) can be one of the six 
number pairs (1, 4), (2, 4), (3, 4), (4, 4), 
(5, 4), (6, 4), so that $P(E_{a,b}/(b = 4))$ 
= 6/36 = 1/6. The same numerical value 
1/6 is obtained for the probability that 
out of the six number pairs with a = 4, 
the one with b = 4 occurs (Fig.). 
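
The probabilities in this example can be recomputed by counting within the 36 ordered pairs. In the sketch below a conditional probability is obtained by restricting the count to the throws that satisfy the condition; the function and variable names are illustrative, and the pairs (3, 5) and (2, 4) are arbitrary sample events.

```python
from fractions import Fraction
from itertools import product

throws = list(product(range(1, 7), repeat=2))   # 36 ordered pairs (a, b)

def P(event, given=lambda t: True):
    """P(event / given), counted within the throws satisfying `given`."""
    pool = [t for t in throws if given(t)]
    return Fraction(sum(1 for t in pool if event(t)), len(pool))

total_8 = lambda t: t[0] + t[1] == 8
even_total = lambda t: (t[0] + t[1]) % 2 == 0
second_is_4 = lambda t: t[1] == 4

p_total_8 = P(total_8)
p_total_8_given_even = P(total_8, even_total)
p_pair_given_even = P(lambda t: t == (3, 5), even_total)
p_pair_given_second_4 = P(lambda t: t == (2, 4), second_is_4)
```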


27.2-2 Probabilities in throwing two dice (axes: first die, second die) 
a) 5/36 for total score 8 
b) 6/36 = 1/6 for score of 4 with second die 


Multiplication law for probabilities. If in a trial with n possible results the event F occurs k times, 
so that P(F) = k/n, and if m of these k events F satisfy a further condition under which the event 
(E/F) occurs, then the probability for this event is P(E/F) = m/k. On the other hand, the fraction 
m/n represents P(EF), since the product EF denotes an event that satisfies both the conditions for 
F to occur and also those for E. But m/n = k/n · m/k, and it follows that P(EF) = P(F) · P(E/F). 

The probability P(EF) for the simultaneous occurrence of two events E and F is the product of 
the probability P(F) of the first event F and the conditional probability P(E/F) of the event E under 
the assumption that the event F has already occurred. 


  Example: When two dice are thrown, the 36 events $E_{a,b}$ are possible. Let F be the event ‘score 
(a, b) such that a + b is divisible by 2’ and (E/F) the events among these for which a + b is also 
divisible by 3. Consequently EF are the events for which a + b is divisible by 6. One obtains 
P(F) = 18/36, P(E/F) = 6/18 and P(EF) = 6/36 = 1/6, because EF consists precisely of the 
6 events (1, 5), (2, 4), (3, 3), (4, 2), (5, 1), and (6, 6). Alternatively, P(EF) = 6/36 = (18/36) · (6/18) 
= P(F) · P(E/F). 
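
The three probabilities in this example follow from counting k, m and n directly. A short sketch with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

throws = list(product(range(1, 7), repeat=2))        # n = 36 possible results

F = [t for t in throws if sum(t) % 2 == 0]           # k throws: sum divisible by 2
EF = [t for t in F if sum(t) % 3 == 0]               # m of these: also divisible by 3

P_F = Fraction(len(F), len(throws))                  # k/n
P_E_given_F = Fraction(len(EF), len(F))              # m/k
P_EF = Fraction(len(EF), len(throws))                # m/n
```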


The law of total probability. If the events $F_i$ for i = 1, 2, ..., n form a complete system and if E 
is a further event, then the events $EF_i$ for i = 1, ..., n are mutually exclusive in pairs. The sum 
of these events is equivalent to the event E. By the addition law for probabilities the probability 
P(E) is given by the sum of the $P(EF_i)$ for i = 1, 2, ..., n. But for each term in the sum one obtains 
$P(EF_i) = P(F_i)\,P(E/F_i)$, by the multiplication law for probabilities. Thus, combining these results, 

$$P(E) = \sum_{i=1}^{n} P(F_i)\, P(E/F_i).$$ 

This unconditional probability P(E) is called the total probability. 


  Example: Two urns contain black and white balls in proportions that can be different for each 
urn. Then the event E ‘drawing a white ball from one of the two urns’ can be represented as the sum 
of two mutually exclusive events $EF_1 + EF_2$, where $F_1$ denotes ‘drawing a ball from urn 1’ 
and $F_2$ ‘drawing a ball from urn 2’. By the law of total probability the probability for the event E 
is given by $P(E) = P(EF_1) + P(EF_2) = P(F_1)\,P(E/F_1) + P(F_2)\,P(E/F_2)$. Thus, P(E) can be 
calculated from $P(F_1)$, $P(F_2)$, $P(E/F_1)$ and $P(E/F_2)$. 
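
A numerical instance may make the two-urn example concrete. All numbers below are hypothetical: each urn is assumed to be chosen with probability 1/2, urn 1 is assumed to hold 3 white balls out of 10, and urn 2 is assumed to hold 6 white balls out of 8.

```python
from fractions import Fraction

def total_probability(prior, conditional):
    """Sum of P(F_i) * P(E/F_i) over a complete system F_1, ..., F_n."""
    assert sum(prior) == 1, "the F_i must form a complete system"
    return sum(p_f * p_e for p_f, p_e in zip(prior, conditional))

P_F = [Fraction(1, 2), Fraction(1, 2)]            # assumed: urns equally likely
P_E_given_F = [Fraction(3, 10), Fraction(6, 8)]   # assumed proportions of white balls

P_E = total_probability(P_F, P_E_given_F)         # probability of a white ball
```

With these assumed values the result lies between the two conditional probabilities, as it must for any weighted average.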


Independent events. Two events E and F are independent of one another if the occurrence or 
non-occurrence of one has no influence on the occurrence or non-occurrence of the other; for 
example, in throwing two ideal dice the score thrown with one does not depend on that thrown 
with the other. If E denotes ‘a score of 4’ and the dice are distinguished by the indices 1 and 2, 
then $P(E_1) = P(E_2) = 1/6$, but $P(E_1/E_2) = 1/6$ also. For the event $E_1 \cap E_2 = E_1 \cdot E_2$, that is, 
‘a score of 4 with each die’, it then follows that $P(E_1 \cap E_2) = 1/36 = 1/6 \cdot 1/6$. This result can be 
generalized. 
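
The usual form of the generalization is the product rule P(EF) = P(E) · P(F) for independent events. The sketch below checks it for the two dice and, for contrast, shows a pair of events for which it fails; the event ‘total score 8’ is brought in here only as an illustrative counterexample.

```python
from fractions import Fraction
from itertools import product

throws = list(product(range(1, 7), repeat=2))   # ordered pairs (die 1, die 2)

def P(event):
    return Fraction(sum(1 for t in throws if event(t)), len(throws))

E1 = lambda t: t[0] == 4                 # score of 4 with die 1
E2 = lambda t: t[1] == 4                 # score of 4 with die 2
G = lambda t: t[0] + t[1] == 8           # total score 8 (depends on die 1)

independent = P(lambda t: E1(t) and E2(t)) == P(E1) * P(E2)
dependent = P(lambda t: E1(t) and G(t)) != P(E1) * P(G)
```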


The axiomatic definition of probability. The development of the natural sciences and technology 
led to problems to which the classical definition of probability could no longer be applied uncritically. 
One cannot always assume that the number of possible cases is finite and that the individual cases 
are equally likely. For example, it is difficult, purely from arguments of symmetry, to determine 
the probability that during a specified time interval m conversations out of a total of n will be taking 
place over a telephone cable. The statistical definition of probability is superior to the classical one 
for these problems, but its character is more descriptive than formal-mathematical. It became 
necessary to investigate systematically the basic concepts of the calculus of probability and to establish 
the conditions for its applicability in an axiomatic structure. Of the different approaches that were 
proposed, the one generally followed today is that which KOLMOGOROV developed at the beginning 
of the 1930’s for the solution of the new problems. He linked the concept of probability with 
modern set theory, measure theory, and functional analysis. His method proceeds from the main 
properties of probability that are valid whether they are based on the classical or the statistical 
definition. KOLMOGOROV created an axiomatic foundation for the concept of probability, which 
includes both the classical and the statistical definition and, in addition, satisfies the more stringent 
requirements of modern natural science and technology. 

This axiomatic development is based on a set S of elementary events and a system B of subsets 
of S. The elements of the system B, that is, the subsets of S, are called random events. If the system B 
of random events satisfies the following conditions, then it is called a Borel field of events. 


Borel field: 

1. The set S is an element of B. 

2. If two sets $E_1$ and $E_2$ are elements of B, then their union $E_1 \cup E_2$, their intersection $E_1 \cap E_2$, 
and their complements $\bar{E}_1$ and $\bar{E}_2$ are also elements of B. 

3. If sets $E_1, E_2, \ldots, E_n, \ldots$ are elements of B, then their union $E_1 \cup E_2 \cup \cdots \cup E_n \cup \cdots$ and 
their intersection $E_1 \cap E_2 \cap \cdots \cap E_n \cap \cdots$ are also elements of B. 


27.2-3 Event $E_1$ and event $\bar{E}_1$ 
27.2-4 Event $E_1 \cup E_2$ and event $\overline{E_1 \cup E_2}$ 
27.2-5 Event $E_1 \cap E_2$ and event $\overline{E_1 \cap E_2}$ 




If only conditions 1. and 2. are satisfied, one speaks of a field of events. 

By the second condition $\bar{S}$, that is, the empty set ∅, must be an element of B. It is called the 
impossible event. The random events $E_1$, $\bar{E}_1$, $E_1 \cup E_2$, $\overline{E_1 \cup E_2}$, $E_1 \cap E_2$, and $\overline{E_1 \cap E_2}$ are illustrated 
in the figures, in which random events are represented by points of a square. Each point set then 
represents a random event. 

The explanation will be helped by an example. If a die is thrown, then the set of elementary 
events consists of the six elements $e_i$ (i = 1, 2, ..., 6), where $e_i$ denotes that a throw results in a 
score i. The system of random events B then consists of $2^6 = 64$ elements: 
$(e_1), (e_2), \ldots, (e_6), (e_1, e_2), \ldots, (e_5, e_6), (e_1, e_2, e_3), \ldots, (e_4, e_5, e_6), \ldots, (e_1, e_2, e_3, e_4, e_5, e_6)$ 
and the empty set ∅. Enclosed 
in each pair of brackets stand the elements of S of which the corresponding subset of S is composed. 
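
For the die the system B is simply the set of all subsets of S, and the conditions for a field of events can be verified exhaustively. A small sketch, representing the elementary event $e_i$ by the integer i (an illustrative encoding):

```python
from itertools import combinations

S = frozenset(range(1, 7))     # elementary events e_1, ..., e_6

# The system B: every subset of S, from the empty set up to S itself.
B = {frozenset(c)
     for size in range(len(S) + 1)
     for c in combinations(sorted(S), size)}

contains_S = S in B
# Closure under union, intersection and complement (conditions 1 and 2).
closed = all(
    (E1 | E2) in B and (E1 & E2) in B and (S - E1) in B
    for E1 in B
    for E2 in B
)
```

Because B is finite here, condition 3 on countable unions and intersections adds nothing new in this example.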

On the basis of the system B of random events, in which S denotes the certain event, $\bar{S}$ the 
impossible event and E and $\bar{E}$ opposite events, the probability of the occurrence of an event is defined 
by Kolmogorov’s system of axioms. 


Kolmogorov's system of axioms: 

1. Axiom: To every random event E in the field of events there is assigned a non-negative real number 
P(E), called the probability of E. 

2. Axiom: The probability of the certain event S is 1, P(S) = 1. 

3. Axiom: If the events $E_1, E_2, \ldots, E_n$ are mutually exclusive in pairs, then 
$P(E_1 \cup E_2 \cup \cdots \cup E_n) = P(E_1) + P(E_2) + \cdots + P(E_n)$. 


The system is supplemented by the following extended addition axiom, which makes it possible 
to take account of those events (of frequent occurrence in probability theory) that are composed 
of infinitely many partial events. 


Extended addition axiom: If the occurrence of an event E is equivalent to the occurrence of an 
arbitrary one of the events $E_1, E_2, \ldots, E_n, \ldots$, mutually exclusive in pairs, then 
$P(E) = P(E_1) + P(E_2) + \cdots + P(E_n) + \cdots$ 


From these axioms it follows as a first consequence that P(E) ≤ 1 for every event E in B. The 
axiom system is free from contradictions but not complete. The structure of probability theory 
is based on it. The measure-theoretical concept of probability, in conjunction with a sufficiently 
wide interpretation of frequency, is the basis of mathematical statistics. 
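
The first consequence mentioned above can be derived in one line. The following is a sketch of the usual argument, assuming that the opposite event $\bar{E}$ belongs to B (condition 2 of the Borel field), so that E and $\bar{E}$ are mutually exclusive and together make up S:

```latex
\begin{aligned}
P(E) + P(\bar{E}) &= P(E \cup \bar{E}) = P(S) = 1
  && \text{(Axioms 3 and 2)} \\
P(\bar{E}) &\ge 0
  && \text{(Axiom 1)} \\
\Rightarrow\quad P(E) &= 1 - P(\bar{E}) \le 1 .
\end{aligned}
```

The same computation with E = S gives $P(\bar{S}) = 0$ for the impossible event.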


Random variables and distribution 


A random variable X is a variable which, in different experiments carried out under the same 
conditions, assumes different values x, each of which then represents a random event. In what 
follows only discrete random variables X will be considered, which assume a finite number or at 
most a countably infinite number of values $x_i$, or continuous random variables X, which can take 
all values in a finite or infinite interval. The numerical value of the score obtained by throwing a 
die is a discrete random variable; the random events, or realizations, are $x_i = i$ for i = 1, 2, ..., 6. 
On the other hand, the instantaneous speed X of a molecule in a gas is a continuous random variable, 
which can take every value in an interval. Random variables are completely characterized by their 
probability, density, and distribution functions. 


Probability and distribution function of a discrete random variable. The random events $x_i$ are 
regarded as discontinuities and their probabilities $P(x_i)$ as magnitudes of the discontinuities. The 
probability function then relates the magnitudes to the discontinuities; for example, for a loaded 
die the probabilities $P(X = x_i)$ correspond to the events $x_i$ ‘score of i’. Since one of these events 
must occur, $\sum_{i=1}^{6} P(X = x_i) = 1$. 

$x_i$            1     2     3     4     5     6 
$P(X = x_i)$     1/6   1/6   1/8   1/6   1/8   1/4 

Graphically one represents the probability function by 
vertical strips, rectangles of width $e_x$ and of height 
$e_y \cdot P(X = x_i)$. For a loaded die the values $P(X = x_i)$ are 
given in the table (Fig.). If one chooses $e_x$ and $e_y$ for the 
units in the abscissa and ordinate directions, so that $e_x = e_y = 1$, 
then the sum of the areas of the rectangles is $\sum_i P(X = x_i) = 1$. 

27.2-6 Graphical representation of the probability function of a discrete random variable 

The distribution function F(x) gives the probability 
P(X < x) that the random variable X assumes only values 
$x_i < x$. Thus, for the random variable X with realizations $x_i$, 
i = 1, 2, ..., n, P(X < x) = 0 for $x \le x_1$, $P(X < x) = P(x_1)$ 
for $x_1 < x \le x_2$, $P(X < x) = P(x_1) + P(x_2)$ for $x_2 < x \le x_3$, and P(X < x) = 1 for $x > x_n$. 
Hence the distribution function F(x) increases monotonically from F(−∞) = 0 to F(+∞) = 1. 
For the given probability function of a loaded die, for example, because $x_i = i$ one obtains: 


$F(1) = P(X < 1) = 0$; $F(2) = P(X < 2) = P(X = 1) = 1/6$;
$F(3) = P(X < 3) = P(X = 1) + P(X = 2) = 1/3$;
$F(4) = P(X < 4) = \sum_{i=1}^{3} P(X = i) = 11/24$;
$F(5) = P(X < 5) = \sum_{i=1}^{4} P(X = i) = 5/8$;
$F(6) = P(X < 6) = \sum_{i=1}^{5} P(X = i) = 3/4$;
$F(x > 6) = P(X < x) = \sum_{i=1}^{6} P(X = i) = 1$.


27.2-7 Graphical representation of the distribution function F(x) of a discrete random variable


The graph of the distribution function $F(x)$ is a step function (Fig.), where the $x_i$ are taken as
the abscissae and the corresponding $F(x)$ as the ordinates in a Cartesian coordinate system.
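
The step from a probability function to its distribution function can be sketched in a few lines of Python. The individual probabilities of the loaded die used here are not quoted from a table; they are inferred from the differences of the successive values of $F$ given above, and the function name `distribution_function` is purely illustrative.

```python
from fractions import Fraction

# Probability function of the loaded die, inferred from the differences of
# the F values quoted in the text (an inference, not a table from the source).
P = {1: Fraction(1, 6), 2: Fraction(1, 6), 3: Fraction(1, 8),
     4: Fraction(1, 6), 5: Fraction(1, 8), 6: Fraction(1, 4)}

def distribution_function(x, prob=P):
    """F(x) = P(X < x): sum of the probabilities of all values x_i < x."""
    return sum((p for xi, p in prob.items() if xi < x), Fraction(0))

# F stays constant between two neighbouring values x_i and jumps at each x_i.
for x in (1, 2, 3, 4, 5, 6, 7):
    print(x, distribution_function(x))
```

Exact fractions are used so that the results can be compared directly with the values 1/6, 1/3, 11/24, 5/8 and 3/4 above.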


Density and distribution function of a continuous random variable X. For a continuous random 
variable X every value x in an interval is a random event for which the probability of its occurrence 
is zero. But to each value x there corresponds a value f(x) of the probability density or density function. 


For this function $f(x) \ge 0$ and $\int_{-\infty}^{+\infty} f(t)\,dt = 1$ (Fig.); that is, the area between the x-axis and the graph
of $f(x)$ is of magnitude 1.


27.2-8 Representation of a density function


27.2-9 Distribution function of a continuous random variable


The distribution function $F(x)$ again represents the probability $P(X < x)$ that the continuous
random variable assumes values less than x. Because $f(t) \ge 0$ it therefore follows from
$F(x) = P(X < x) = \int_{-\infty}^{x} f(t)\,dt$ that the function $F(x)$ increases monotonically from
$F(-\infty) = 0$ to $F(+\infty) = 1$ (Fig.).
The probability that the random variable X assumes a value in the interval $x_1 \le x < x_2$, given
by $P(x_1 \le X < x_2) = P(X < x_2) - P(X < x_1) = F(x_2) - F(x_1) = \int_{x_1}^{x_2} f(t)\,dt$, is represented by the
blue area in the graph of the density function. A well-known distribution of a continuous random
variable X is the normal distribution, which will be investigated later.
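
As a numerical illustration, the area under a density between two limits can be approximated by a simple quadrature rule. The sketch below assumes a Gaussian density as the example (it is treated later in this section); the function names, the trapezoidal rule and the cut-off at ±10 in place of the infinite limits are choices made only for this illustration.

```python
import math

def density(x, b=0.0, a=1.0):
    """Gaussian density with mean b and standard deviation a (illustrative choice)."""
    return math.exp(-(x - b) ** 2 / (2 * a * a)) / (a * math.sqrt(2 * math.pi))

def integrate(f, lo, hi, steps=10_000):
    """Composite trapezoidal rule for the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

def interval_probability(x1, x2):
    """P(x1 <= X < x2) = F(x2) - F(x1) = integral of the density from x1 to x2."""
    return integrate(density, x1, x2)

total_area = integrate(density, -10.0, 10.0)  # stands in for the integral over the whole axis
p = interval_probability(-1.0, 1.0)
print(round(total_area, 6), round(p, 4))
```

The first number checks that the total area is of magnitude 1 to within the accuracy of the quadrature; the second is the probability of a value within one unit of the mean.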


Mean value or expectation and variance 


A random variable is completely described by the probability function if it is discrete, and by the 
density function if it is continuous. From these functions parameters can be calculated that charac- 
terize the random variable. The most important are the mean value or expectation and the variance. 


Mean value or expectation. The mean value $\mu$ of a discrete random variable X is obtained by
multiplying each of its possible values by the corresponding probability and forming the sum of all
these products. This mean value need not occur among the values of the discrete random variable X.

If X is described by the probability function $P(X = x_i) = p_i$ $(i = 1, 2, ..., n)$, then the mean value
$\mu$ is determined by $\mu = x_1 p_1 + x_2 p_2 + \cdots + x_n p_n$.




Example 1: For an ideal die, $p_i = 1/6$, $i = 1, 2, ..., 6$; thus, the mean value is
$\mu = 1 \cdot 1/6 + 2 \cdot 1/6 + 3 \cdot 1/6 + 4 \cdot 1/6 + 5 \cdot 1/6 + 6 \cdot 1/6 = 3.5$.


For a continuous random variable X the mean value $\mu$ is calculated by multiplying x by the density
function $f(x)$ and integrating the product from $-\infty$ to $+\infty$; that is, $\mu = \int_{-\infty}^{+\infty} x f(x)\,dx$.


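Example 2: It is required to find the mean value of the continuous random variable X with the
probability density of the normal or Gaussian distribution (Fig.)

$$p(x) = \frac{1}{a\sqrt{2\pi}}\, e^{-(x-b)^2/(2a^2)}.$$

Then

$$\mu = \int_{-\infty}^{+\infty} \frac{x}{a\sqrt{2\pi}}\, e^{-(x-b)^2/(2a^2)}\,dx.$$

By means of the substitution $(x - b)/(a\sqrt{2}) = z$, $dx/(a\sqrt{2}) = dz$ and with the help of the integral
formulae $\int_{-\infty}^{+\infty} e^{-x^2}\,dx = \sqrt{\pi}$ and $\int_{-\infty}^{+\infty} x e^{-x^2}\,dx = 0$ one obtains

$$\mu = \frac{a\sqrt{2}}{\sqrt{\pi}} \int_{-\infty}^{+\infty} \left(z + \frac{b}{a\sqrt{2}}\right) e^{-z^2}\,dz = \frac{b}{\sqrt{\pi}} \int_{-\infty}^{+\infty} e^{-z^2}\,dz = b.$$


27.2-10 Mean value of a normal distribution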


Mean values of sums and products of random variables. The sum $Z = X + Y$ of the random
variables X and Y with the mean values $\mu_x$ and $\mu_y$ is also a random variable; for example, the value
of the score obtained by throwing two dice.

The mean value $\mu_z$ of the random variable Z is determined from the mean values $\mu_x$ and $\mu_y$ of the
single random variables X and Y.


The mean value of the sum of two random variables is equal to the sum of the mean values of the
two variables: $\mu_z = \mu_x + \mu_y$.


This rule can also be carried over to the mean value of the sum of three or more random variables.
For example, the random variable $Z = U + X + Y$ formed from the sum of the random variables
U, X, Y with mean values $\mu_u$, $\mu_x$, $\mu_y$ has the mean value $\mu_z = \mu_u + \mu_x + \mu_y$.


Example 3: If two ideal dice are thrown, then the random variables X and Y are the values of
the score obtained with the first die and second die, respectively. The mean value of each is known:
$\mu_x = \mu_y = 3.5$. The mean value of the total score thrown with both dice is then
$\mu_z = 3.5 + 3.5 = 7$.

Example 4: The output of a factory in a small unit of time, say in one day, can be regarded as
a random variable, because it is subject to small variations due to disturbances that cannot always
be predicted and also cannot be eliminated by technical means. In two factories A and B the number
of articles (random variables X in A and Y in B) has the mean value $\mu_x = 260$ in A and $\mu_y = 90$
in B. Then the production in both factories (random variable Z) has the mean value
$\mu_z = 260 + 90 = 350$.


Likewise for the product of two random variables X and Y a simple rule holds if X and Y are
independent, that is, if the equation

$$P(X < x, Y < y) = P(X < x) \cdot P(Y < y)$$

holds for arbitrary x and y. Here $P(X < x, Y < y)$ is the probability that X is less than x and Y
less than y. If $\mu_x$ and $\mu_y$ are the mean values of the two independent random variables X and Y,
then the mean value $\mu_z$ of the random variable $Z = X \cdot Y$ is given by $\mu_z = \mu_x \cdot \mu_y$.


The mean value of the product of two independent random variables is equal 
to the product of the mean values of the two variables. 


Just as for the sum of more than two random variables, so this rule can be carried over to the 
product of more than two independent random variables. 




Example 5: In the making of rectangular plates both the length (in inches) and the width (in
inches) are random variables (X and Y, respectively). It follows that the area is also a random
variable ($Z = X \cdot Y$). Let the mean values of X and Y be $\mu_x = 120$ in. and $\mu_y = 80$ in. If X and Y
are independent, the mean value $\mu_z$ of the area is $\mu_z = 120 \cdot 80 = 9600$ in².


Variance. In many cases the mean value does not suffice to characterize a random variable X.
In the production of bolts, for example, the diameter is an important measurement and is a random
variable X in the sense of probability theory. With the best adjustment of the machine its mean
value $\mu$ is equal to the desired value. During production, however, it is found that many diameters
are greater and many are smaller than the desired value. For the same mean value the deviations
can be large with one machine and small with another, but they must lie within the tolerance limits.

The variance $\sigma^2$ of the random variable X is used to describe them, and its square root $\sigma$ is called
the standard deviation or root mean square deviation. It is a measure of the magnitudes of the
deviations from the mean value.


The variance $\sigma^2$ of a discrete random variable X is obtained by multiplying the square of each
deviation $(x_i - \mu)$ from the mean value by the corresponding probability and adding all these
products; that is, $\sigma^2 = \sum_{i=1}^{n} (x_i - \mu)^2 p_i$.

The same probability function applies to the magnitudes $(x_i - \mu)^2$ as to X, namely

    x_i | x_1  x_2  ...  x_n
    p_i | p_1  p_2  ...  p_n        with $\sum_{i=1}^{n} p_i = 1$.

Example 6: When an ideal die is thrown, the score obtained is a random variable X with mean value
$\mu = 3.5$ and probability function

    x_i | 1    2    3    4    5    6
    p_i | 1/6  1/6  1/6  1/6  1/6  1/6

Hence the variance is given by:

$\sigma^2 = (1 - 3.5)^2 \cdot 1/6 + (2 - 3.5)^2 \cdot 1/6 + (3 - 3.5)^2 \cdot 1/6 + (4 - 3.5)^2 \cdot 1/6$
$+ (5 - 3.5)^2 \cdot 1/6 + (6 - 3.5)^2 \cdot 1/6 = 2.92$ and $\sigma = \sqrt{2.92} = 1.71$.
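
The two defining sums for the mean value and the variance of a discrete random variable translate directly into code. The following minimal sketch reproduces the figures for the ideal die; the helper names `mean` and `variance` are illustrative, and exact fractions are used so that the rounding to 2.92 and 1.71 happens only when printing.

```python
from fractions import Fraction
import math

def mean(values, probs):
    """mu = sum of x_i * p_i."""
    return sum(x * p for x, p in zip(values, probs))

def variance(values, probs):
    """sigma^2 = sum of (x_i - mu)^2 * p_i."""
    mu = mean(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))

scores = range(1, 7)
p_ideal = [Fraction(1, 6)] * 6

mu = mean(scores, p_ideal)
var = variance(scores, p_ideal)
sigma = math.sqrt(var)
print(mu, var, round(float(var), 2), round(sigma, 2))
```

The exact variance is 35/12, which rounds to the value 2.92 used in the example.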


The variance $\sigma^2$ of a continuous random variable X is obtained by multiplying the square of the
deviation $(x - \mu)$ from the mean value by the density function $f(x)$ and integrating the product from
$-\infty$ to $+\infty$; that is, $\sigma^2 = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx$.


Example 7: It is required to find the variance of the continuous random variable X with mean
value b and probability density

$$p(x) = \frac{1}{a\sqrt{2\pi}}\, e^{-(x-b)^2/(2a^2)}$$ (Gaussian or normal distribution).

The definition of the variance gives:

$$\sigma^2 = \int_{-\infty}^{+\infty} \frac{(x - b)^2}{a\sqrt{2\pi}}\, e^{-(x-b)^2/(2a^2)}\,dx.$$

Using the substitution given above one obtains the result $\sigma^2 = a^2$.


Variance of the sum of two independent random variables. If X and Y are two independent random
variables with variances $\sigma_x^2$ and $\sigma_y^2$, respectively, then $Z = X + Y$ is also a random variable (for
example, the score obtained by throwing two dice) and the variance $\sigma_z^2$ of the random variable Z
is determined by the variances of the two variables X and Y.


The variance of the sum of two independent random variables is equal to the
sum of their variances: $\sigma_z^2 = \sigma_x^2 + \sigma_y^2$.


Example 8: The score obtained by throwing two ideal dice is a random variable Z. It is the sum
of the random variables X and Y, these being the separate scores obtained with each of the two
dice at each throw. The variance of each of these is $\sigma_x^2 = \sigma_y^2 = 2.92$. Thus, the variance $\sigma_z^2$ of the
random variable Z is given by $\sigma_z^2 = 2.92 + 2.92 = 5.84$.
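
The rules for sums and products of independent random variables can be checked by enumerating all 36 equally likely outcomes of two ideal dice. This is only a sketch; the helper `expectation` and the variable names are assumptions of the illustration.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p_pair = Fraction(1, 36)          # independence: P(X = i, Y = j) = 1/6 * 1/6

def expectation(g):
    """Mean value of g(X, Y) over all 36 equally likely pairs of scores."""
    return sum(g(i, j) * p_pair for i, j in product(faces, faces))

mu_x = expectation(lambda i, j: i)
mu_sum = expectation(lambda i, j: i + j)
mu_prod = expectation(lambda i, j: i * j)
var_x = expectation(lambda i, j: (i - mu_x) ** 2)
var_sum = expectation(lambda i, j: (i + j - mu_sum) ** 2)

print(mu_sum, mu_prod, var_x, var_sum)
```

The exact variance of the sum is 35/6, about 5.83; the value 5.84 arises when the rounded single-die variance 2.92 is doubled.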


Chebyshev’s inequality 


In the preceding section it was shown that a general idea of the behaviour of a random variable
can be obtained from the mean value and the variance. However, when only these are given, it is
still not possible to state exactly the probability of a particular deviation from the mean
value $\mu$. Chebyshev's inequality gives a simple estimate for this.




One begins with a discrete or continuous random variable X with
values x, mean value $\mu$, and variance $\sigma^2$. Chebyshev's inequality,
which will not be derived here, is then as follows:


The probability that the absolute value of the difference $(X - \mu)$ is greater than or equal to an
arbitrary number $\varepsilon > 0$ is less than or equal to the variance divided by $\varepsilon^2$:
$P(|X - \mu| \ge \varepsilon) \le \sigma^2/\varepsilon^2$.

With the help of this inequality the probabilities for the different deviations from the mean value
can be estimated. If, for example, in measuring a length, a mean length of 300 yd. and a variance
of 36 yd² is established, then the probability that a deviation of 30 yd. or more occurs is estimated
as follows: $P(|X - 300| \ge 30) \le 36/900 = 0.04$; that is, the probability is at most 0.04.
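
The estimate is a one-line calculation, sketched below together with a comparison against a case in which the exact probability is known (the ideal die with mean 7/2 and variance 35/12). The function name `chebyshev_bound` and the choice of a deviation of 2 for the die are illustrative assumptions.

```python
from fractions import Fraction

def chebyshev_bound(variance, eps):
    """Upper bound sigma^2 / eps^2 for P(|X - mu| >= eps), capped at 1."""
    return min(Fraction(1), Fraction(variance) / Fraction(eps) ** 2)

# Length measurement from the text: variance 36, deviation 30.
print(float(chebyshev_bound(36, 30)))

# Comparison with an exactly known case: ideal die, mu = 7/2, sigma^2 = 35/12.
eps = 2
exact = Fraction(sum(1 for x in range(1, 7) if abs(x - Fraction(7, 2)) >= eps), 6)
bound = chebyshev_bound(Fraction(35, 12), eps)
print(exact, bound)
```

For the die the bound is far from sharp, which is typical: the inequality uses only the mean value and the variance and must hold for every distribution with these parameters.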


The law of large numbers 


In everyday life and in theoretical investigations events whose probabilities lie close to one or to
zero play an important role. One is interested, for example, that the probability for the safe transport
of passengers shall be hardly distinguishable from one, or that the probability for the collapse of
a bridge shall be practically zero. Thus, it is an essential task of probability theory to find laws
for events whose probabilities lie near to one. Among these laws, the law of large numbers is of
particular significance and will be explained in two forms.

The law according to CHEBYSHEV. Given are n pairwise independent random variables
$X_1, X_2, ..., X_n$ with mean values $\mu_1, \mu_2, ..., \mu_n$ and variances that are all less than $b^2$. Let
$A = (\mu_1 + \mu_2 + \cdots + \mu_n)/n$ denote the arithmetic mean of the mean values. From Chebyshev's
inequality it follows that $P\left(\left|\frac{1}{n} \sum_{i=1}^{n} X_i - A\right| < \varepsilon\right) > 1 - b^2/(n\varepsilon^2)$, where $\varepsilon$ is an arbitrary positive
number.

The law of large numbers according to Chebyshev. For sufficiently large values of n the arithmetic
mean A of the mean values of n pairwise independent random variables differs from the arithmetic
mean of these variables by less than $\varepsilon$ with a probability that is arbitrarily close to 1.


Chebyshev's law of large numbers justifies the rule that the mean value of n measurements is
more reliable than any single measurement.

The law according to Jakob BERNOULLI: The probability for the occurrence of an event E is p,
and in n independent trials the event E occurs $n_1$ times. Then for an arbitrary positive number $\varepsilon$,
$P(|n_1/n - p| < \varepsilon) \ge 1 - 1/(4\varepsilon^2 n)$.


The law of large numbers according to Bernoulli. For sufficiently large values of n, the relative
frequency of the occurrence of the event E in n observations differs from the probability p for the
occurrence of the event by less than $\varepsilon$, with a probability that is arbitrarily close to 1.


Bernoulli's law of large numbers is a special case of Chebyshev's law. It establishes that the relative
frequency can be used for the estimation of unknown probabilities.
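
The lower bound in Bernoulli's law is easily evaluated. The sketch below does so for $\varepsilon = 0.1$ and for the numbers of tosses 4040, 12000 and 24000 of the classical coin-tossing series quoted in this section; the function name is an illustrative assumption.

```python
def bernoulli_lower_bound(n, eps):
    """Lower bound 1 - 1/(4 eps^2 n) for P(|n_1/n - p| < eps)."""
    return 1.0 - 1.0 / (4.0 * eps * eps * n)

for n in (4040, 12000, 24000):
    print(n, round(bernoulli_lower_bound(n, 0.1), 4))
```

The bound does not depend on the unknown probability p, because $p(1 - p) \le 1/4$ for every p between 0 and 1.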


Example: In tossing a coin one distinguishes between two events, heads and tails. The results of
several long series of tosses are given below, using the law of large numbers with $\varepsilon = 0.1$. One sees
that as the number of trials increases, the probability that the relative frequency $n_1/n$ deviates
from the probability $p = 0.5$ by less than 0.1 tends to 1.

                 n        n_1 (heads)   n_1/n     1 - 1/(4ε²n)
    Buffon       4040     2048          0.507     0.9938
    K. Pearson   12000    6019          0.5016    0.9979
    K. Pearson   24000    12012         0.5005    0.9990


Some important distributions


A random variable X is completely characterized by its probability function or density function,
or by its distribution function; for short one says: by its distribution. Some types of distribution
have acquired great significance in practice.


The binomial distribution. The binomial distribution (sometimes called the Bernoulli distribution)
can be used for problems based on the following scheme of trials: There are black and white balls
in an urn, and the probability for the event E 'drawing a black ball' is p. The probability for the
complementary event $\bar{E}$, 'drawing a white ball', is then $(1 - p)$. The number k of times that the event E
occurs in a series of n draws with replacement (it then fails to occur $(n - k)$ times) is a random variable X.
Its distribution is the binomial distribution. Its probability function $P_n(k)$ can be determined as
follows: Since each ball is replaced after being drawn, every draw is an independent event, and by the
multiplication law the probability of drawing k black and $(n - k)$ white balls in a given order is
$p^k (1 - p)^{n-k}$.

There are $\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$ different permutations of the sequence in which k black and $(n - k)$
white balls can be drawn, each of which leads to the same result of the trial. Thus,
$P_n(k) = \binom{n}{k} p^k (1 - p)^{n-k}$. The probabilities that belong to the individual values of k give the probability
function. By summing over all $k < x$ one obtains the distribution function $F_n(x) = \sum_{k < x} P_n(k)$.


Example 1: A ball is drawn from an urn repeatedly and replaced into the urn. The probability of
drawing a black ball is $p = 1/4$. Each 10 such draws form a group. If the trials are continued,
the number of black balls in the individual groups will vary: it is a random variable. It is required
to find the probability function and the distribution function of this random variable, the 'number
of black balls among 10 balls'. With the help of the formula $P_{10}(k) = \binom{10}{k} \left(\frac{1}{4}\right)^k \left(\frac{3}{4}\right)^{10-k}$,
where $k = 0, 1, ..., 10$, one obtains the probability function (Fig. 27.2-11),
and from $F_{10}(x) = \sum_{k < x} P_{10}(k)$, where the summation extends over all $k < x$, the distribution
function (Fig. 27.2-12).


27.2-11 Probability function $P_{10}(k)$ of a binomial distribution

27.2-12 Representation of the distribution function $F_{10}(x)$


One obtains an impression of the probability function and the distribution function from graphical
representations. In these k and x are taken as abscissae, $P_{10}(k)$ and $F_{10}(x)$, respectively, as
ordinates.

Example 2: From the probability $p = 0.515$ of a male birth, one can calculate with the help of the
binomial distribution the probability that a family of, say, 6 children contains 0, 1, 2, 3, 4, 5
or 6 boys.


In order to simplify the tedious calculation of the probabilities $P_n(k)$, one derives the recursion
formula:

$$P_n(k + 1) = \frac{n - k}{k + 1} \cdot \frac{p}{1 - p} \cdot P_n(k).$$
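
A short sketch shows how the probabilities $P_n(k)$ can be generated one after another. It assumes the standard recursion $P_n(k+1) = P_n(k) \cdot \frac{n-k}{k+1} \cdot \frac{p}{1-p}$, which follows from the quotient of two successive binomial probabilities, starts from $P_n(0) = (1-p)^n$, and requires $0 < p < 1$. The function names are illustrative.

```python
from math import comb

def binomial_pmf(n, p):
    """P_n(k) for k = 0..n, built with the recursion
    P_n(k+1) = P_n(k) * (n - k)/(k + 1) * p/(1 - p), starting from P_n(0) = (1 - p)^n."""
    probs = [(1.0 - p) ** n]
    for k in range(n):
        probs.append(probs[-1] * (n - k) / (k + 1) * p / (1.0 - p))
    return probs

def distribution_function(probs, x):
    """F_n(x) = sum of P_n(k) over all k < x."""
    return sum(pk for k, pk in enumerate(probs) if k < x)

urn = binomial_pmf(10, 0.25)        # Example 1: 10 draws, p = 1/4
boys = binomial_pmf(6, 0.515)       # Example 2: 6 children, p = 0.515

for k, pk in enumerate(urn):
    # direct formula as a cross-check of the recursion
    direct = comb(10, k) * 0.25 ** k * 0.75 ** (10 - k)
    print(k, round(pk, 4), round(direct, 4), round(distribution_function(urn, k + 1), 4))
```

The list `boys` holds the seven probabilities for 0 to 6 boys in the same way; printing it gives the values asked for in Example 2.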
Mean value and variance of the binomial distribution. Substituting the corresponding magnitudes
in the formula for the mean value of a discrete random variable one obtains

$$\mu = \sum_{m=0}^{n} m P_n(m) = \sum_{m=0}^{n} m \binom{n}{m} p^m (1 - p)^{n-m} = np \sum_{m=1}^{n} \binom{n-1}{m-1} p^{m-1} (1 - p)^{n-m}.$$

When the binomial theorem is observed, it follows that $\mu = np\,[p + (1 - p)]^{n-1} = np$.

With the mean value np, the variance is then given by

$$\sigma^2 = \sum_{m=0}^{n} (m - np)^2 P_n(m)$$

$$= \sum_{m=0}^{n} m^2 \binom{n}{m} p^m (1 - p)^{n-m} - 2np \sum_{m=0}^{n} m \binom{n}{m} p^m (1 - p)^{n-m} + n^2 p^2 \sum_{m=0}^{n} \binom{n}{m} p^m (1 - p)^{n-m}$$

$$= \sum_{m=0}^{n} m^2 \binom{n}{m} p^m (1 - p)^{n-m} - n^2 p^2.$$

If the first sum is calculated in the same way as for the mean value, it follows finally that

$\sigma^2 = pn\,[(1 - p) + pn] - n^2 p^2 = np(1 - p)$.

For Example 1 above ($n = 10$, $p = 1/4$) the mean value and variance are given by

$\mu = 10 \cdot 1/4 = 2.5$; $\sigma^2 = 10 \cdot (1/4) \cdot (3/4) = 1.875$.
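
The closed forms $np$ and $np(1 - p)$ can be confirmed by evaluating the defining sums directly. The sketch below does this with exact fractions for 10 draws with $p = 1/4$; the function name `binomial_moments` is an illustrative assumption.

```python
from fractions import Fraction
from math import comb

def binomial_moments(n, p):
    """Mean and variance computed directly from the defining sums."""
    p = Fraction(p)
    pmf = [comb(n, m) * p ** m * (1 - p) ** (n - m) for m in range(n + 1)]
    mu = sum(m * pm for m, pm in enumerate(pmf))
    var = sum((m - mu) ** 2 * pm for m, pm in enumerate(pmf))
    return mu, var

mu, var = binomial_moments(10, Fraction(1, 4))
print(mu, var)            # compare with n*p and n*p*(1 - p)
```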


The binomial distribution is easy to use for small values of n and k. For large values, however,
the calculations become very troublesome and either the Poisson distribution or the normal
distribution is used, according to the nature of the problem.


Galton's board. A representation of the binomial distribution is possible with Galton's board.
This is an inclined pin board. The pins are driven in in such a way that the distance between every
pair of adjacent pins is divided in the ratio $p : (1 - p)$ by the pin lying above them (Fig.). Balls
are allowed to run out of a funnel through the rows of pins. In each row a ball hits a pin and can be
deflected to the right or to the left. After the balls have run through n rows, they are caught in
$(n + 1)$ compartments. The contents of the compartments show the distribution of the balls in the
form of a histogram. If N balls are allowed to roll through a Galton board with n rows of pins, then
$N \cdot P_n(m)$ balls are to be expected in the mth compartment. Various binomial distributions can be
demonstrated by various arrangements of the rows of pins.


27.2-13 Schematic representation of a Galton board
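
The behaviour of a Galton board can be imitated with a small simulation. The sketch assumes that a ball is deflected to the right with probability p at every row, independently of the other rows, and that the compartments are numbered 0 to n by the number of deflections to the right; the seed, the number of rows and the number of balls are arbitrary choices for the illustration.

```python
import random
from math import comb

def galton_board(n_rows, n_balls, p=0.5, seed=0):
    """Simulate balls falling through n_rows of pins; a ball moves one
    compartment to the right with probability p at each row."""
    rng = random.Random(seed)
    counts = [0] * (n_rows + 1)
    for _ in range(n_balls):
        m = sum(1 for _ in range(n_rows) if rng.random() < p)
        counts[m] += 1
    return counts

def expected_counts(n_rows, n_balls, p=0.5):
    """N * P_n(m) for each compartment m = 0..n."""
    return [n_balls * comb(n_rows, m) * p ** m * (1 - p) ** (n_rows - m)
            for m in range(n_rows + 1)]

observed = galton_board(n_rows=8, n_balls=5000)
expected = expected_counts(8, 5000)
for m, (o, e) in enumerate(zip(observed, expected)):
    print(m, o, round(e, 1))
```

The observed counts scatter around the expected numbers $N \cdot P_n(m)$; with more balls the relative deviations become smaller, in accordance with the law of large numbers.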


Poisson distribution. This distribution is based on essentially the same problem as the binomial
distribution. It differs only in that the number n of balls drawn from the urn is very large and the
probability p for the drawing of a black ball is very small. In other words: the Poisson distribution
is the limiting case of the binomial distribution as $n \to \infty$ and $p \to 0$, with the additional assumption
that the product $np = a$ is constant. The distribution is applied, therefore, if an event occurs very
rarely. With these assumptions, taking the limit $\lim_{n \to \infty} P_n(k)$, the probability of drawing k black balls
in n draws is given by:

$$\psi_n(k) = \frac{a^k e^{-a}}{k!},$$

where $np = a$. The Poisson distribution is determined by the quantity a alone. The probabilities
for the individual values of k give the probability function, and summing the individual probabilities
over all $k < x$ one obtains the distribution function

$$F_n(x) = \sum_{k < x} \psi_n(k).$$


Example: One ball is drawn from an urn repeatedly and replaced into the urn. The probability
of drawing a black ball is $p = 0.01$. Each 60 such draws form a group. If the trials are continued,
the number of black balls in the individual groups will vary; it is a random variable. The probability
function and the distribution function of this random variable 'number of black balls among
60 balls' are required. From $\psi_{60}(k) = (0.6)^k e^{-0.6}/k!$, where $a = 60 \cdot 0.01 = 0.6$ and k can take
the values 0, 1, 2, 3, ..., 60, one obtains the probability function $\psi_{60}(k)$,
and from $F_{60}(x) = \sum_{k < x} \psi_{60}(k)$, where the sum extends over all $k < x$, the distribution function $F_{60}(x)$.
One obtains graphical representations in the same way as for the binomial distribution.


In Fig. 27.2-14 Poisson distributions with different values of a are drawn. It shows that for
increasing values of a the peak of the distribution moves further to the right and the asymmetry
of the curve is reduced.

For the calculation of individual probabilities one uses the appropriate recursion formula obtained
from the ratio

$$\frac{\psi_n(k + 1)}{\psi_n(k)} = \frac{k!\, a^{k+1} e^{-a}}{(k + 1)!\, a^k e^{-a}} = \frac{a}{k + 1}.$$
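
The recursion $\psi(k + 1) = \psi(k) \cdot a/(k + 1)$, started at $\psi(0) = e^{-a}$, gives all Poisson probabilities without computing factorials. The sketch below applies it to the urn example with $a = 60 \cdot 0.01 = 0.6$; the function name and the cut-off at $k = 6$ for printing are illustrative choices.

```python
import math

def poisson_pmf(a, k_max):
    """psi(k) = a^k e^(-a) / k! for k = 0..k_max, via psi(k+1) = psi(k) * a/(k+1)."""
    probs = [math.exp(-a)]
    for k in range(k_max):
        probs.append(probs[-1] * a / (k + 1))
    return probs

a = 60 * 0.01                      # the urn example: n = 60, p = 0.01
psi = poisson_pmf(a, 6)
running = 0.0
for k, pk in enumerate(psi):
    running += pk                  # running sum = F(x) for k < x <= k + 1
    print(k, round(pk, 4), round(running, 4))
```

The probabilities fall off very quickly for this small value of a, so that a few terms already exhaust almost the whole probability.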


Mean value and variance of the Poisson distribution. These are calculated from the corresponding
quantities for the binomial distribution, in which np is put equal to a and p tends to 0. One obtains
$\mu = np = a$ and, because $1 - p$ tends to 1, $\sigma^2 = np = a$. Thus, the mean value and the variance
are equal.


27.2-14 Poisson distribution for different values of a


The field of application of the Poisson distribution was at one time limited to very rare events,
for example, to child suicides or to deaths from horse kicks in an army. In recent decades, however,
it has acquired considerable importance. Today it plays an important role, for example, in
telecommunications, in statistical quality control, for the description of the decay of radioactive
substances, in the textile industry, in biology, and in meteorology. In addition, the Poisson
distribution is used in many cases as an approximation for the binomial distribution, since in practice
the agreement is adequate for sufficiently large values of n and small values of p.
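
How good the approximation is can be judged by comparing the two probability functions for a fixed product $np$. The following sketch keeps $np = 0.6$ and lets n grow; the function names and the restriction of the comparison to $k \le 10$ are assumptions of the illustration.

```python
import math
from math import comb

def binomial(n, p, k):
    """P_n(k) of the binomial distribution."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson(a, k):
    """psi(k) of the Poisson distribution with parameter a."""
    return a ** k * math.exp(-a) / math.factorial(k)

def max_abs_difference(n, p, k_max=10):
    """Largest gap between P_n(k) and the Poisson value with a = n*p, for k = 0..k_max."""
    a = n * p
    return max(abs(binomial(n, p, k) - poisson(a, k))
               for k in range(min(n, k_max) + 1))

for n, p in ((6, 0.1), (60, 0.01), (600, 0.001)):
    print(n, p, max_abs_difference(n, p))
```

The largest difference shrinks roughly in proportion to p, which illustrates why the approximation is used for large n and small p.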


The normal or Gaussian distribution. The normal distribution is one of the most important
distributions of probability theory and was discovered by Gauss in connection with the application
of the method of least squares to surveying.


27.2-15 Graphical representation of the probability function for an increasing number of draws n




For the binomial distribution the probability function $P_n(k) = \binom{n}{k} p^k (1 - p)^{n-k}$ was given,
where the probability p for the event 'drawing a black ball' has a fixed value between 0 and 1. For
an increasing number n of draws (see the binomial distribution) the probability function loses its
asymmetry (Fig.). For $p = 0.4$ and $n = 5$, 15 and 30 one obtains the probability functions shown in
the figure; the values for $n = 5$ and $n = 15$ are:

    k         | 0    | 1    | 2    | 3    | 4    | 5
    P_5(k)    | 0.08 | 0.26 | 0.34 | 0.23 | 0.08 | 0.01

    k         | 2    | 3    | 4    | 5    | 6    | 7    | 8    | 9    | 10   | 11
    P_15(k)   | 0.02 | 0.06 | 0.13 | 0.19 | 0.21 | 0.18 | 0.12 | 0.06 | 0.02 | 0.01
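
The tabulated values can be recomputed from the binomial formula. The sketch below assumes $p = 0.4$, the value for which the formula reproduces the tabulated $P_5(k)$ and $P_{15}(k)$; it also prints the skewness $(1 - 2p)/\sqrt{np(1 - p)}$, a standard measure of asymmetry that is not used in the text and is added here only to make the loss of asymmetry visible as a number.

```python
from math import comb

def binomial_row(n, p):
    """All probabilities P_n(k), k = 0..n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]

def skewness(n, p):
    """Skewness (1 - 2p)/sqrt(n p (1 - p)) of the binomial distribution (standard formula)."""
    return (1 - 2 * p) / (n * p * (1 - p)) ** 0.5

p = 0.4   # assumption: the value that reproduces the tabulated P_5(k) and P_15(k)
for n in (5, 15, 30):
    row = binomial_row(n, p)
    print(n, [round(v, 2) for v in row if round(v, 2) > 0], round(skewness(n, p), 3))
```

Ordinary rounding gives 0.35 for $P_5(2)$ where the table shows 0.34; the tabulated entry was presumably adjusted so that the row sums to 1.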


As $n \to \infty$, when the number of draws n increases beyond every limit, one obtains the normal
distribution. The number of characteristic values x of the random variable X is no longer countable;
the random variable X is continuous. Its density function $p(x)$ is obtained by the limit passage
$\lim_{n \to \infty} P_n(k) = p(x) = \frac{1}{a\sqrt{2\pi}}\, e^{-(x-b)^2/(2a^2)}$, as can be shown in greater detail. The numbers
a and b are constants. For the mean value $\mu$ and the variance $\sigma^2$ one obtains

$$\mu = \int_{-\infty}^{+\infty} \frac{x}{a\sqrt{2\pi}}\, e^{-(x-b)^2/(2a^2)}\,dx = b, \qquad \sigma^2 = \int_{-\infty}^{+\infty} \frac{(x - b)^2}{a\sqrt{2\pi}}\, e^{-(x-b)^2/(2a^2)}\,dx = a^2$$

(see Mean value or expectation and variance, Examples 2 and 7).
The mean value and the variance describe this distribution completely.

Fig. 27.2-16 shows the graphical representation of the frequency distribution $p(x)$ for different
values of $a^2 = \sigma^2$. The curves are bell-shaped. The peak of each distribution lies at the mean value $\mu$.
The curve falls from this value symmetrically on both sides of it and approaches the x-axis
asymptotically. The curve has inflections at distances $\pm\sigma$ from the mean value. The influence on the form
of the bell-shaped curve of the magnitude of the variance is easy to recognize in the figure. With
increasing $\sigma$ the curves become flatter and broader.


27.2-16 Frequency distribution of the Gaussian distribution for different $\sigma$


Normalized Gaussian distribution. It is tedious to calculate individual values of the density function
$p(x)$ of a random variable X that can be described by a Gaussian distribution with mean value $\mu$
and variance $\sigma^2$. One therefore relates each Gaussian distribution to one with mean value $\mu = 0$
and variance $\sigma^2 = 1$, whose density function is $\varphi(\lambda) = \frac{1}{\sqrt{2\pi}}\, e^{-\lambda^2/2}$; this is called the normalized
Gaussian distribution, or normal distribution, for short. The density function $\varphi(\lambda)$ is tabulated, and
because of the symmetry of the curve only the values corresponding to positive characteristic values
are given in the table. The transformation from a Gaussian distribution with mean value $\mu$ and
variance $\sigma^2$ to a normalized distribution is achieved by means of the substitution $\lambda = (x - \mu)/\sigma$.
For a calculated value of $\lambda$ one looks up in the table the corresponding value of the density function
$\varphi(\lambda)$ and then finds from the relation $p(x) = \varphi(\lambda)/\sigma$ the value of the density function $p(x)$
corresponding to the characteristic value x.


Example: The mean value $\mu = 20$ and the variance $\sigma^2 = 25$ of a Gaussian distribution are
known. One calculates $\lambda = (x - 20)/5$, then looks up $\varphi(\lambda)$ in the table and finally calculates
$p(x) = \varphi(\lambda)/5$.
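
Where no table is at hand, the same two steps (normalize, then divide by $\sigma$) can be carried out numerically. In the sketch below `phi` evaluates the normalized density directly instead of looking it up; the function names and the sample values of x are illustrative.

```python
import math

def phi(lam):
    """Density of the normalized Gaussian distribution (mean 0, variance 1)."""
    return math.exp(-lam * lam / 2.0) / math.sqrt(2.0 * math.pi)

def gaussian_density(x, mu, sigma):
    """p(x) = phi(lambda) / sigma with lambda = (x - mu) / sigma."""
    lam = (x - mu) / sigma
    return phi(lam) / sigma

# Example from the text: mu = 20, sigma^2 = 25, hence sigma = 5.
for x in (20.0, 25.0, 30.0):
    print(x, round(gaussian_density(x, 20.0, 5.0), 5))
```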


Distribution function of the Gaussian distribution. The distribution function of the Gaussian
distribution

$$F(x) = \int_{-\infty}^{x} p(t)\,dt = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{x} e^{-(t-\mu)^2/(2\sigma^2)}\,dt$$

is called the Gaussian or error integral.

It represents the area under the curve $p(x)$ between $-\infty$ and x (Fig.).
The function $F(x)$ has the x-axis and the line $F(x) = 1$ as asymptotes and
has a point of inflection at $x = \mu$. It gives the probability that a characteristic
value is less than x. Bearing in mind the symmetry of the bell-shaped
curve, the distribution function $F(x)$ for $\mu = 0$ and $\sigma^2 = 1$ is tabulated in the
following form:

$$\Phi(\lambda) = \int_{0}^{\lambda} \varphi(t)\,dt = \frac{1}{\sqrt{2\pi}} \int_{0}^{\lambda} e^{-t^2/2}\,dt.$$


27.2-17 Representation of the density function and the distribution function of the Gaussian distribution

27.2-18 Geometric meaning of the function $\Phi(\lambda)$

27.2-19 Area given by $\Phi(\lambda_2) - \Phi(\lambda_1)$

27.2-20 Area given by $\Phi(\lambda_1) + \Phi(\lambda_2)$


Fig. 27.2-18 shows the area represented by this function. Each Gaussian
distribution with mean value $\mu$ and variance $\sigma^2$ is connected with $\Phi(\lambda)$ by
$\lambda = (x - \mu)/\sigma$. With the help of this relation the areas corresponding to
the characteristic values $x_1$ and $x_2$ can be calculated. There are different
cases to be considered:

(i) If $\lambda_1$ and $\lambda_2$ both lie to the right of zero and $\lambda_2 > \lambda_1$, then the area under the curve between
these values is $\Phi(\lambda_2) - \Phi(\lambda_1)$ (Fig.). A similar result holds if both values lie to the left of zero.

(ii) If $\lambda_1$ and $\lambda_2$ lie on opposite sides of zero, then the area is given by $\Phi(\lambda_1) + \Phi(\lambda_2)$ (Fig.),
where for a negative $\lambda$ the tabulated value for $|\lambda|$ is taken.

In both cases the area calculated represents the probability with which an observed value is
expected to lie in the interval bounded by $x_1$ and $x_2$.


Example: The mean value $\mu = 20$ and the variance $\sigma^2 = 25$ of a Gaussian distribution are
known. It is required to find the probability that an observed value lies between
(i) $x_1 = 25$ and $x_2 = 35$; (ii) $x_1 = 5$ and $x_2 = 35$.
One calculates
(i) $\lambda_1 = (25 - 20)/5 = 1$, $\lambda_2 = (35 - 20)/5 = 3$; (ii) $\lambda_1 = (5 - 20)/5 = -3$, $\lambda_2 = (35 - 20)/5 = 3$
and obtains from the table
(i) $\Phi(\lambda_1) = 0.3413$, $\Phi(\lambda_2) = 0.4987$; (ii) $\Phi(\lambda_1) = 0.4987$, $\Phi(\lambda_2) = 0.4987$.
Subtraction in (i) and addition in (ii) gives the required probabilities (i) 0.1574; (ii) 0.9974.
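
The tabulated function can be expressed through the error function of the Python standard library, $\Phi(\lambda) = \frac{1}{2}\,\mathrm{erf}(\lambda/\sqrt{2})$. With this signed form, in which $\Phi(-\lambda) = -\Phi(\lambda)$, the two cases of the text reduce to a single difference. The sketch below is an illustration under that convention; the function names are assumptions.

```python
import math

def Phi(lam):
    """Area under the normalized Gaussian density between 0 and lam."""
    return 0.5 * math.erf(lam / math.sqrt(2.0))

def interval_probability(x1, x2, mu, sigma):
    """P(x1 <= X < x2) for a Gaussian variable; the sign of Phi handles both cases."""
    lam1 = (x1 - mu) / sigma
    lam2 = (x2 - mu) / sigma
    return Phi(lam2) - Phi(lam1)

print(round(interval_probability(25, 35, 20, 5), 4))
print(round(interval_probability(5, 35, 20, 5), 4))
```

The results agree with the values 0.1574 and 0.9974 obtained from the table to within about 0.0001; the small difference comes from the rounding of the tabulated entries to four decimal places.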




With the help of the error integral, the proportion of the whole area under the bell-shaped curve
between the limits $x_1$ and $x_2$ can be determined. By subtraction from 1 the corresponding proportion
of the remaining area is obtained. Important partial areas that have great significance in
mathematical statistics are collected together in the accompanying table.


Limit theorems for sums of independent random variables 


Many processes in the natural sciences, technology and economics are described on the assump- 
tion that they are influenced by a large number of random factors independent of one another, 
of which each alters the course of the processes only a little. In general, only the sum of their effects 
is observed in investigating the processes; for example, the error in a measurement forms such a 
random variable which is the sum of many independent random variables. Probability theory has 
established limiting value theorems concerning the rules governing the behaviour of these sums. 


The de Moivre-Laplace integral limit theorem. If for each of n trials, p is the probability that the
event E occurs and $q = 1 - p$ the probability that E does not occur, then a random variable $X_k$
can be determined so that $X_k = 1$ if E occurs in the kth trial and $X_k = 0$ if E does not occur. The
random variable $\sum_{k=1}^{n} X_k$ then determines how often E occurred in n consecutive trials. Because of the
distribution of the terms in the sum, the probability function of $\sum_{k=1}^{n} X_k$ is a binomial distribution with
expected value $\mu = np$ and variance $\sigma^2 = np(1 - p) = npq$. By the de Moivre-Laplace theorem the
distribution function of this sum $\sum_{k=1}^{n} X_k$ does not tend to a limiting distribution function as $n \to \infty$,
but that for the random variable $\sum_{k=1}^{n} \frac{X_k - p}{\sqrt{npq}}$ does, and its limiting distribution function is the
normalized Gaussian distribution. This means that for arbitrary numbers $a < b$ the following relation
holds as $n \to \infty$:

$$P\left(a \le \sum_{k=1}^{n} \frac{X_k - p}{\sqrt{npq}} < b\right) \to \frac{1}{\sqrt{2\pi}} \int_{a}^{b} e^{-t^2/2}\,dt.$$


The de Moivre-Laplace theorem raises the question whether the relation obtained is dependent
upon the choice of method of summation, and whether this relation is still valid if fewer conditions
are placed on the distribution function of the terms of the sum. The central limit theorem gives a
partial answer in its simplest form, which can be generalized considerably.


The central limit theorem. If the independent random variables $X_1, X_2, ..., X_n$ have the
same distribution, and if $\mu = E(X_k)$ and $\sigma^2 = D^2(X_k) > 0$ exist, then the random variable

$$\frac{\sum_{k=1}^{n} X_k - E\left(\sum_{k=1}^{n} X_k\right)}{D\left(\sum_{k=1}^{n} X_k\right)} = \frac{\sum_{k=1}^{n} X_k - n\mu}{\sigma\sqrt{n}}$$

has the normalized Gaussian distribution as limiting distribution function.
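
The statement of the de Moivre-Laplace theorem can be examined numerically by comparing the exact binomial probability that the standardized sum $(S_n - np)/\sqrt{npq}$ lies between a and b with the corresponding area under the normalized Gaussian density. The sketch below does this for $p = 0.3$, $a = -1$, $b = 1$; these values, the numbers of trials and the function names are assumptions of the illustration.

```python
import math
from math import comb

def normal_area(a, b):
    """(1/sqrt(2 pi)) * integral of exp(-t^2/2) from a to b."""
    return 0.5 * (math.erf(b / math.sqrt(2.0)) - math.erf(a / math.sqrt(2.0)))

def standardized_binomial_probability(n, p, a, b):
    """P(a <= (S_n - n p)/sqrt(n p q) < b) for S_n binomially distributed."""
    q = 1.0 - p
    scale = math.sqrt(n * p * q)
    return sum(comb(n, k) * p ** k * q ** (n - k)
               for k in range(n + 1)
               if a <= (k - n * p) / scale < b)

limit = normal_area(-1.0, 1.0)
for n in (10, 100, 1000):
    print(n, round(standardized_binomial_probability(n, 0.3, -1.0, 1.0), 4), round(limit, 4))
```

Because the sum takes only integer values, the approach to the limit is not monotonic for small n, but for large n the binomial probability lies close to the Gaussian area.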


For a long time the main task of the classical side of this theory was to find the most general 
conditions under which the distribution function of sums of independent random variables would 
tend towards the normal distribution with increasing number of terms. Parallel with the conclusion 
of this classical side a further direction was developed in the theory of limit theorems for sums 
of independent random variables, that is closely bound up with the stochastic processes sketched 
later. The question in this direction is as follows: What distributions, other than the normal, can 
be limiting distributions of sums of independent random variables? — In these investigations it turned 
out that not only the normal distribution occurs as a limiting distribution. One considered the 
problem of finding conditions on the summands such that the distribution of the sum shall approach 
one or other limiting distribution for a sufficiently large number of terms. This modern aspect of 




the limit theorems for sums of independent random variables has been developed strongly in the 
last thirty years and is closely connected with the names of KOLMOGOROV, KHINCHINE, GNEDENKO 
and others. The limit theorems have practical significance, for example, in the development of 
mathematical statistics and in the theory of errors of observation. 


Stochastic processes 


Random or stochastic processes are described by random variables that depend on at least one 
parameter. Such a parameter can either assume only a discrete set of values or can vary continuously. 
The degree of wear of a car tyre, for example, depends on the number $t$ of miles it has been driven,
but according to the initial conditions it is a random function of $t$. Also in the development of the
number of inhabitants of a town over a lengthy period of time, besides the time $t$ as parameter,
systematic and random influences must be taken into account.

For the case of a parameter $t$ the stochastic process is denoted by $X(t, \omega)$. In this the parameter $\omega$
expresses the random nature, $\omega \in \Omega$, where $\Omega$ is the set of all possible events that can occur; the
parameter $t$ expresses a systematic dependence of the random variable, usually on the time, but it
could also be dependence on a sequence in the case of a numbering, or on a distance. For a fixed $t$,
for example, for a ‘snapshot’ taken at time $t = t_0$, $X(t_0, \omega)$ is a random variable. For a fixed $\omega$,
$X(t, \omega)$ is a function of $t$ that is called a realization of the process. For example, $X(t, \omega)$ can give the
number of individuals or parts, the temperatures or the velocity vectors, depending on time.

Important types of stochastic processes are Markov and stationary processes. 


Markov processes and Markov chains. A Markov process or process without after-effect is a stochastic
process in which the knowledge of the future development is completely determined by the present
state. That is, if the distribution functions of the random variables $X(t_0, \omega), X(t_1, \omega), \dots, X(t_m, \omega)$
at different points of time $t_0 < t_1 < \dots < t_m$ are known, then the distribution function of the random
variable $X(t, \omega)$ at a point of time $t > t_m$ can be calculated from that at the time $t_m$ alone. For example,
suppose that a dam contains the quantity of water $X(t_m, \omega)$ at the beginning $t_m$ of the time interval
$(t_m, t_m + \Delta t)$ and that the amount $Z(t_m, \omega)$ of water flows into the dam in this interval and the
constant amount $M$ flows out. Then the amount of water in the dam at time $t = t_m + \Delta t$ is
$X(t_m + \Delta t, \omega) = X(t_m, \omega) + Z(t_m, \omega) - M$ and the probability can be given that this amount of
water does not exceed a definite capacity $y$ of the dam. The amounts of water $Z(t, \omega)$ flowing in
are random, but are independent of one another for different points of time $t$.
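
The dam example can be sketched numerically. The text does not specify the distribution of the inflow, so the sketch below assumes, purely for illustration, an exponentially distributed inflow; all numerical values and the names `next_level`, `level_now`, `mean_inflow` are hypothetical.

```python
import random
from math import exp

def next_level(level, inflow, outflow):
    """One step of the dam recursion X(t + dt) = X(t) + Z(t) - M."""
    return level + inflow - outflow

rng = random.Random(1)   # fixed seed for a repeatable run
level_now = 40.0         # X(t_m), hypothetical present content of the dam
outflow = 5.0            # constant amount M flowing out
capacity = 45.0          # capacity y of the dam
mean_inflow = 6.0        # assumed mean of the exponential inflow Z

trials = 20000
hits = sum(
    next_level(level_now, rng.expovariate(1 / mean_inflow), outflow) <= capacity
    for _ in range(trials)
)
estimate = hits / trials

# Exact value under the exponential assumption: P(Z <= y - X(t_m) + M).
exact = 1 - exp(-(capacity - level_now + outflow) / mean_inflow)
print(round(estimate, 3), round(exact, 3))
```

Only the present content `level_now` enters the calculation, which is the Markov property described above.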

In a Markov chain the parameter $t$ runs through only a discrete set of values $t_i$ with
$i = \dots, -1, 0, +1, \dots$

Ergodic theorems are concerned with the properties of limits of Markov processes in which the 
parameter increases beyond all bounds. 

Poisson processes, a special class of Markov processes, play a role in the description of radioactive 
decay or in servicing processes, for example, in the work of a telephone exchange to calculate the 
waiting time for a call or for machine repair. 


Stationary processes. Stochastic processes in which the causes of the variations are independent 
of the time are called stationary. The local atmospheric temperature at a point in a room, for example, 
varies irregularly about a fixed mean value (Fig.) if one chooses the time interval of observation so 
small that the variations depending on the time of day can be neglected. The variation of the diameter 
of a thread drawn from a spinning jet also has a time-independent mean value. If one describes
these processes by means of the function $X(t, \omega)$, in which the variable $t$ is interpreted as the time,
then for each $t$ there exist a mean value $m$ and a variance $\sigma^2$.


27.2-21 Curve showing local atmospheric temperature variations


Important in practice are those stochastic processes that are stationary in the sense of KHINCHINE.
One postulates that their mean value $m = E(X(t, \omega))$ and variance $\sigma^2 = D^2(X(t, \omega))$ assume
constant finite values and that the correlation function $R(t - s) = E([X(t, \omega) - m][X(s, \omega) - m])$
depends only on the difference $t - s$, where $t$ and $s$ are two arbitrary points of time and $t > s$. These
stationary processes are used, for example, in electrotechnology, in information technology, in
the investigation of turbulent currents in the atmosphere, in the treatment of economic problems,
and in medicine.
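
How the correlation function $R(t - s)$ might be estimated from a single realization can be sketched as follows. The process used here, a mean value plus a two-term moving average of independent Gaussian disturbances, is a hypothetical example chosen only because its correlation function is known exactly ($R(0) = 1.25$, $R(1) = 0.5$, $R(k) = 0$ for $k \geq 2$); the function name `correlation_function` is likewise an assumption.

```python
import random
import statistics

def correlation_function(xs, lag):
    """Estimate R(lag) = E([X(t) - m][X(t - lag) - m]) from one realization."""
    m = statistics.fmean(xs)
    products = [(a - m) * (b - m) for a, b in zip(xs[lag:], xs[:len(xs) - lag])]
    return statistics.fmean(products)

rng = random.Random(2)    # fixed seed for a repeatable run
m_true = 20.0             # hypothetical constant mean value m
noise = [rng.gauss(0, 1) for _ in range(20001)]
# X(t) = m + e(t) + 0.5 e(t - 1): a simple stationary sequence
xs = [m_true + noise[t] + 0.5 * noise[t - 1] for t in range(1, len(noise))]

m_hat = statistics.fmean(xs)
r0, r1, r2 = (correlation_function(xs, lag) for lag in (0, 1, 2))
print(round(m_hat, 2), round(r0, 2), round(r1, 2), round(r2, 2))
```

The estimates depend only on the lag, not on where in the sequence the pairs are taken, which is the property postulated above.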




27.3. Statistics 


The early beginnings of statistics are to be found in census counts before and around the begin- 
ning of the first century A. D. However, only in the 18th century did it begin to develop as an in- 
dependent scientific discipline, by serving to describe the features that characterize the condition 
of a state. The concept of statistics is derived from the Latin word status, meaning condition. For 
a long time it was limited to work in this field, and only in recent decades did it depart from this 
exclusive character and, with the help of probability theory, begin to work out methods of analysing 
statistical data and proving statistical hypotheses. The methods of this mathematical statistics — or 
simply statistics, for short, in what follows — became an effective tool in natural science and tech- 
nology by revealing new laws. 


Population and sample. The population of a statistical investigation has as its elements observations 
or experiments under the same conditions. Each element can be examined with respect to different 
characteristics, which can be regarded as random variables X, Y, ... If the characteristic X under 
consideration has the distribution function F(x) in the population, then one says that the population 
has the distribution F(x) with respect to the characteristic X. In statistical investigations one always 
considers a finite subset of elements from the population. It is called a sample, and the number of
elements $n$ contained in it is called the sample size.


Example: If the weight of 10-year old boys is the random variable X, then all boys of this age 
form the population. Measurements of the weights of boys in a number of places form a sample 
and each boy is an element of the population. The weight is a characteristic of the elements. Other 
characteristics, for example, are height and chest measurement. 


Design of experiments 


For working out a problem by statistical methods a plan of experiment must be set up that in- 
cludes the method of collecting the data, the size of the sample and the method of solution of the 
problem. The more thorough the planning of the experiment, the better will be the results obtained 
by the methods of statistics. In particular, it must ensure that no measurements that are important 
for the conclusions are omitted or are incomplete. But it can also avoid achieving with a very ex- 
pensive test series only as much as could have been achieved with an insignificant proportion of the 
cost. The following points are important in this connection. 


(i) The material investigated should be homogeneous; that is, during the investigation the method 
of testing must remain the same. No changes must be made in the apparatus or conditions of produc- 
tion, and measuring instruments of different precision must not be used. 


(ii) Systematic errors or influences must be excluded as far as possible. If one wishes, for example, 
to compare two materials, one must manufacture both on the same machine, otherwise differences 
in the machines enter into the results of the investigation. In agriculture, in testing different fertilizers 
the land must be divided into parallel strips in order to equalize the influence of the type of soil 
and its position. 


(iii) A control must be provided. Either standard values exist for the characteristic under con- 
sideration, which can be compared with the results of the test, or control tests must be carried out. 
In experiments with fertilizers, for example, one has to assess the influence of a fertilizer from the 
difference between plants that grew with and without it, but otherwise under the same environmental 
conditions. 


(iv) The choice of the sample must be random or representative. A random choice is one in which 
every element has the same probability of being or not being a member of the sample. In a consign- 
ment of screws, for example, the sample to be tested must not be chosen all from one place, but 
must be distributed over the whole consignment. In measuring the thickness of wires the measured 
points must be randomly distributed over the whole length of the wire. The random choice of elements 
can be made with the help of tables of random numbers. A representative choice of sample can be 
made when the material under investigation can be uniquely subdivided into parts. It is possible 
to subdivide a consignment of screws, for example, in such a way that each part contains the product 
of only one machine. Then from every part a number of pieces proportional to the size of the part 
can be chosen at random, and together these form the sample. In this way one obtains a picture of 
the consignment on a reduced scale. 


(v) With regard to the size of the test sample, one has to consider that the bigger it is, the better 
the deductions about the population that can be made from it; but on the other hand, for reasons 
of time and effort, the size must usually be kept small, so that one has to reckon with a random 
deviation of the results. When deductions are made about a population by statistical methods, 
the size of the test sample is taken into account. 
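
The two kinds of choice described in (iv) can be sketched in a few lines. The consignment, the machine names and the function names `random_sample` and `representative_sample` are hypothetical; the standard generator of Python stands in for the tables of random numbers mentioned in the text.

```python
import random

def random_sample(elements, size, rng):
    """Random choice: every element has the same chance of being selected."""
    return rng.sample(elements, size)

def representative_sample(parts, size, rng):
    """From every part draw at random a number of pieces proportional to its size."""
    total = sum(len(items) for items in parts.values())
    sample = []
    for items in parts.values():
        quota = round(size * len(items) / total)   # rounding may shift the total slightly
        sample.extend(rng.sample(items, quota))
    return sample

rng = random.Random(4)
# Hypothetical consignment of 1000 screws produced on three machines
consignment = {
    "A": [("A", i) for i in range(600)],
    "B": [("B", i) for i in range(300)],
    "C": [("C", i) for i in range(100)],
}
all_screws = [screw for items in consignment.values() for screw in items]

simple = random_sample(all_screws, 50, rng)
representative = representative_sample(consignment, 50, rng)
print(len(simple), len(representative))
```

In the representative sample the three machines contribute 30, 15 and 5 screws, a picture of the consignment on a reduced scale.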




The collection and evaluation of material 


The set of raw values that result from the experiments will be called the original population. They 
can be collected in lists, on charts, on bordered punched cards or on data processing cards, according 
to the size of the sample and the number of characteristics for each element. In the case of a single 
characteristic or a small sample one is content with a list; for several characteristics and larger 
sample size a chart or bordered punched card is prepared for each element in order to reduce the 
work of sorting in the evaluation process. Marks are punched at places on the border according 
to a predetermined key. In the case of many characteristics and a large number of elements, data 
processing cards are preferred for recording the values obtained, since the subsequent evaluation 
process is lightened by these preparations and this type of storing. 


The preparation of the material. In order to obtain a preliminary survey of a given material one 
orders the values of the characteristics in the original population according to their magnitude 
and determines how frequently each value occurs. A frequency distribution arises in this way. Both 
the continuous and the discrete random variables that were explained in the section on probability 
theory appear as discrete variables, because the values are rounded off by the given or required 
accuracy. 

Division into classes. In the case of a large sample the range of the characteristic values is sub- 
divided into classes of equal size; in this way several values are grouped together to form a class. 
The choice of the size of the individual classes depends on the size of the sample and on the scatter $R$,
that is, the difference between the greatest and the smallest values in the sample. The number of 
classes must not be too small in order that the character of the distribution shall not be blurred. 
On the other hand, if the number of classes is too large, then abnormal values are exaggerated and 
the given distribution is difficult to recognize. A class is characterized either by its limits or by its 
mean value. The width $d$ of the class is the difference between its upper and lower limits; the class
mean $x_{m_i}$, in the case of characteristics that are described by discrete random variables, is the
arithmetic mean of the characteristic values in the particular class, and in the case of characteristics
that are described by continuous random variables, is the arithmetic mean of the upper and lower
limits of the class.


Example 1: Frequency distribution of a sample of size $n = 80$ for a characteristic that is described
by a continuous random variable; $x_i$ is the characteristic value, $h_i$ the frequency.
(i) without subdivision into classes

 x_i   h_i  |  x_i   h_i  |  x_i   h_i
 34.1   1   |  40.9   3   |  43.8   2
 35.2   1   |  41.1   2   |  43.9   3
 36.6   1   |  41.3   2   |  44.2   2
 37.2   1   |  41.4   1   |  44.3   2
 37.6   2   |  41.7   3   |  44.7   1
 37.9   1   |  41.9   3   |  44.9   1
 38.2   2   |  42.1   4   |  45.2   2
 38.8   2   |  42.2   2   |  45.3   1
 39.0   1   |  42.5   2   |  45.5   2
 39.2   1   |  42.6   2   |  45.6   2
 39.3   2   |  42.8   2   |  45.7   3
 39.4   1   |  42.9   2   |  45.8   2
 39.7   1   |  43.0   1   |  45.9   1
 40.1   2   |  43.2   1   |  47.4   1
 40.3   1   |  43.5   2   |  47.8   1
 40.7   1   |  43.6   1   |

(ii) with subdivision into classes; $x_{m_i}$ is the class mean

 Class                              x_mi   h_i
 From 33 up to and excluding 35      34      1
 From 35 up to and excluding 37      36      2
 From 37 up to and excluding 39      38      8
 From 39 up to and excluding 41      40     13
 From 41 up to and excluding 43      42     25
 From 43 up to and excluding 45      44     16
 From 45 up to and excluding 47      46     13
 From 47 up to and excluding 49      48      2
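
The subdivision into classes in Example 1 can be reproduced mechanically. This is a sketch under stated assumptions: some tally marks in table (i) are hard to read, so the frequencies below (in particular the smallest value, taken here as 34.1) are an assumed reading chosen to agree with the class totals of table (ii). The names `frequencies` and `class_frequencies` are illustrative.

```python
from collections import Counter

# characteristic value -> frequency, assumed reading of table (i)
frequencies = {
    34.1: 1, 35.2: 1, 36.6: 1, 37.2: 1, 37.6: 2, 37.9: 1, 38.2: 2, 38.8: 2,
    39.0: 1, 39.2: 1, 39.3: 2, 39.4: 1, 39.7: 1, 40.1: 2, 40.3: 1, 40.7: 1,
    40.9: 3, 41.1: 2, 41.3: 2, 41.4: 1, 41.7: 3, 41.9: 3, 42.1: 4, 42.2: 2,
    42.5: 2, 42.6: 2, 42.8: 2, 42.9: 2, 43.0: 1, 43.2: 1, 43.5: 2, 43.6: 1,
    43.8: 2, 43.9: 3, 44.2: 2, 44.3: 2, 44.7: 1, 44.9: 1, 45.2: 2, 45.3: 1,
    45.5: 2, 45.6: 2, 45.7: 3, 45.8: 2, 45.9: 1, 47.4: 1, 47.8: 1,
}

def class_frequencies(freq, lower, width):
    """Group values into classes [lower + j*width, lower + (j+1)*width)."""
    counts = Counter()
    for value, h in freq.items():
        index = int((value - lower) // width)
        counts[lower + index * width] += h   # key is the lower class limit
    return dict(sorted(counts.items()))

classes = class_frequencies(frequencies, lower=33, width=2)
scatter = round(max(frequencies) - min(frequencies), 1)   # R = x_max - x_min
print(classes)
print(scatter)
```

Each key of `classes` is the lower limit of a class of width 2, so the class mean is the key plus 1.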




Graphical representation of a frequency distribution. After preparing the material it is advisable
to give a graphical representation of the empirical frequency distribution (Fig.). This can be done
in different ways according to the purpose of the investigation and the characteristic considered,
as can be seen from the examples.


27.3-1 Representation of a distribution by a line diagram


Mean value and variance of the sample. A sample of size $n$ can be characterized by the values of
the mean $\bar{x}$ and the variance $s^2$, which are considered as estimates of the corresponding values
$\mu$ and $\sigma^2$ for the population.

Mean value. The mean value, the arithmetic mean $\bar{x}$, is given by

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,$$

where $x_i$ $(i = 1, 2, \dots, n)$ are the individual measured characteristic values. In frequency
distributions the mean value is calculated by

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{k} h_i x_i,$$

where $h_i$ are the frequencies, $x_i$ the characteristic values (or $x_{m_i}$ the class means) and $k$ the
number of characteristic values (or the number of classes). Besides the arithmetic mean $\bar{x}$, the
median value $\tilde{x}$ is also used in practice as a mean value. For odd values of $n$ it is the characteristic
value that stands in the $(n + 1)/2$th place in the series of values arranged in order of magnitude.
For even values of $n$ the median $\tilde{x}$ is the arithmetic mean of the characteristic values in the
$n/2$th and the $(n/2 + 1)$th places.

Variance. For the $n$ individual values $x_i$ $(i = 1, 2, \dots, n)$ of a sample the variance $s^2$ is given by

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right];$$

$s$ is called the mean square deviation or standard deviation. For a given frequency distribution
with $k$ characteristic values $x_i$ (or $k$ classes with class means $x_{m_i}$) and frequencies $h_i$ the variance
$s^2$ is calculated as

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{k} h_i (x_i - \bar{x})^2 = \frac{1}{n-1}\left[\sum_{i=1}^{k} h_i x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{k} h_i x_i\right)^2\right].$$


27.3-2 Representation of a distribution by a block diagram

Besides the variance $s^2$ another quantity is used to characterize the range over which the
characteristic values extend. This is the scatter or range of variation $R$, which is the difference between the
greatest value $x_{\max}$ and the smallest value $x_{\min}$ of the characteristic values: $R = x_{\max} - x_{\min}$.
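
The formulas for the mean value, the median and the variance can be sketched as follows, using the class means and frequencies of Example 1 (ii). The divisor $n - 1$ in the variance is an assumption of this sketch; it is chosen so that $s^2$ is an unbiassed estimate of $\sigma^2$. The function names are illustrative.

```python
def mean_and_variance(values, freqs):
    """Mean and variance of a frequency distribution (divisor n - 1 assumed)."""
    n = sum(freqs)
    total = sum(h * x for x, h in zip(values, freqs))
    mean = total / n
    # definition: sum of h_i (x_i - mean)^2
    var_definition = sum(h * (x - mean) ** 2 for x, h in zip(values, freqs)) / (n - 1)
    # computational form: sum of h_i x_i^2 minus (sum of h_i x_i)^2 / n
    var_shortcut = (sum(h * x * x for x, h in zip(values, freqs)) - total ** 2 / n) / (n - 1)
    return n, mean, var_definition, var_shortcut

def median(values):
    """Median by the rule of places in the ordered series."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]                  # the (n+1)/2-th place
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2    # places n/2 and n/2 + 1

class_means = [34, 36, 38, 40, 42, 44, 46, 48]
class_freqs = [1, 2, 8, 13, 25, 16, 13, 2]

n, mean, var_definition, var_shortcut = mean_and_variance(class_means, class_freqs)
print(n, round(mean, 3), round(var_definition, 3), round(var_shortcut, 3))
```

Both forms of the variance formula give the same value, which is a useful check on a hand calculation.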

Example 2: For the frequency distribution of size $n = 80$ given above:

 Subdivision into classes | Mean value | Variance | Range of variation
 without                  |            |          |
 with                     |            |          |

The discrepancies in the mean values and variance are due to the subdivision into classes for a
comparatively small sample. For increasing $n$ they become more and more nearly equal.


Normal distribution. Anthropological measurements of Lambert QUETELET (1796-1874) gave rise
to the assumption that as far as their frequencies are concerned, all biological measurements follow
a Gaussian distribution. For this reason it was called a normal distribution and the methods of
statistics were built on this assumption. The basic features of the normal distribution were presented
in the section on probability theory. For the sake of completeness a few examples that are important
in statistical practice will be introduced in the following.

Because the normal distribution is determined by its mean value and variance alone, it can be
calculated from the mean value $\bar{x}$ and variance $s^2$ of a sample. In this way it is possible to
decide whether a particular characteristic is based on such a distribution.

(i) If the sample consists of $n$ characteristic values that are subdivided into $k$ classes of width $d$
with class means $x_{m_i}$, then one calculates for each class the number $\lambda_i = (x_{m_i} - \bar{x})/s$, in order
to be able to use the normal distribution table. With the help of the values $\varphi(\lambda_i)$ looked up in the
table, one obtains the relative frequency $q_i = (d/s)\,\varphi(\lambda_i)$ for the $i$th class and the absolute
frequencies $k_i = n q_i$ $(i = 1, 2, \dots, k)$.


27.3-3 Example of a frequency distribution with the normal distribution drawn in


Example 3: Calculation of the normal distribution for the above sample (see Example 1):

 x_mi    k_i
  34     0.4
  36     2.1
  38     7.5
  40    16.4
  42    22.0
  44    18.3
  46     9.3
  48     3.0
 Sum    79.0


27.3-4 Cumulative percentage curve of a frequency distribution (ordinate: cumulative frequency in %)


(ii) If one is content with the graphical representation of the normal distribution, one uses the
following rule of thumb: with the help of the formula $q = (d/s)\,\varphi(\lambda)$, as given in (i), the peak
$y_{\max}$ of the normal distribution for $\lambda = 0$ is calculated (Fig.).
Additional ordinates can be found by the following rule:


If one wants to have the representation in absolute frequencies, every value is multiplied by n. 
In the figure the frequency distribution of the above example is shown with that of the normal 
distribution calculated according to this rule. 

(iii) It can also be tested with the help of probability paper whether the distribution of a
characteristic under investigation fits a normal distribution and in addition what the mean value $\bar{x}$ and
standard deviation $s$ are. The probability net is coordinate paper whose ordinate scale is constructed
so that the cumulative percentage curve of the normal distribution is a straight line (Fig.).
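
The calculation described in (i) can be sketched as follows for the classes of Example 1. Instead of a table lookup the density $\varphi$ is evaluated directly, and the mean value and standard deviation are computed from the class means with divisor $n - 1$; both choices are assumptions of this sketch, so the results may differ from the tabulated $k_i$ in the last digit. The names `phi` and `expected_class_frequencies` are illustrative.

```python
from math import exp, pi, sqrt

def phi(lam):
    """Density of the normalized Gaussian distribution."""
    return exp(-lam * lam / 2) / sqrt(2 * pi)

def expected_class_frequencies(class_means, n, d, mean, s):
    """k_i = n * q_i with q_i = (d / s) * phi(lambda_i), lambda_i = (x_mi - mean) / s."""
    return [n * (d / s) * phi((xm - mean) / s) for xm in class_means]

class_means = [34, 36, 38, 40, 42, 44, 46, 48]
mean = 42.225                 # mean value of the classified sample
s = sqrt(655.95 / 79)         # standard deviation of the classified sample

k = expected_class_frequencies(class_means, n=80, d=2, mean=mean, s=s)
for xm, ki in zip(class_means, k):
    print(xm, round(ki, 1))
```

Multiplying $q_i$ by $n$ turns the relative frequencies into absolute ones, so the results can be compared directly with the observed class frequencies.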




Regression and correlation 


A large and important field in statistics is the analysis of regression and correlation. They are 
concerned with exposing and describing the dependence of two or more characteristics (random 
variables). Whilst the analysis of regression is concerned with the nature of the correspondence 
between the characteristics, it is the task of the analysis of correlation to determine the degree of 
this dependence. Only the basic concepts for the case of linear dependence of two characteristics 
(random variables X and Y) can be described here. 


Regression. In an investigation in a school the height (random variable X) and weight (random 
variable Y) of the children is measured. It is required to determine whether on average a greater 
weight corresponds to a greater height, whether the correspondence is linear and what average 
weight corresponds to a given height. In this case the measurements show that the answer to the 
first question is in the affirmative. However, it is not possible to answer the other two questions 
without further investigation. The following regression calculation serves to provide an answer. 

The individual pairs of values $(x, y)$, where $x$ is the height and $y$ the weight of a child, are
plotted as points with coordinates $(x, y)$ relative to a pair of rectangular axes. The points plotted
in this way form a point set that either has no particular form or more or less fits a curve. If the
point set approximates to a line (only this case will be considered), then the relation between
the random variables $X$ and $Y$ is described by two lines, the regression lines. For the dependence
of the weight $y$ on the height $x$ there exists a line of regression $Y = a_1 + b_1 x$, where the unknown
coefficients $a_1$ and $b_1$ are calculated by means of Gauss's method of least squares. For a sample of $n$
pairs of values $(x_i, y_i)$ $(i = 1, 2, \dots, n)$ one requires that

$$\sum_{i=1}^{n} (y_i - Y_i)^2 = \sum_{i=1}^{n} \bigl(y_i - (a_1 + b_1 x_i)\bigr)^2$$

shall be a minimum. This leads to $a_1 = \bar{y} - b_1 \bar{x}$, where

$$b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2}.$$


27.3-5 Point set and regression lines for the example (here the intersection of the axes is not the
zero point of the coordinate system)


The $\bar{x}$ and $\bar{y}$ in these formulae are the mean values of the numbers $x_i$ and $y_i$, respectively. The
number $b_1$ is called a regression coefficient and refers to the dependence of the weight $y$ of a child
on his height $x$. It states that the weight is altered on average by $b_1$ when the height increases by
a unit. The regression line giving the dependence of the weight of a child on his height can be
introduced in this way (see the example and Fig.).

If one now asks the question (less meaningful in this particular example) what average height 
corresponds to a particular weight, then one cannot use the equation given above, but must begin 
with the other regression line $X = a_2 + b_2 y$ and again calculate the unknown coefficients $a_2$ and
$b_2$ by the method of least squares. This gives $a_2 = \bar{x} - b_2 \bar{y}$, where

$$b_2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2} = \frac{\sum_{i=1}^{n} x_i y_i - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{\sum_{i=1}^{n} y_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} y_i\right)^2}.$$

$b_2$ is also called a regression coefficient and refers to the dependence of the height $x$ of a child on
his weight $y$. This coefficient states that the height is altered on average by $b_2$ when the weight
increases by a unit. If this regression line is likewise introduced, then one observes that the two
lines intersect in the centre of gravity $(\bar{x}, \bar{y})$ of the point set and make a scissor form. The more
nearly closed these scissors are, the closer is the stochastic dependence between the random variables
$X$ and $Y$. They close up completely if a strictly linear, hence a functional relation exists.
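
The two regression lines can be computed directly from the least-squares formulas. The measurements below are hypothetical and are not the data of the example in the text; the function name `regression_lines` is also an assumption.

```python
def regression_lines(xs, ys):
    """Return (a1, b1) for Y = a1 + b1*x and (a2, b2) for X = a2 + b2*y."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # sums of products and squares in the computational form
    sxy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n
    sxx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    syy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    b1 = sxy / sxx
    b2 = sxy / syy
    return (y_bar - b1 * x_bar, b1), (x_bar - b2 * y_bar, b2)

# Hypothetical heights (cm) and weights (kg) of 10 children
heights = [128, 131, 134, 136, 139, 141, 143, 145, 148, 150]
weights = [26.5, 28.0, 29.5, 31.0, 32.0, 33.5, 35.0, 36.0, 38.5, 40.0]

(a1, b1), (a2, b2) = regression_lines(heights, weights)
print(round(a1, 2), round(b1, 3))   # weight as a function of height
print(round(a2, 2), round(b2, 3))   # height as a function of weight
```

Both lines pass through the centre of gravity of the point set, and the product of the two regression coefficients is close to 1 when the scissors are nearly closed.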


Example: The height $x$ (in cm) and the weight $y$ (in kg wt.) of 10 school children were measured.
Fig. 27.3-5 shows the point set with the two regression lines; all the necessary calculations are
contained in the table following:

 i      (y_i - ȳ)²
 1       10.9561
 2        6.7081
 3        3.5721
 4        0.2601
 5        0.9801
 6        0.0961
 7       29.2681
 8       16.7281
 9       32.6041
 10      32.3761
 Sum    133.5490

$\sum (x_i - \bar{x})(y_i - \bar{y}) = 136.0600$

$\bar{x} = 139.4$, $\bar{y} = 32.61$,
$b_1 = 0.746$, $b_2 = 1.019$,
$a_1 = -71.38$, $a_2 = 106.2$,
$Y = -71.38 + 0.746x$,
$X = 106.2 + 1.019y$.


Correlation. The degree of the dependence, an impression of which is given by the regression
lines, is measured quantitatively by the correlation coefficient $r_{xy}$.


This correlation coefficient is independent of the units of the characteristics and can take all
values between $-1$ and $+1$. If $r_{xy} = +1$ or $-1$, respectively, the relation is directly or indirectly
linear. If $r_{xy} = 0$, then there is no linear relation. In the example above, $r_{xy} = +0.87$.

The correlation coefficient $r_{xy}$ and the regression coefficients $b_1$ and $b_2$ are connected by the relation
$r_{xy}^2 = b_1 b_2$.
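
The relation between the correlation coefficient and the two regression coefficients can be checked numerically. The sketch assumes the usual product-moment definition of the correlation coefficient; the small data set and the function name `correlation` are hypothetical, while the values 0.746 and 1.019 are the regression coefficients of the example.

```python
from math import copysign, sqrt

def correlation(xs, ys):
    """Return (r, b1, b2) using the product-moment definition of r (an assumption)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy), sxy / sxx, sxy / syy

# small hypothetical data set
r, b1, b2 = correlation([1, 2, 3, 4], [1, 3, 2, 5])
print(round(r * r, 6), round(b1 * b2, 6))   # the two numbers agree

# r for the example, recovered from its regression coefficients;
# the sign of r is the common sign of b1 and b2
r_example = copysign(sqrt(0.746 * 1.019), 0.746)
print(round(r_example, 2))
```

The second calculation recovers the value of the correlation coefficient quoted for the example from the two regression coefficients alone.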


Methods of statistical estimation 


It is often possible to draw conclusions about one or more characteristics of a population from 
the values of a sample. This is characterized by a random variable. If the analytical form of the 
corresponding distribution function is known, then the values of the parameters contained in it 
must be estimated. For such an estimation many possibilities are available, for example, the median 
or the arithmetic mean for the expectation of a random variable. In 1930 R. A. FISHER therefore
drew up criteria for the goodness of an estimate; he demanded that it should be unbiassed, consistent, 
and effective. 

For an unbiassed estimate $\hat\Theta$ of an unknown parameter $\Theta$, the expectation of $\hat\Theta$ should agree with $\Theta$;
for example, the arithmetic mean $\bar{x}$ or the variance $s^2$ of the sample is an unbiassed estimate of the
expectation $\mu$ or variance $\sigma^2$, respectively, of the random variable characterizing the population.
A consistent estimate $\hat\Theta$ of an unknown parameter $\Theta$ should differ from $\Theta$ by an ever smaller amount
with increasing probability for increasing sample size; that is, for arbitrary $\varepsilon > 0$,
$P(|\hat\Theta - \Theta| < \varepsilon) \to 1$ as the sample size increases. For example, the arithmetic mean $\bar{x}$ of a sample
is a consistent estimate of the expectation $\mu$ of the random variable characterizing the population.
For an effective estimate $\hat\Theta$ of the parameter $\Theta$ the variance of the random variable $\hat\Theta$ should be
small compared with the variance of other possible estimates; for example, the arithmetic mean $\bar{x}$ is
an effective estimate compared with the median $\tilde{x}$, since the variance of the random variable $\bar{x}$ is
smaller than that of the random variable $\tilde{x}$.
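
The comparison of the arithmetic mean and the median as estimates can be illustrated by simulation. This is a sketch assuming a normalized Gaussian population; the sample size, the number of repetitions and the seed are arbitrary illustrative choices.

```python
import random
import statistics

rng = random.Random(3)    # fixed seed for a repeatable run
n, trials = 25, 3000      # illustrative sample size and number of samples

means, medians = [], []
for _ in range(trials):
    sample = [rng.gauss(0, 1) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.pvariance(means)      # should be near sigma^2 / n = 0.04
var_median = statistics.pvariance(medians)  # noticeably larger for normal data
print(round(var_mean, 4), round(var_median, 4))
```

Both estimates scatter about the true expectation 0, but the means scatter less, which is what is meant by calling the arithmetic mean the more effective estimate.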

The estimate of a parameter can be either a point or an interval estimate. In a point estimation 
the true value of the parameter of the random variable is taken to be equal to the estimate of the 
value obtained from a sample. It will, however, agree with the true value with only a small probability ; 
thus, little is known about the accuracy of the estimate. One therefore seeks by interval estimation 
to determine an interval $(\hat\Theta - \delta, \hat\Theta + \delta)$ containing the estimate $\hat\Theta$, in such a way that this includes
the unknown parameter with probability $(1 - \alpha)$. The number $(1 - \alpha)$ is called the confidence
coefficient and $\alpha$ is a number such that $0 < \alpha < 1$, from which the interval width $2\delta$ can be
calculated.

The most useful method of point estimation for a parameter is the method of maximum likelihood.
For the normal distribution this was already developed by Gauss. The name, the justification and
the further development of the method, however, go back to FISHER. Its principle consists in
choosing the estimate $\hat\Theta$ of a parameter $\Theta$ in such a way that the likelihood function for the given
sample has a maximum. This likelihood function shall be given for a continuous random variable $X$
with known probability density $f(x; \Theta)$, where the parameter $\Theta$ is to be estimated from a test sample
of $n$ independent values $x_1, x_2, \dots, x_n$.

One considers the likelihood function $L(x_1, x_2, \dots, x_n; \Theta) = f(x_1; \Theta)\, f(x_2; \Theta) \cdots f(x_n; \Theta)$ as a
function of the unknown parameter $\Theta$ and chooses as an estimate $\hat\Theta$ for it that value for which the
function $L$ assumes a maximum. Thus, one determines $\hat\Theta$ as a solution of the equation
$\frac{dL}{d\Theta} = 0$. In practical calculations the equation is replaced by
$\frac{d(\ln L)}{d\Theta} = \frac{1}{L}\frac{dL}{d\Theta} = 0$. If the density
$f(x; \Theta_1, \Theta_2)$ of the continuous random variable $X$ depends on two parameters $\Theta_1$ and $\Theta_2$, then the
estimates $\hat\Theta_1$ and $\hat\Theta_2$ for them are given as solutions of the system of equations
$\frac{\partial(\ln L)}{\partial\Theta_i} = \frac{1}{L}\frac{\partial L}{\partial\Theta_i} = 0$ for $i = 1, 2$.


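Example: The parameters $\Theta_1 = \mu$ and $\Theta_2 = \sigma^2$ of a normally distributed random variable $X$
can be estimated from a sample with the values $x_1, x_2, \dots, x_n$. From the likelihood function

$$L(x_1, x_2, \dots, x_n; \mu, \sigma^2) = \left[\frac{1}{2\pi\sigma^2}\right]^{n/2} \exp\left[-\frac{1}{2\sigma^2}\sum_{k=1}^{n}(x_k - \mu)^2\right]$$

one obtains $\ln L = -\frac{n}{2}\ln 2\pi - \frac{n}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{k=1}^{n}(x_k - \mu)^2$ and hence

$$\frac{\partial \ln L}{\partial\mu} = \frac{1}{\sigma^2}\sum_{k=1}^{n}(x_k - \mu) = 0 \quad\text{and}\quad \frac{\partial \ln L}{\partial\sigma^2} = -\frac{n}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{k=1}^{n}(x_k - \mu)^2 = 0.$$

From these one obtains the estimates $\hat\mu = \frac{1}{n}\sum_{k=1}^{n} x_k = \bar{x}$ and $\hat\sigma^2 = \frac{1}{n}\sum_{k=1}^{n}(x_k - \bar{x})^2$.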
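
That these estimates do maximize the likelihood can be checked numerically. The sample values and the names `log_likelihood` and `ml_estimates` are hypothetical; the sketch only compares the logarithm of the likelihood at the estimates with its value at a few neighbouring parameter values.

```python
from math import log, pi

def log_likelihood(xs, mu, var):
    """ln L for a normal distribution with expectation mu and variance var."""
    n = len(xs)
    return (-(n / 2) * log(2 * pi) - (n / 2) * log(var)
            - sum((x - mu) ** 2 for x in xs) / (2 * var))

def ml_estimates(xs):
    """Closed-form maximum likelihood estimates (note the divisor n, not n - 1)."""
    n = len(xs)
    mu_hat = sum(xs) / n
    var_hat = sum((x - mu_hat) ** 2 for x in xs) / n
    return mu_hat, var_hat

xs = [4.1, 5.3, 4.8, 5.9, 5.0, 4.4]    # hypothetical sample values
mu_hat, var_hat = ml_estimates(xs)
best = log_likelihood(xs, mu_hat, var_hat)

# ln L at slightly shifted parameter values never exceeds the value at the estimates
neighbours = [(mu_hat + dm, var_hat + dv)
              for dm in (-0.1, 0.0, 0.1) for dv in (-0.05, 0.0, 0.05)]
print(all(log_likelihood(xs, m, v) <= best + 1e-12 for m, v in neighbours))
```

The variance estimate obtained in this way uses the divisor $n$ and is therefore slightly smaller than the sample variance $s^2$ of the same values.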

Finally, the procedure for interval estimation will be given for a simple case in which $X$ is a normally
distributed random variable of whose parameters $\mu$ and $\sigma^2$ only $\sigma^2$ is known. For $\mu$ a confidence
estimate is to be given from a sample with values $x_1, x_2, \dots, x_n$. The arithmetic mean
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is chosen as the estimate. This is known to be normally distributed with
parameters $\mu$ and $\sigma^2/n$. For each $\alpha$ with $0 < \alpha < 1$, a $\lambda_\alpha$ can be determined from the table for the
normal distribution in such a way that $P(|\bar{x} - \mu| < \lambda_\alpha\sigma/\sqrt{n}) = 1 - \alpha$ or
$P(\bar{x} - \lambda_\alpha\sigma/\sqrt{n} < \mu < \bar{x} + \lambda_\alpha\sigma/\sqrt{n}) = 1 - \alpha$.
The confidence interval $(\bar{x} - \lambda_\alpha\sigma/\sqrt{n},\ \bar{x} + \lambda_\alpha\sigma/\sqrt{n})$ is a confidence estimate of $\mu$ with confidence
coefficient $(1 - \alpha)$.


Example: If $X$ is a normally distributed random variable, then one looks up the value $\lambda_\alpha = 1.96$
for $\alpha = 0.05$ and confidence coefficient $1 - \alpha = 0.95$ in the table for the normal distribution.
For a sample of size $n = 16$ and with standard deviation $\sigma = 1.5$ one obtains for the parameter
$\mu$ the confidence interval $P(\bar{x} - 1.96 \cdot 1.5/4 < \mu < \bar{x} + 1.96 \cdot 1.5/4) = 0.95$; $\bar{x}$ is the estimate
obtained from the sample. The parameter $\mu$ lies in the interval $(\bar{x} - 0.74,\ \bar{x} + 0.74)$ with a
probability of 0.95.
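
The confidence interval of the example can be reproduced with the standard library, which takes the place of the table for the normal distribution. The sample mean 10.0 is a hypothetical value, since the example leaves $\bar{x}$ unspecified; the function name `confidence_interval` is also an assumption.

```python
from math import sqrt
from statistics import NormalDist

def confidence_interval(mean, sigma, n, alpha):
    """Two-sided confidence interval for mu when sigma is known."""
    lam = NormalDist().inv_cdf(1 - alpha / 2)   # lambda_alpha, about 1.96 for alpha = 0.05
    half_width = lam * sigma / sqrt(n)
    return mean - half_width, mean + half_width

x_bar = 10.0    # hypothetical sample mean
low, high = confidence_interval(x_bar, sigma=1.5, n=16, alpha=0.05)
half_width = (high - low) / 2
print(round(low, 3), round(high, 3))
```

The half-width is about 0.735, which corresponds to the rounded value 0.74 quoted in the example.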


Statistical testing procedures 


In many statistical problems it is not enough to describe the available material by means of a 
frequency distribution or by numerical values, for example, if questions of the following kind are 
to be answered: 

(i) In one neighbourhood 10-year old boys were found to have a greater mean weight than is 
usual. Is this deviation only a random one, or can the difference be due to other causes?

(ii) In feeding experiments a number of rats were given a standard feed and others were fed with 
a food to be tested. If at the end of the series of tests a difference of mean weight of the two groups 
is found to exist, then it is required to establish whether the food under test really causes greater 
increase in weight. 

In these questions one wants to know whether the apparent deviations are of a random or a
significant nature. This is decided by test procedures, which all depend on a comparison; for example,
either the corresponding measured values for two samples are compared with one another, or the
values for one sample are compared with the corresponding known quantities for the population.
In the test procedures one starts with the assumption or hypothesis in the first case that both samples
examined belong to the same population, and in the second case that the sample belongs to the
particular population considered, that is, that in both cases the differences are only random. This
hypothesis is called the null hypothesis ($H_0$). Correspondingly the other possibility is called the
alternative hypothesis (A).

With the help of the test procedure one decides whether to accept or reject the null hypothesis. 
In doing so one must bear in mind that the acceptance of the null hypothesis is nothing more than 
preferring this to the alternative hypothesis. One does not claim that this decision is correct in every 
case, because it rests on a sample of size $n$ and an error is possible. Accordingly one makes the
decision with an error probability $\alpha$, which is generally chosen to be 0.05, 0.01 or 0.001. The error
that occurs when the null hypothesis is rejected, although it is, in fact, correct, is called an error 
of the first kind. The other possible false decision, that of accepting the null hypothesis although 
it is false, is called an error of the second kind. For example, if as a result of a comparison one has 
concluded that a new medicine is better than an old one, although in reality they are both of equal 
value, then one has made an error of the first kind. If, however, one has concluded that both are 
equally good, although the new one is actually better, then one has made an error of the second 
kind. 


Test distributions. To test the null hypothesis one uses test variables that are random variables
and consequently can be described by means of a distribution. For the tests given in what follows
the test variables are either normally distributed or are based on other distributions, the $t$-, the $F$-
and the $\chi^2$-distributions. Tables of these distributions are given in the worked examples.


Normal distribution. If the test variables are based on a normal distribution, then the error
probability α has an intuitive meaning, if one considers the tables of the cumulative and remainder
areas given by the error integral. In these the remainder area α, which corresponds to the error
probability, is given as a percentage. To an error probability there corresponds a λ = |(x − μ)/σ|;
in general, μ and σ may be assumed to be known and only to be estimated by x̄ and s for large samples.
An error probability of α = 0.05 or 5%, for example, corresponds to a λ_α = 1.96; thus, the null
hypothesis is rejected if the value of λ calculated from the sample satisfies λ > λ_α = 1.96, and
accepted if λ < λ_α = 1.96.
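
The correspondence between an error probability and its critical value can be checked numerically. The following is a minimal sketch in Python, assuming a two-sided test on a standardized normal variable; the function names critical_lambda and reject_null are illustrative and do not come from the text.

```python
from statistics import NormalDist


def critical_lambda(alpha: float) -> float:
    """Two-sided critical value of the standard normal distribution.

    The remainder area alpha is split equally between both tails.
    """
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def reject_null(x: float, mu: float, sigma: float, alpha: float = 0.05) -> bool:
    """Reject the null hypothesis if |(x - mu) / sigma| exceeds the critical value."""
    lam = abs((x - mu) / sigma)
    return lam > critical_lambda(alpha)


for alpha in (0.05, 0.01, 0.001):
    print(alpha, round(critical_lambda(alpha), 2))
```

For an error probability of 0.05 the sketch reproduces the value 1.96 quoted above.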

t-distribution. The procedure described for the normal distribution is no longer possible if μ and σ
are unknown and have to be estimated from x̄ and s for a sample of small size n. In this case s
can deviate considerably from σ and does not then serve as a good estimate. It can be used only
if the error probability belonging to λ is correspondingly increased. This occurs in STUDENT's
t-distribution, which takes into account the size of the sample as well as the error probability α. For
increasing values of n it becomes ever closer to the normal distribution and goes over to it in the
limit as n → ∞. The t-distribution is tabulated for different values of the error probability and of
the degrees of freedom f, in place of the sample size. The degree of freedom is defined as the
difference between the sample size and the number m of characteristic measurements used in the
calculation; f = n − m. In each of the following tests the degree of freedom is given. In Fig. 27.3-6
the normal distribution and the t-distribution for f = 5 degrees of freedom and an error probability
α = 0.05 are shown.


27.3-6 Normal distribution and t-distribution (f = 5) with the regions for the error probability α = 0.05

27.3-7 F-distribution and error probability α


F-distribution. If one selects two samples of size n₁ and n₂ from a normally distributed population,
calculates the two variances s₁² and s₂² and forms the ratio F = s₁²/s₂², then the resulting frequency
distribution of these values is a distribution which was investigated by FISHER (1890-1962) and
is called the F-distribution. It depends on the error probability α and on the degrees of freedom
f₁ = n₁ − 1 and f₂ = n₂ − 1 and is tabulated for different error probabilities and degrees of freedom.
As the ratio of two squares, F assumes only positive values. Fig. 27.3-7 shows an F-distribution
from which the meaning of the error probability can be read off.

χ²-distribution. In connection with Gauss' theory of errors HELMERT (1843-1917) investigated
the sum of squares of variables that are normally distributed. The distribution obtained in this way
was later called by Karl PEARSON (1857-1936) the χ²-distribution. This is based on the following
assumptions: X₁, ..., Xₙ are n random variables that are mutually independent and based on the
same normal distribution with parameters μ and σ². The distribution of the sum of squares

$$\chi^2 = \frac{1}{\sigma^2} \sum_{k=1}^{n} (x_k - \mu)^2,$$

where x₁, ..., xₙ are values of the random variables X₁, ..., Xₙ, is called the χ²-distribution.
It depends on the error probability α and the degree of freedom f and is tabulated for these values.
Fig. 27.3-8 shows a χ²-distribution and the meaning of the error probability.


27.3-8 χ²-distribution and error probability α for f = 3


Procedures for the testing of hypotheses. In the following some test procedures for frequently 
recurring problems are collected together. The choice of error probability depends on the nature 
of the problem and is determined accordingly. In industry and in agriculture, in general, an error 
probability of 0.05 is customary, and in medicine one of 0.01 or 0.001. 


1. Comparison of the mean values x̄ and μ. The mean value x̄ of a sample of size n and variance
s² taken from a normally distributed population is to be compared with the mean value μ of a normally
distributed population. The null hypothesis, that the difference between the two is only random,
is to be accepted with an error probability α if the calculated value t_s of the test variable
t = (|x̄ − μ|/s) · √n is less than the tabulated value t_T of the t-distribution for α and f = n − 1
degrees of freedom.


Table of the t-distribution

   f    α = 0.05   α = 0.01
   1     12.71      63.66
   2      4.30       9.92
   3      3.18       5.84
   4      2.78       4.60
   5      2.57       4.03
   6      2.45       3.71
   7      2.37       3.50
   8      2.31       3.36
   9      2.26       3.25
  10      2.23       3.17
  11      2.20       3.11
  12      2.18       3.06
  13      2.16       3.01
  14      2.15       2.98
  15      2.13       2.95
  16      2.12       2.92

Larger f (in increasing order):

        α = 0.05   α = 0.01
           …         2.69
          2.01       2.68
          2.00       2.66
          1.99       2.65
          1.99       2.64
          1.99       2.63
          1.98       2.63
          1.98       2.62
          1.98       2.61
          1.98       2.61
          1.97       2.60
          1.97       2.60
          1.97       2.59
          1.97       2.59
          1.97       2.59
          1.96       2.58
          1.96       2.58


2. Comparison of two mean values x̄₁ and x̄₂. Two samples of size n₁ and n₂, respectively, are
independent of one another and are assumed to come from normally distributed populations;
in addition the deviations of their variances s₁² and s₂² are assumed to be random. The null hypothesis,
that their mean values x̄₁ and x̄₂ differ only randomly with an error probability α, is accepted if
the calculated value t_s for the test variable

$$t = \frac{|\bar{x}_1 - \bar{x}_2|}{s_d} \sqrt{\frac{n_1 n_2}{n_1 + n_2}} \quad \text{with} \quad s_d^2 = \frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}$$

is less than the tabulated value t_T for α and f = n₁ + n₂ − 2 degrees of freedom.




Example: For two materials, from n₁ = 20 and n₂ = 32 tests, respectively, a mean tensile
strength of x̄₁ = 18 · 10⁷ N/m², x̄₂ = 24 · 10⁷ N/m² with variance s₁² = 4 · 10¹⁴ and s₂² = 6 · 10¹⁴
is determined. With respect to an error probability of α = 0.05, are the two materials essentially
different with respect to the breaking strength?

For the test variables one calculates:

$$t_s = \frac{|18 \cdot 10^7 - 24 \cdot 10^7|}{s_d} \sqrt{\frac{20 \cdot 32}{20 + 32}} = 9.20, \qquad s_d^2 = \frac{4 \cdot 10^{14} \cdot 19 + 6 \cdot 10^{14} \cdot 31}{20 + 32 - 2} = 5.24 \cdot 10^{14}.$$

The tabulated value of the t-distribution for an error probability of α = 0.05 and f = n₁ + n₂ − 2 = 50
degrees of freedom is t_T = 2.01. Because t_s > t_T, the null hypothesis must be rejected; this means
that there are significant differences between the two means.
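
The arithmetic of this example can be retraced with a short program. The sketch below is only an illustration in Python; the function name two_sample_t is an assumption, and the tabulated value 2.01 is copied from the example rather than computed.

```python
import math


def two_sample_t(mean1, var1, n1, mean2, var2, n2):
    """Return the calculated test value and the degrees of freedom.

    The two sample variances are pooled, weighted by n - 1, as in the
    formula for the comparison of two mean values.
    """
    f = n1 + n2 - 2
    pooled_var = (var1 * (n1 - 1) + var2 * (n2 - 1)) / f
    t = abs(mean1 - mean2) / math.sqrt(pooled_var) * math.sqrt(n1 * n2 / (n1 + n2))
    return t, f


# Data of the tensile strength example.
t_s, f = two_sample_t(18e7, 4e14, 20, 24e7, 6e14, 32)
t_T = 2.01  # tabulated value for an error probability of 0.05 and f = 50

print(f"t_s = {t_s:.2f}, f = {f}")
print("reject null hypothesis" if t_s > t_T else "accept null hypothesis")
```

The printed test value agrees with the 9.20 obtained by hand, so the null hypothesis is rejected.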


3. Comparison of two variances s₁² and s₂². Two samples of size n₁ and n₂ are assumed to be
independent of one another and to be taken from a normally distributed population. The null hypothesis,
that their variances s₁² and s₂² differ from one another only randomly, is accepted with an error
probability α if the calculated value F_s for the test variable F = s₁²/s₂² (s₁² > s₂²) is less than the
tabulated value F_T for α and f₁ = n₁ − 1 and f₂ = n₂ − 1 degrees of freedom.


Example: Two machines are to be compared with respect to the tolerances maintained by them
in a given manufacturing process, to determine whether, with respect to an error probability
of α = 0.05, they differ essentially. For this purpose n₁ = 25 and n₂ = 31 tests are carried out
on the first and second machine, respectively, and the variances s₁² = 17.9 and s₂² = 17.5 for the
results obtained are calculated. For the test variable one obtains F_s = (17.9)/(17.5) = 1.023.
The tabulated value of the F-distribution for an error probability of α = 0.05 and f₁ = 24, f₂ = 30
degrees of freedom is F_T = 1.89. Because F_s < F_T, the null hypothesis must be accepted, that
is, the differences in the tolerances of the two machines are only random.
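
As a check on this example, the variance ratio and the two degrees of freedom can be computed as follows. This is a hedged sketch in Python; the helper variance_ratio is a made-up name, and the tabulated value 1.89 is taken from the example.

```python
def variance_ratio(var_a, n_a, var_b, n_b):
    """Return (F, f1, f2) with the larger variance in the numerator.

    f1 belongs to the sample with the greater scatter, f2 to the other one.
    """
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


# Data of the machine tolerance example.
F_s, f1, f2 = variance_ratio(17.9, 25, 17.5, 31)
F_T = 1.89  # tabulated value for an error probability of 0.05, f1 = 24, f2 = 30

print(f"F_s = {F_s:.3f}, f1 = {f1}, f2 = {f2}")
print("accept null hypothesis" if F_s < F_T else "reject null hypothesis")
```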


F-distribution for α = 0.05 (f₁ = degrees of freedom of the greater scatter)

  f₂ \ f₁      1      2      3      4      5      6      8     12     24      ∞
     1      161.4  199.5  215.7  224.6  230.2  234.0  238.9  243.9  249.0  254.3
     2      18.51  19.00  19.16  19.25  19.30  19.33  19.37  19.41  19.45  19.50
     3      10.13   9.55   9.28   9.12   9.01   8.94   8.84   8.74   8.64   8.53
     4       7.71   6.94   6.59   6.39   6.26   6.16   6.04   5.91   5.77   5.63
     5       6.61   5.79   5.41   5.19   5.05   4.95   4.82   4.68   4.53   4.36
    10       4.96   4.10   3.71   3.48   3.33   3.22   3.07   2.91   2.74   2.54
    20       4.35   3.49   3.10   2.87   2.71   2.60   2.45   2.28   2.08   1.84
    30       4.17   3.32   2.92   2.69   2.53   2.42   2.27   2.09   1.89   1.62
    40       4.08   3.23   2.84   2.61   2.45   2.34   2.18   2.00   1.79   1.51
    60       4.00   3.15   2.76   2.52   2.37   2.25   2.10   1.92   1.70   1.39
   120       3.92   3.07   2.68   2.45   2.29   2.17   2.02   1.83   1.61   1.25
     ∞       3.84   2.99   2.60   2.37   2.21   2.09   1.94   1.75   1.52   1.00


4. Comparison of frequencies. If an event occurs z times in a sample of size n whose elements are
independent of one another, but occurs in the population with probability p, then the deviation
of the relative frequency z/n from p is only random, with an error probability α, provided that the
calculated value t_s of the test variable

$$t = \frac{|z - np|}{\sqrt{np(1 - p)}} = \frac{|z/n - p|}{\sqrt{p(1 - p)}} \sqrt{n}$$

is less than the tabulated value t_T for α and f = n − 1 degrees of freedom.


Example: On the basis of observations over a long period the mortality rate for a certain animal
disease is p = 0.4. A new drug is tested on n = 71 animals that contract this disease, and z = 20
of them die. Is the drug a suitable treatment, with respect to an error probability of α = 0.01?

According to the null hypothesis, that the deviation of the relative frequency z/n = 20/71 from
p = 0.4 is only random, one calculates

$$t_s = \frac{|20 - 71 \cdot 0.4|}{\sqrt{71 \cdot 0.4 \cdot 0.6}} = 2.035.$$

The tabulated value of the t-distribution for an error probability of α = 0.01 and f = n − 1 = 70
degrees of freedom is t_T = 2.65. Because t_s < t_T, the null hypothesis can be accepted.
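
The calculated value in this example follows directly from the formula for the comparison of frequencies. A minimal Python sketch, in which frequency_t is an illustrative name and the tabulated value 2.65 is copied from the example:

```python
import math


def frequency_t(z: int, n: int, p: float) -> float:
    """Test value |z - n*p| / sqrt(n*p*(1 - p)) for an observed frequency z."""
    return abs(z - n * p) / math.sqrt(n * p * (1.0 - p))


# Data of the mortality example: 20 deaths among 71 animals, long-term rate 0.4.
t_s = frequency_t(20, 71, 0.4)
t_T = 2.65  # tabulated value for an error probability of 0.01 and f = 70

print(f"t_s = {t_s:.3f}")
print("accept null hypothesis" if t_s < t_T else "reject null hypothesis")
```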




5. Testing of distributions. An empirical distribution deviates only randomly from a theoretical
distribution with an error probability α if the calculated value χ²_s of the test variable

$$\chi^2 = \sum_{i=1}^{k} \frac{(h_i - k_i)^2}{k_i}$$

is less than the tabulated value χ²_T for α and f = k − m − 1 degrees of freedom, where m is the
number of unknown parameters estimated from the sample. Here the
material under investigation is divided into k classes, and hᵢ is the observed value, kᵢ the theoretical
absolute frequency in the ith class (i = 1, 2, ..., k). The theoretical absolute frequency in each
class is required to be at least 5. This can be achieved by combining several classes if necessary.


Example: A certain characteristic is measured on 80 articles made on a machine. The resulting
measurements are divided into classes and the frequencies hᵢ given in the table are obtained. It
is required to test whether, with respect to an error probability of α = 0.05, the measured values
correspond to a normal distribution. For this purpose the theoretical frequency kᵢ belonging to
each class is calculated with the help of the Gaussian distribution and the null hypothesis is tested,
that the deviation of the empirical distribution from the theoretical one is only random. The test
variable is calculated with the help of the following table, which contains the individual steps in
the calculation:

[Calculation table, with columns including (hᵢ − kᵢ)² and (hᵢ − kᵢ)²/kᵢ; Table of the χ²-distribution]

Corresponding to the calculated value χ²_s = 2.09 the
tabulated value of χ² for an error probability α = 0.05 and
2 degrees of freedom (since 2 parameters were estimated)
is looked up, giving χ²_T = 5.99. Because χ²_s < χ²_T, the null
hypothesis is accepted.
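
The steps of such a calculation can be sketched in Python. The class frequencies used below are purely hypothetical and are not the data of the example; they are chosen only so that five classes with two estimated parameters again give 2 degrees of freedom. The function names are likewise illustrative.

```python
def chi_square_statistic(observed, expected):
    """Sum of (h_i - k_i)^2 / k_i over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same number of classes")
    if min(expected) < 5:
        raise ValueError("theoretical frequencies below 5: combine classes first")
    return sum((h - k) ** 2 / k for h, k in zip(observed, expected))


def degrees_of_freedom(num_classes: int, num_estimated_parameters: int) -> int:
    """f = k - m - 1."""
    return num_classes - num_estimated_parameters - 1


# Hypothetical frequencies for 80 articles in 5 classes (illustration only).
h = [8, 20, 26, 18, 8]
k = [7.5, 19.0, 27.0, 19.0, 7.5]

chi2_s = chi_square_statistic(h, k)
f = degrees_of_freedom(len(h), 2)  # mean and variance estimated from the sample
chi2_T = 5.99  # tabulated value for an error probability of 0.05 and f = 2

print(f"chi2_s = {chi2_s:.3f}, f = {f}")
print("accept null hypothesis" if chi2_s < chi2_T else "reject null hypothesis")
```

The check on the minimum theoretical frequency mirrors the requirement that each class contain at least 5 expected observations.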


Fields of application of statistics 


From the numerous fields of application technological statistics and biometry will be selected 
here. 


Technological statistics. The first beginnings of a statistical treatment of problems of technology 
go back to the beginning of the nineteen-twenties. At that time Karl DAEVES (b. 1893) recognized
that the mass production of modern industry brings with it aspects of measurement that follow 
certain regularities. He collected together his examination procedure under the concept of large 
number research. But only in the last 25 to 30 years has statistics been applied to any considerable 
extent to industrial questions, for example, to the evaluation of test samples, to estimation from 
a series of measurements, or to current control of production. In the course of this development 
the term technological statistics has evolved. By this one understands a collection of all statistical 
methods that can be applied in technology and that are specially tailored for it. 

These methods can be divided into two groups: 

1. Methods for the statistical examination and evaluation of self-contained observation material. 
This is concerned essentially with statistical estimation and test procedures and with regression and 
correlation analyses to reveal and describe relationships. For this group the following problems 
are typical: the life of electric bulbs; the influence of measuring inaccuracy in the case of mechanical 
instruments of high precision; bending strength of natural wool fibre; tearing strength of certain 
fabrics; determination of the dependence of the tensile strength of a steel on various factors; com- 
parison of the properties of two working materials. 

2. Methods for the initial and final control and for the running control of a production process,
briefly referred to as statistical quality control. These methods are based on the idea of controlling
the production process by statistical methods in such a way that the rejects are discovered as soon
as they occur and their causes can be eliminated.

For this control there are two possibilities: 

(i) the manufacture is regulated by means of control charts in such a way that the number of 
rejects and the finishing work are greatly reduced; 

(ii) the finished and half-finished products are examined by means of test sampling procedures 
to see whether they satisfy certain quality requirements. In this case the finishing cannot be in- 
fluenced; one can only draw inferences for future production. 

Control charts. A finishing process is controlled by means of a control chart in such a way that 
characteristics of the product considered can be judged and it can be decided whether deviations 
from the required value are random or systematic. According to the kind of judgement one distin- 
guishes: 

(i) control charts for measurable characteristics,

(ii) control charts for non-measurable characteristics. 

The design and use of a control chart is illustrated by a single-value chart (Fig.); a corresponding 
procedure holds for other control charts. For the control of a characteristic in a particular production 
process, objects are chosen at random, that were finished at the end of certain time intervals, and 
the characteristic value in question is measured. Instead of being entered in a book, this value is 
plotted above a time mark in the control chart. If one imagines the set of data entered as a frequency 
distribution, then in many cases the resulting distribution is normal, at least approximately. If 
one draws its mean value x̄ as mean line (ML) and the values x̄ + 3s, x̄ − 3s as upper control limit
(UCL) and lower control limit (LCL), respectively, then 99.73% of all measured values must lie
in the region bounded by the control lines. If the three lines have been determined by preliminary 
tests, then the production process can be controlled with the chart prepared in this way. If a measured 
value lies within the region, then the deviations from the mean may be regarded as random. The 
deviation is systematic if an entry lies outside the control lines (indicated in the figure by an arrow). 
In this case one has to look for the reason for the disturbance before continuing production. The 
graphical picture of the control chart always provides a better survey than a list. It shows the develop- 
ment of the characteristic considered during the production process and indicates when faults 
should be eliminated and new adjustments made to the machine. Because of these properties the 
control chart is very effective in operation and, in connection with the search for the origin of faults, 
it indicates frequently recurring faults and hence leads to changes in the technology. 
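
The construction of the three lines and the check of later measurements can be sketched as follows. This is an illustrative Python fragment under the assumption of approximately normal measurements; the preliminary values and the function names are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def control_limits(preliminary):
    """Return (LCL, ML, UCL) from preliminary tests: mean -/+ 3 standard deviations."""
    ml = mean(preliminary)
    s = stdev(preliminary)
    return ml - 3 * s, ml, ml + 3 * s


def out_of_control(values, lcl, ucl):
    """Indices of entries lying outside the control lines (systematic deviation)."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Share of a normal distribution lying within three standard deviations of the mean.
coverage = NormalDist().cdf(3) - NormalDist().cdf(-3)

# Hypothetical preliminary measurements and later production values.
preliminary = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.3, 9.7]
lcl, ml, ucl = control_limits(preliminary)
faults = out_of_control([10.1, 9.9, 10.8, 10.0], lcl, ucl)

print(f"coverage = {coverage:.4f}")
print(f"LCL = {lcl:.2f}, ML = {ml:.2f}, UCL = {ucl:.2f}, faults at {faults}")
```

The computed coverage corresponds to the 99.73% stated for the region between the control lines.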


27.3-9 Single-value chart (measured value against time axis or number of withdrawal)

27.3-10 Operation characteristic (acceptance probability L(p) against p, with the consumer's risk marked)


Sampling plans. Control charts are used when a running statistical control is to take place during 
a finishing process. But these methods fail, in general, if material of unknown quality is supplied, 
or if one forgoes a control during the finishing process and requires a quality examination to take 
place afterwards. In both these cases, in initial and final control, one could exercise 100% control. 
However, this is very costly. Besides, even with 100% control, there is no guarantee that all defective 
parts will be found, as experience confirms. One is therefore content with sample testing, whereby 
the acceptance or rejection of a consignment is decided by the quality of a sample drawn from the 
consignment. The tests, here for example good-bad-tests, are carried out according to a sampling 
plan. From a consignment of N items a sample of size n is withdrawn and tested. If it contains more 
than z bad parts, then the consignment will be rejected; it will be accepted with at most z bad parts. 




Thus, the sampling plan is characterized by the number pair (z, n). It is based on the assumption 
that the percentage of faults in the sample agrees with that in the consignment. This occurs only 
with a certain probability, so that the producer and the consumer both take a risk in accepting the 
sampling plan, which can be seen from the operation characteristic (Fig.). It represents the acceptance 
probability L(p) for the consignment in dependence upon its percentage rejection rate p, and its 
form depends upon the sampling plan. It is calculated, in general, by means of the binomial or 
Poisson distribution. From the figure the risk of the producer or of the consumer can be seen, 
that is, one can read off the probability that a consignment with a percentage rejection rate p shall 
be rejected or accepted. It is a matter for agreement by contract to choose a sampling plan and with 
the help of the operation characteristic to determine the permissible percentage rejection rate p. 
In practice it has become usual to choose a plan in such a way that a consignment with p₁ % rejection
rate will be accepted with a probability of 95% and one with p₂ % rejection rate (p₁ < p₂)
will be rejected with a probability of 90%.
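
An operation characteristic of this kind can be tabulated with a few lines of Python. The sketch assumes the binomial model (the consignment is large compared with the sample), and the plan (z, n) = (2, 50) is a hypothetical choice made only for illustration.

```python
from math import comb


def acceptance_probability(p: float, n: int, z: int) -> float:
    """L(p): probability that a sample of n items contains at most z bad parts.

    p is the fraction of bad parts in the consignment (binomial model).
    """
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(z + 1))


# Hypothetical sampling plan: sample size 50, at most 2 bad parts allowed.
n, z = 50, 2
for percent in (0, 1, 2, 5, 10):
    p = percent / 100
    print(f"p = {percent:2d} %   L(p) = {acceptance_probability(p, n, z):.3f}")
```

Reading the printed values from top to bottom shows L(p) falling from 1 as the rejection rate grows, which is the shape of the curve from which the producer's and consumer's risks are read off.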


Biometry. K. PEARSON defined biometry as the study of the application of mathematical (statistical) 
methods to the examination of the multiplicity of life. In the investigation of the laws and phenomena 
of living things one meets an incomparably more difficult situation than, for example, in physics. 
There it is possible to plan experiments that are reproducible. The conditions of the test can be 
kept constant and only the variable under investigation varies in order to find the required law. 
In biology, however, very many factors operate that cannot be influenced by the investigator (for
example, the effect of weather on the cultivation of plants). In medicine the methods are still more 
problematical, since experiment is usually excluded (on people, for example, on ethical grounds) 
and often only pure observation remains. To this must be added the multiplicity of biological 
measures and values, which is called biological variability. For these reasons a well thought-out 
test plan for the arrangement, execution and evaluation of the tests is absolutely essential. In the 
course of the development of biometry special methods for the planning of tests or observations 
have evolved. For the evaluation of tests or observations special methods were created which 
particularly take into account: 

(i) the usually small size of samples governed by the difficulty of satisfying the homogeneity 
requirement; 

(ii) the frequency distributions that cannot be reduced to a normal distribution. 

The difficulties and at the same time the attraction of biometry lie in finding statistical methods 
that are best suited for the given problems of reality. 


28. Calculus of errors, adjustment of data, approximation theory 


28.1. Calculus of errors
    Absolute and relative error
    Accuracy of the result obtained by calculating with approximate values
    Errors of measurement and observation
28.2. Adjustment of data
    The method of least squares
    Mean error and law of propagation of errors
    Adjustment of direct measurements
    Adjustment of observations
    Adjustment of relations
    Representation of a function with the help of simpler functions
28.3. Approximation theory
    Approximation methods for calculating the values of a function
    Approximation of functions by means of polynomials
    Interpolation in tables


28.1. Calculus of errors 


The calculus of errors is concerned with the precision of numerical data and the results of
calculations. Errors that rest upon false mathematical reasoning, failure to pay attention to the
laws of calculation, haste and lack of care in calculating, are not the subject of the calculus of
errors. It does not excuse the calculator in any way from exercising extreme care in carrying out
the operations of calculation. From the sides a = 7.49, b = 5.32 of a triangle and the included
angle γ = 30° a schoolboy calculates the length of the third side c from the formula
c = √(a² + b² − 2ab cos γ) (see the accompanying calculation).

    a²          = 56.1001
    b²          = 28.3024
    a² + b²     = 84.4025
    cos γ       = 0.87
    2ab cos γ   = 69.3334
    c²          = 15.0691
    c           = 3.88

The teacher finds the result too inaccurate; he had expected the solution c = 3.92. In looking up
cos γ the boy has clearly taken too few decimal places into consideration. The questions arise,
what error occurs as the result of the neglected decimal places, and how many places of cos γ would
have had to be retained in order to give the required accuracy (see Accuracy of the result obtained
by calculating with approximate values, Example 4).
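
The effect of the rounded cosine can be reproduced directly. A small Python sketch, in which third_side is an illustrative name:

```python
import math


def third_side(a: float, b: float, cos_gamma: float) -> float:
    """Length of the third side from the cosine rule."""
    return math.sqrt(a * a + b * b - 2 * a * b * cos_gamma)


a, b = 7.49, 5.32
c_rough = third_side(a, b, 0.87)                         # cosine with two decimal places
c_better = third_side(a, b, math.cos(math.radians(30)))  # cosine to machine precision

print(f"with cos 30° = 0.87:  c = {c_rough:.2f}")
print(f"with accurate cosine: c = {c_better:.2f}")
```

The two printed values are the schoolboy's 3.88 and the teacher's 3.92: an error of about 0.004 in the cosine has grown to about 0.04 in the result.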


Approximate values. In practical applications the numerical values of measured quantities are
known only approximately. A man wants to drive his car to London. A road sign gives the distance
as 75 miles. His car covers on average 30 miles per gallon of petrol. He therefore calculates that
his fuel requirement for his journey will be (75/30 = 2.5) gallons. However, the road sign does
not give the 'true value' of the length of his journey, but only an approximate value. For such a
calculation a more accurate statement of distance would have no particular value. The average
petrol consumption of his car is also an approximate value (Fig.).

Even for pure numbers often only approximate values can be used in calculations, because most
numbers can be represented in the decimal system only as infinite decimals, for example, √2, π, lg 3.

If one wishes to indicate that a is an approximate value for a quantity x, one usually writes x ≈ a;
x is the exact value, a the approximate value. For example, √2 ≈ 1.41; π ≈ 3.14; lg 3 ≈ 0.4771.


28.1-1 This distance sign is rounded to whole miles

Absolute and relative error 


Error and absolute error. Each approximate value a is judged by its deviation from the exact
value x; the difference ε = a − x is called the error, its absolute value |ε| = |a − x| the absolute
error. The smaller |ε| is, the more accurate is an approximate value a. For example, the approximate
value a₁ = 0.66667 for x = 2/3 is a hundred times more accurate than the approximate value
a₂ = 0.667.




Correction. If one wishes to obtain the exact value x of a quantity from the approximate value a,
one must add to a the correction c: c = x − a = −ε.


Relative error. Instead of the absolute error |ε| of an approximate value a, its relative error |ε/x|
is often given. It is usually expressed as a percentage. The accuracy of approximate values for
different quantities can then be compared with one another in this way.

Example: For the exact values x = 2/3, y = 1/15 respectively, the approximate values a₁ = 0.67,
a₂ = 0.07 are used. Then the errors are ε₁ = a₁ − x = 0.67 − 2/3 = 1/300 and ε₂ = a₂ − y
= 0.07 − 1/15 = 1/300; for the relative errors one obtains |ε₁/x| = (1/300)/(2/3) = 1/200 = 0.005
= 0.5% and |ε₂/y| = (1/300)/(1/15) = 1/20 = 0.05 = 5%. Although the absolute errors are equal,
a₁ is a ten times more accurate approximation for x than a₂ for y.
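
The fractions in this example can be verified with exact rational arithmetic. A short sketch in Python; the function names error and relative_error are chosen here for illustration.

```python
from fractions import Fraction


def error(approx: Fraction, exact: Fraction) -> Fraction:
    """Error of an approximate value: approximate minus exact."""
    return approx - exact


def relative_error(approx: Fraction, exact: Fraction) -> Fraction:
    """Absolute value of the error divided by the exact value."""
    return abs((approx - exact) / exact)


x, y = Fraction(2, 3), Fraction(1, 15)       # exact values
a1, a2 = Fraction(67, 100), Fraction(7, 100)  # approximate values 0.67 and 0.07

print(error(a1, x), error(a2, y))                    # both errors
print(relative_error(a1, x), relative_error(a2, y))  # both relative errors
```

Both errors come out as 1/300, while the relative errors are 1/200 and 1/20, a factor of ten apart.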


Bounds for the absolute error. Every statement concerning the magnitude of the absolute or relative
error of an approximate value represents a statement about the accuracy of this approximation.
However, the exact value is usually unknown, for example, for approximate values obtained from
measurements. Then neither the absolute nor the relative error of the approximation can be
calculated. In such a case bounds for the error or the relative error of the approximation ought to
be given. By a bound for the (absolute) error of an approximate value a one understands a positive
number Δa that is never exceeded in value by the absolute error. The inequality

−Δa ≤ ε ≤ Δa or a − Δa ≤ x ≤ a + Δa

always holds. If a bound Δa is stated, then at the same time both a lower and an upper bound for
x are given. This is expressed for short in the form x ≈ a (±Δa), or x = a ± Δa; Δa gives information
about the accuracy of a. The smaller Δa is, the more accurate is the approximation a.

Alternatively, if two bounds x₁ and x₂ for a quantity x are known, such that x₁ ≤ x ≤ x₂,
then a = (x₁ + x₂)/2 is an approximation for x with Δa = (x₂ − x₁)/2.


Bounds for the relative error. In technical data the accuracy is often given in the form
x ≈ a (±δ · 100%) or also x = a ± δ · 100%. The quantity δ = |Δa/a| is a bound for the relative
error of a.




Example: The capacitance of a capacitor is given as 250 pF ± 10%. The bound for the relative error
of the approximate value a = 250 pF is δ = 0.1. From this it follows that Δa = a · δ = 25 pF is a bound
for the absolute error. Consequently the exact value of the capacitance lies between 225 pF and
275 pF.


If approximate values for numbers such as π, √2, lg 3 are given, then as a rule one dispenses
with a special statement of the accuracy. The representation of such approximate values is subject
to certain rules that permit one to infer the accuracy at once from the values given.


Truncation. The number π can be represented only by an infinite decimal expansion. In a table,
for example, one finds the number given as π = 3.141 592 653 589... In this way the table gives as
approximation for π a decimal truncated after 12 places. In truncating an infinite decimal, its sequence
of digits is cut off completely at a particular place. The three dots at the end indicate that further
digits would follow. In this all the digits given are valid digits, that is, the sequence of digits in the
truncated decimal agrees completely with the sequence of digits in the non-truncated decimal, up
to the place after which they were discarded.

The number π, truncated after 4 decimal places, is therefore π = 3.1415... Any of the digits from
0 to 9 can follow as the next one after the last digit of a truncated decimal. Hence if one uses a
number truncated after k digits as an approximate value, then the error of this approximation
is negative and its absolute value is less than a unit of the order of the last digit included. A
number truncated after k decimal places therefore has an absolute error less than 10⁻ᵏ; for example,
for π ≈ 3.1415 the absolute error is less than 10⁻⁴ = 1/10 000.


Rounding off. A usual method of curtailing decimal places is by rounding off. In this the last digit
retained is unchanged, as in the method of truncating, if it is followed by a 0, 1, 2, 3 or 4 (rounding
down). The last digit retained is increased by 1, however, if it is immediately followed by a 5, 6, 7, 8
or 9 (rounding up). Consequently an approximation for π rounded to four decimal places is π ≈ 3.1416;
its absolute error is less than 10⁻⁴/2. If one follows this rule, then one has the guarantee that the
absolute error of a rounded number has absolute value less than half of a unit of the order of the
last digit given. It can, however, be positive or negative. Only in the case for which the first digit
neglected is a 5 followed by zeros is the rounding error exactly equal to half of a unit of the order
of the last digit retained. It is customary in this case to round in such a way that the last digit retained
is even. For example, 1/8 = 0.125 00 would be rounded to 0.12, but 7/40 = 0.175 00 to 0.18.
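
Truncation and rounding with the round-to-even convention can both be tried out with decimal arithmetic. The following Python sketch uses the standard decimal module; the two helper names are illustrative, and the numbers are passed as strings so that they are taken as exact decimals.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN


def truncate(number: str, places: int) -> Decimal:
    """Cut the decimal expansion off after the given number of places."""
    return Decimal(number).quantize(Decimal(10) ** -places, rounding=ROUND_DOWN)


def round_off(number: str, places: int) -> Decimal:
    """Round to the given number of places; an exact half goes to the even digit."""
    return Decimal(number).quantize(Decimal(10) ** -places, rounding=ROUND_HALF_EVEN)


pi_12 = "3.141592653589"  # the value truncated after 12 places, as in a table

print(truncate(pi_12, 4))     # 3.1415
print(round_off(pi_12, 4))    # 3.1416
print(round_off("0.125", 2))  # 0.12
print(round_off("0.175", 2))  # 0.18
```

Binary floating-point numbers are deliberately avoided here, since a value such as 0.175 is not stored exactly as a float and the half-way case would then not arise.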


Reliable digits. The digits of a rounded number need not all be valid digits, since they may be 
the result of rounding up. But a correctly rounded number has only reliable digits. All digits of an 
approximate value are called reliable if the absolute error of this approximation is at most a half 
unit of the order of the last digit retained. 


Example: In a five-figure table of square roots the value 6.245 00 is given for √39. This value
has only reliable digits, because its absolute error is less than 5 · 10⁻⁶. Its last three digits, however,
are not valid digits, because √39 = 6.244 9979... Thus, if an approximate value contains only
reliable digits, a statement of accuracy need not follow at the same time. On the other hand, if
the accuracy of an approximation is not given, one must assume that all its digits are reliable.
This occurs, in particular, for all numbers given in mathematical tables.


If there is no risk of misunderstanding, one often writes x = a instead of x ~ a, if a is an ap- 
proximate value for x resulting from rounding off. 


Significant and non-significant digits. A difficulty arises in rounding large numbers. For example,
if the number 1778 is rounded to the nearest hundred, one obtains 1800. This number is correctly
rounded, but does not contain only reliable digits, because the absolute error is greater than 0.5.
In place of the neglected digits 7 and 8, zeros are introduced. They serve only to fix the order of
magnitude of the rounded number. They are called non-significant digits. The introduction of
non-significant digits (zeros) can give rise to misunderstanding in the consideration of accuracy. One
therefore uses another notation with the help of powers of ten. In the case in question one writes
18 · 10² for the rounded number. But if the number to be rounded is 1799.7, then both the zeros
in 1800 are significant digits. They are included, for example, in the form 1.800 · 10³.


The rounding of rounded numbers. A further difficulty is encountered if numbers that are already 
rounded have to be rounded yet again. For example, if the number 0.4747 is rounded to two decimal 
places, this gives 0.47; if however the number is first rounded to three places, and this number is 
again rounded to two places, one obtains first 0.475 and finally 0.48. The last digit is now no longer reliable. The uncertainty resulting from repeated rounding always occurs when the last digit has to be rounded to a 5. It is therefore useful for possible further rounding to notice whether a 5 in the last place of a rounded number is genuine, or the result of rounding up or rounding down. A 5 is sometimes characterized by a bar or a point, respectively, above it, according as it is the result of rounding up or the digits following it are discarded (Fig.); $2.6146 \approx 2.61\bar{5} \approx 2.61$; $2.6153 \approx 2.61\dot{5} \approx 2.62$.



28.1-2 Characterization of a 5 in the last place in a table of logarithms 
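The effect of rounding an already rounded number can be reproduced with Python's `decimal` module. The sketch below assumes the usual school rule (a 5 is rounded up), which `decimal` calls `ROUND_HALF_UP`; the function name `round_half_up` is an assumption made for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Union

def round_half_up(value: Union[str, Decimal], places: int) -> Decimal:
    """Round to the given number of decimal places, rounding a trailing 5 upwards."""
    quantum = Decimal(10) ** -places          # e.g. places=2 gives Decimal("0.01")
    return Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)

direct = round_half_up("0.4747", 2)           # one rounding step
intermediate = round_half_up("0.4747", 3)     # first step of a two-step rounding
twice = round_half_up(intermediate, 2)        # second step, starting from the rounded value

print(direct, intermediate, twice)
```

The single rounding gives 0.47, whereas the two-step rounding passes through 0.475 and ends at 0.48, because the information that the 5 arose from rounding up has been lost.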


Accuracy of the result obtained by calculating with approximate values 


Initial error and error of calculation. If a calculation is carried out with approximate values, 
then, in general, the result will likewise be only approximately correct. In the first place the inaccuracy 
of the result depends on the errors of the approximations entering into the calculation. The error 
in the result caused in this way is called the initial error. Further, a certain error occurs in the course 
of the calculation itself, for example, through rounding up or down. This is called the error of 
calculation. The error of calculation must always be smaller than the initial error, or else the ac- 
curacy of the initial data is not fully utilized. 


Method of bounding values. The method of bounding values yields the most exact determination 
of the accuracy of the result of a calculation. In this one finds a lower and an upper bound for the 
result from the lower and upper bounds of the initial values. For the basic methods of calculation 
simple rules can be given. Let L(x) and U(x) be the lower and upper bounds, respectively, for the 
value of x, L(y) and U(y) lower and upper bounds for the value of y. Then 


\[
\begin{aligned}
L(-x) &= -U(x), & L(x+y) &= L(x) + L(y), & L(x-y) &= L(x) - U(y),\\
U(-x) &= -L(x), & U(x+y) &= U(x) + U(y), & U(x-y) &= U(x) - L(y).
\end{aligned}
\]


These relations can be derived from the inequalities $L(x) \le x \le U(x)$, $L(y) \le y \le U(y)$; for example, from the inequality $L(x) \le x \le U(x)$ it follows on multiplying by $-1$ that $-U(x) \le -x \le -L(x)$, so that $-U(x)$ is a lower bound for $-x$ and $-L(x)$ an upper bound for $-x$.


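\[
\begin{array}{l|l|l|l|l}
\text{From} & x \le U(x) & L(x) \le x & L(x) \le x & x \le U(x) \\
\text{follows} & x + y \le U(x) + y & L(x) + y \le x + y & L(x) - y \le x - y & x - y \le U(x) - y \\
\text{and since} & y \le U(y) & L(y) \le y & y \le U(y) & L(y) \le y \\
\text{it follows} & x + y \le U(x) + U(y) & L(x) + L(y) \le x + y & L(x) - U(y) \le x - y & x - y \le U(x) - L(y)
\end{array}
\]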


In determining the bounds of xy and x/y, the signs of the bounds play a part. If x and y have only 
positive bounds, then 


\[
\begin{aligned}
L(xy) &= L(x)\,L(y), & U(xy) &= U(x)\,U(y),\\
L(x/y) &= L(x)/U(y), & U(x/y) &= U(x)/L(y).
\end{aligned}
\]


When rounding off in the course of calculation care must be taken that lower bounds may only 
be reduced as a result of the rounding, and upper bounds only increased. 


Example 1: The height $h$ of a frustum of a right circular cone is to be calculated from its upper radius $r_2 \approx 61(\pm 0.5)$ in., its lower radius $r_1 \approx 74(\pm 0.5)$ in. and its slant height $s \approx 82(\pm 0.5)$ in. The formula is $h = \sqrt{s^2 - (r_1 - r_2)^2}$. The calculation with the bounding values gives 80.28 in. as a lower bound for the result (from $s = 81.5$, $r_1 - r_2 = 14$) and 81.63 in. as an upper bound (from $s = 82.5$, $r_1 - r_2 = 12$). The result can be expressed more concisely in the form $h = 80.955(\pm 0.675)$ in. or as a somewhat coarser approximation $h = 81.0(\pm 0.8)$ in.
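The bounding-value calculation of Example 1 can be sketched in a few lines of Python. Intervals are represented as pairs (lower bound, upper bound); the helper names `around`, `sub`, `square` and `frustum_height` are assumptions made for illustration. In the last step the lower bound is rounded down and the upper bound rounded up, as the rounding rule above requires.

```python
import math
from typing import Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)

def around(value: float, error: float) -> Interval:
    """Interval for a value known to within +/- error."""
    return (value - error, value + error)

def sub(x: Interval, y: Interval) -> Interval:
    # L(x - y) = L(x) - U(y),  U(x - y) = U(x) - L(y)
    return (x[0] - y[1], x[1] - y[0])

def square(x: Interval) -> Interval:
    # Product rule for positive bounds, applied to x * x.
    assert x[0] > 0, "rule is only valid for positive bounds"
    return (x[0] * x[0], x[1] * x[1])

def frustum_height(r1: Interval, r2: Interval, s: Interval) -> Interval:
    radicand = sub(square(s), square(sub(r1, r2)))
    lower = math.sqrt(radicand[0])
    upper = math.sqrt(radicand[1])
    # Round outwards to two decimals: lower bound down, upper bound up.
    return (math.floor(lower * 100) / 100, math.ceil(upper * 100) / 100)

h_bounds = frustum_height(around(74, 0.5), around(61, 0.5), around(82, 0.5))
print(h_bounds)
```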


This method is rather pessimistic, because in each arithmetic operation the operands are treated as independent and the worst combinations are given the same weight as the most probable ones. Probabilistic and so-called fuzzy methods have therefore been proposed to obtain more realistic estimates of the error bounds. In concrete applications, however, these more sophisticated methods are advisable only if the interval bounds lead to unacceptable consequences.




Method of limiting error. The method of bounding values takes account both of the initial error 
and of the error of calculation. Its application is, however, very time consuming, since each cal- 
culation must be carried out twice. 

If one is interested only in the initial error, then the method of limiting error leads more quickly 
to the goal. It rests on the following principle. 

A function $f(x_1, x_2, \ldots, x_k)$ of $k$ variables is to be calculated. For the values $x_1, x_2, \ldots, x_k$ needed for the calculation, only approximate values $a_1, a_2, \ldots, a_k$ are available. It is required to estimate the error associated with the result if the calculation is based on the approximate values $a_1, a_2, \ldots, a_k$. Suppose that the approximations have the absolute errors $\varepsilon_1 = a_1 - x_1$, $\varepsilon_2 = a_2 - x_2$, ..., $\varepsilon_k = a_k - x_k$, which are very small compared with the values $a_i$. The exact result would be

\[ f(x_1, x_2, \ldots, x_k) = f(a_1 - \varepsilon_1, a_2 - \varepsilon_2, \ldots, a_k - \varepsilon_k). \]


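If one expands the right-hand side of this equation in a series, by the method known from the differential calculus, neglecting terms of higher order in the absolute errors $\varepsilon_i$, one obtains

\[ f(x_1, \ldots, x_k) = f(a_1, \ldots, a_k) - \varepsilon_1 \frac{\partial f}{\partial x_1} - \varepsilon_2 \frac{\partial f}{\partial x_2} - \cdots - \varepsilon_k \frac{\partial f}{\partial x_k}. \]

The values of the partial derivatives of $f(x_1, \ldots, x_k)$ at the point $x_1 = a_1, x_2 = a_2, \ldots, x_k = a_k$ are to be taken here. Up to terms of higher order in the $\varepsilon_i$, this equation gives for the error of the result the expression

\[ \varepsilon_f = f(a_1, \ldots, a_k) - f(x_1, \ldots, x_k) = \varepsilon_1 f_{x_1}(a_1, \ldots, a_k) + \varepsilon_2 f_{x_2}(a_1, \ldots, a_k) + \cdots + \varepsilon_k f_{x_k}(a_1, \ldots, a_k). \]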


Thus, the absolute error can be estimated as follows:

\[ |\varepsilon_f| \le |\varepsilon_1|\,|f_{x_1}(a_1, \ldots, a_k)| + |\varepsilon_2|\,|f_{x_2}(a_1, \ldots, a_k)| + \cdots + |\varepsilon_k|\,|f_{x_k}(a_1, \ldots, a_k)|. \]

If bounds $\Delta a_1, \Delta a_2, \ldots, \Delta a_k$ for the errors $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_k$ are known, this inequality can be sharpened to

\[ |\varepsilon_f| \le \Delta a_1\,|f_{x_1}(a_1, \ldots, a_k)| + \Delta a_2\,|f_{x_2}(a_1, \ldots, a_k)| + \cdots + \Delta a_k\,|f_{x_k}(a_1, \ldots, a_k)| = \Delta f. \]

The value of $\Delta f$ gives a bound, to a good approximation, for the absolute error of the result.

From this equation it is possible to calculate a limiting error for the result from limiting errors of the approximate values entering into the calculation. Neglecting in its derivation the terms of higher order in the errors $\varepsilon_i$ ($i = 1, \ldots, k$) makes hardly any difference in practice.
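The basic equation of the method of limiting error can be sketched as a small Python function. The partial derivatives are approximated here by central differences, which is an assumption of this sketch and not part of the method itself; the function name `limiting_error`, the step size and the sample values are likewise hypothetical.

```python
from typing import Callable, Sequence

def limiting_error(f: Callable[..., float],
                   a: Sequence[float],
                   da: Sequence[float],
                   step: float = 1e-6) -> float:
    """Return Delta f = sum_i Delta a_i * |f_{x_i}(a_1, ..., a_k)|."""
    total = 0.0
    for i in range(len(a)):
        up, down = list(a), list(a)
        up[i] += step
        down[i] -= step
        # Central-difference approximation of the partial derivative at the point a.
        partial = (f(*up) - f(*down)) / (2 * step)
        total += da[i] * abs(partial)
    return total

# Hypothetical data: product of two approximate values 3 (+/- 0.1) and 4 (+/- 0.2).
delta_f = limiting_error(lambda x, y: x * y, [3.0, 4.0], [0.1, 0.2])
print(delta_f)  # close to 0.1 * 4 + 0.2 * 3
```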


Application of the method of limiting error to the elementary rules of calculation. If $a$ and $b$ are approximations with limiting errors $\Delta a$ and $\Delta b$ for the quantities $x$ and $y$, respectively, then the basic equation assumes the following forms:

Addition: $f(x, y) = x + y$; $|f_x| = 1$; $|f_y| = 1$; $\Delta f = \Delta a + \Delta b$.
Subtraction: $f(x, y) = x - y$; $|f_x| = 1$; $|f_y| = 1$; $\Delta f = \Delta a + \Delta b$.
The sum of the limiting errors of two approximate values represents a bound for the absolute error of the sum and of the difference of the two approximate values.

Multiplication: $f(x, y) = xy$; $|f_x| = |y|$; $|f_y| = |x|$; $\Delta f = |a|\,\Delta b + |b|\,\Delta a$; division by $|f(a, b)| = |ab|$ gives $\Delta f/|f| = \Delta a/|a| + \Delta b/|b|$.

Division: $f(x, y) = x/y$; $|f_x| = 1/|y|$; $|f_y| = |x|/y^2$; $\Delta f = \Delta a\,(1/|b|) + \Delta b\,|a|/b^2$; division by $|f(a, b)| = |a/b|$ gives $\Delta f/|f| = \Delta a/|a| + \Delta b/|b|$.

The sum of the bounds of the relative errors of two approximate values represents an approximate bound for the relative error of the product and of the quotient of the approximate values.


Raising to a power: $f(x) = x^n$; $|f_x| = |n x^{n-1}|$; $\Delta f = \Delta a\,|n a^{n-1}|$; division by $|f(a)| = |a^n|$ gives $\Delta f/|f| = |n| \cdot \Delta a/|a|$.
$|n|$ times the bound for the relative error of an approximate value is an approximate bound for the relative error of the $n$th power of this approximate value.


The method of limiting errors can also be applied to more complicated calculations that are 
composed of sums, differences, products and quotients. 




Example 2: The calculation of the expression $f = ab/c$ for $a = 2 \pm 0.1$, $b = 4 \pm 0.2$, $c = 2.5 \pm 0.1$ gives $f = (2 \cdot 4)/2.5 = 3.2$ with a relative error $\Delta f/|f| = \Delta a/|a| + \Delta b/|b| + \Delta c/|c| = 0.1/2 + 0.2/4 + 0.1/2.5 = 0.14$ (14%) and an absolute error $\Delta f = 3.2 \cdot 0.14 = 0.448$. The result is $f = 3.2 \pm 0.4$.
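The arithmetic of Example 2 can be checked with a short Python sketch that adds the relative error bounds of the factors and the divisor. The function name `relative_bound` is an assumption made for illustration.

```python
from typing import Sequence

def relative_bound(values: Sequence[float], errors: Sequence[float]) -> float:
    """Sum of the relative error bounds, valid for products and quotients."""
    return sum(err / abs(val) for val, err in zip(values, errors))

a, b, c = 2.0, 4.0, 2.5
f = a * b / c
rel = relative_bound([a, b, c], [0.1, 0.2, 0.1])   # bound for the relative error
abs_err = abs(f) * rel                             # bound for the absolute error

print(f, rel, abs_err)
```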

Example 3: The area of a triangle with two sides and the included angle given by $a \approx 5.2(\pm 0.05)$, $b \approx 3.4(\pm 0.05)$ and $\gamma \approx 35^\circ(\pm 10')$ (Fig.) is

\[ A = \tfrac{1}{2}\,ab \sin\gamma = 5.070. \]

The estimate of the error is

\[
\begin{aligned}
\Delta A &= \tfrac{1}{2}\,\Delta a\,|b|\,|\sin\gamma| + \tfrac{1}{2}\,|a|\,\Delta b\,|\sin\gamma| + \tfrac{1}{2}\,|a|\,|b|\,|\cos\gamma|\,\Delta\gamma,\\
\Delta A/|A| &= \Delta a/|a| + \Delta b/|b| + |\cos\gamma/\sin\gamma|\,\Delta\gamma\\
&= 0.05/5.2 + 0.05/3.4 + 1.428 \cdot 0.0029\\
&= 0.0096 + 0.0147 + 0.0042\\
&= 0.0285 = 2.85\%.
\end{aligned}
\]

Consequently $\Delta A = 5.070 \cdot 0.0285 = 0.144$ or $A \approx 5.070(\pm 0.144)$, that is, $A \approx 5.070(\pm 2.85\%)$.

28.1-3 The exact triangle to be calculated lies between the triangles A″BC″ and A′BC′
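The numbers of Example 3 can be reproduced with the following Python sketch. The only step that needs care is the conversion of the angular error of 10 minutes of arc into radians, since the derivative of the sine is taken with respect to the angle in radians; the variable names are chosen here for illustration.

```python
import math

a, da = 5.2, 0.05
b, db = 3.4, 0.05
gamma = math.radians(35.0)
dgamma = math.radians(10.0 / 60.0)   # 10 minutes of arc in radians

area = 0.5 * a * b * math.sin(gamma)

# Limiting error: each error bound times the magnitude of the partial derivative.
d_area = 0.5 * (da * b * math.sin(gamma)
                + a * db * math.sin(gamma)
                + a * b * abs(math.cos(gamma)) * dgamma)

rel = d_area / area   # equals da/a + db/b + |cot(gamma)| * dgamma
print(round(area, 3), round(d_area, 3), round(100 * rel, 2))
```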


The basic equation of the method of limiting error connects the bounds for the errors of the 
initial values with the bound for the error of the result of a calculation. If only a single approximate 
value enters into the calculation, then from this basic equation one can calculate the accuracy 
that must be chosen for this approximation in order to ensure a desired accuracy for the result. 
In this case the basic equation has the form $\Delta f = |f'(a)|\,\Delta a$, where $a$ and $\Delta a$ denote the approximation and its limiting error. If the result is to have an error not exceeding $\Delta\varrho$, then one must have $\Delta f \le \Delta\varrho$ or $\Delta a \le \Delta\varrho/|f'(a)|$.


Example 4: How many decimal places of $\cos\gamma$ must be taken into account so that the absolute error in the length $c$ of the third side of a triangle, calculated from the two sides $a = 7.49$ and $b = 5.32$ and the included angle $\gamma = 30^\circ$, is less than 0.005?

\[ c = \sqrt{a^2 + b^2 - 2ab\cos\gamma}; \quad \text{hence} \quad \frac{\partial c}{\partial(\cos\gamma)} = -\frac{ab}{\sqrt{a^2 + b^2 - 2ab\cos\gamma}} = -\frac{ab}{c} \]

and $\Delta c = \Delta(\cos\gamma)\,(ab/c) = \Delta(\cos\gamma) \cdot 10.2$.
Because $\Delta c < 0.005$ must hold, $\Delta(\cos\gamma) < 0.005/10.2 = 4.9 \cdot 10^{-4}$. Hence the value of $\cos\gamma$ must be accurate to at least three places of decimals.
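The sensitivity factor $ab/c$ and the admissible error of $\cos\gamma$ in Example 4 can be checked numerically. The following Python sketch uses only the data of the example; the variable names are chosen here for illustration.

```python
import math

a, b = 7.49, 5.32
cos_gamma = math.cos(math.radians(30.0))

c = math.sqrt(a * a + b * b - 2 * a * b * cos_gamma)
sensitivity = a * b / c                 # |dc / d(cos gamma)|, about 10.2
max_cos_error = 0.005 / sensitivity     # admissible error bound for cos(gamma)

# Error actually committed when cos(gamma) is rounded to three decimals.
rounding_error = abs(round(cos_gamma, 3) - cos_gamma)
print(sensitivity, max_cos_error, rounding_error)
```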




If the result of a calculation depends on several initial values, then the problem of determining 
from the basic equation for the limiting error the accuracy that must be required of the initial data
in order to ensure a given accuracy for the result is, of course, indeterminate. There is only one 
linear equation available for the calculation of several variables. However, by means of the basic 
equation one can estimate the magnitudes of the influence of individual errors on the result to 
recognize which initial values must be chosen with particular accuracy. 


Example 5: The volume of a right circular cone is to be determined. The diameter of the circular base $d \approx 16$ and the height $h \approx 32$ are measured. How accurate must the measurements be and how many decimal places of $\pi$ must be taken into account in the calculation so that the relative error of the result does not exceed 1%? The volume of the cone is given by $V = (\pi/12)\,h d^2$. If $\Delta\pi$, $\Delta h$, $\Delta d$ are the bounds for the absolute errors of $\pi$, $h$, $d$ respectively, then a bound for the relative error of the volume is given by $\Delta V/V = \Delta\pi/\pi + \Delta h/h + 2\Delta d/d$. From the condition $\Delta V/V \le 0.01$ one obtains the inequality

\[ 0.31831\,\Delta\pi + 0.03125\,\Delta h + 0.125\,\Delta d \le 0.01. \]

It is not possible to give a unique estimation of $\Delta\pi$, $\Delta h$ and $\Delta d$ using this single relation. One can see, however, that an error in determining the diameter has an effect on the error of the result about four times as great as an error in measuring the height. The diameter must therefore be measured with particular care. An accuracy of $\Delta d = 0.1$ in the measurement of the diameter is not enough. The relative error of the result because of this error alone could be 1.25%. If the diameter is measured with an accuracy of 0.05, then $\Delta d = 0.05$ and bounds for the other two errors must satisfy the condition

\[ 0.31831\,\Delta\pi + 0.03125\,\Delta h \le 0.00375. \]

The error in the measurement of the height could then be at most 0.12, and $\pi$ would have to be free of error. If the accuracy of the height measurement is increased to $\Delta h = 0.1$, then the bound for the error of $\pi$ must satisfy $0.31831\,\Delta\pi \le 0.000625$ or $\Delta\pi \le 0.002$. One can satisfy this condition by taking the value 3.14 for $\pi$, rounded to two places of decimals. Hence one can calculate the volume of the cone with an accuracy of at least 1% if one measures the diameter with an accuracy of $\Delta d = 0.05$ and the height with an accuracy $\Delta h = 0.1$ and takes $\pi = 3.14$ (Fig.).
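The error budget of Example 5 can be followed step by step in Python. The sketch below uses the coefficients $1/\pi$, $1/h$ and $2/d$ of the relative-error formula; the variable names are chosen here for illustration.

```python
import math

d, h = 16.0, 32.0
coef_pi, coef_h, coef_d = 1 / math.pi, 1 / h, 2 / d   # 0.31831, 0.03125, 0.125

budget = 0.01                                # admissible relative error of the volume
share_d_coarse = coef_d * 0.1                # contribution of Delta d = 0.1 alone
after_d = budget - coef_d * 0.05             # what remains when Delta d = 0.05
max_dh = after_d / coef_h                    # largest Delta h if pi were exact
after_h = after_d - coef_h * 0.1             # what remains when Delta h = 0.1
max_dpi = after_h / coef_pi                  # largest admissible error of pi

print(share_d_coarse, after_d, max_dh, max_dpi)
print(abs(math.pi - 3.14) <= max_dpi)        # is pi = 3.14 accurate enough?
```

The ratio `coef_d / coef_h` equals 4, which is the factor by which an error in the diameter outweighs an equal error in the height.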


28.1-4 The exact section of the right circular cone lies between the figures A′B′C′ and A″B″C″


Errors of measurement and observation 


Errors of measurement. If the approximate value $a$ for a quantity $x$ has been obtained by measurement, the error $\varepsilon = a - x$ is called the error of measurement or the exact error. Errors
of measurement are those that are unavoidable, when one disregards gross errors, for example, 
those that can result from inattention or mishandling of the measuring instrument. They originate, 
on the one hand, from the precision of the measuring instrument (instrumental errors) and, on the 
other hand, from involuntary errors made by the person doing the measuring in making adjustments 
and taking readings (personal errors). Instrumental errors often occur as regular errors, that are 
either constant or systematic. For example, if one reads the time on an absolutely accurate clock 
that is wrongly set, then this time measurement has a constant error, namely the precise amount by 
which the clock is wrongly set. But if one knows that a clock gains five minutes in the course of a 
day, then the time read on this clock contains a systematic error. Its magnitude depends on how
much time has elapsed since the clock was last put right. Many constant and systematic errors 
must be taken into account in measurements. Because of their regularity, however, they can always 
be determined and eliminated. 


Errors of observation. The situation is different for irregular or random errors of measurement. 
They are likewise unavoidable, but it is not always possible to eliminate them. For the most part 
the personal errors of the observer must be regarded as random errors. They are then called errors 
of observation. But random errors can also be produced by uncontrollable, random influences 
during the measuring process. 




A single measurement suffices in itself to provide an approximate value $a$ for a quantity $x$, but from this single measurement nothing can be said about the random error of measurement $\varepsilon = a - x$. It can be at one time larger, at another time smaller, positive or negative. Naturally, from a knowledge of the measuring instrument and the care and experience of the observer, a bound $\Delta a$ for the measurement error can be given that will certainly not be exceeded, but will often be somewhat coarse. For this reason the measurement is carried out not only once, but a number of times, and if possible the individual measurements are undertaken by different observers. If $n$ measurements of a quantity $x$ are made, the $n$ results $a_1, a_2, \ldots, a_n$ of the measurement do not, as a rule, agree completely, and
especially not if exacting demands are placed upon the accuracy of the readings and values between 
the calibration marks of the measuring scale must be estimated. Such estimations always contain 
an element personal to the observer. Moreover, the accuracy of the adjustments always varies from 
measurement to measurement. A purely physiological source of error also arises from the fact 
that the determination of coincidences with the naked eye is not possible without ambiguity. The n 
measurements $a_i$ give rise to $n$ equations

\[ \varepsilon_1 = a_1 - x, \quad \varepsilon_2 = a_2 - x, \quad \ldots, \quad \varepsilon_n = a_n - x \]

for the $n$ exact errors of measurement $\varepsilon_i$ and the unknown exact value $x$, that is, for $(n + 1)$ unknowns. In the adjustment of data methods are developed for finding the best possible approximation $a$ for $x$ and calculating its accuracy. The possibility of a solution of this problem rests on the fact that although the errors of observation $\varepsilon_i$ in individual cases can be uncontrollably larger or smaller, positive or negative, seen as a whole they follow a strict law.


The error distribution law of Gauss. Gauss was one of the first to draw attention to the law governing errors of observation. The density function of the normal distribution of a continuous random variable is given by $p(x) = \{1/[a\sqrt{2\pi}]\} \cdot \exp[-(x - b)^2/(2a^2)]$, where $a^2 = \sigma^2$ is its variance and $\mu = b$ its expectation (see Chapter 27). The Gaussian law of errors is obtained from this as a density distribution $\varphi(\varepsilon)$ if the error of observation $\varepsilon = x - b$ is chosen as the abscissa and the relative frequency as the ordinate.


The graph of this function is bell-shaped, extending over the whole abscissa axis ($-\infty < \varepsilon < +\infty$), and has a maximum at the point $\varepsilon = 0$ and points of inflection for $\varepsilon = -\sigma$ and $\varepsilon = +\sigma$. For large values of $\sigma$ the curve $\varphi(\varepsilon)$ is flat and wide, and for small $\sigma$ it is steep and narrow. By means of the Gaussian law of errors one can calculate the probability that an error of observation lies between the bounds $-\Delta$ and $+\Delta$. This probability is

\[ P(-\Delta \le \varepsilon \le +\Delta) = \int_{-\Delta}^{+\Delta} \varphi(\varepsilon)\,d\varepsilon. \]

The bound $\Delta$ for the error is usually expressed in units of $\sigma$. One puts $\Delta = \lambda\sigma$ ($\lambda > 0$, $\sigma > 0$). Evaluation of the error integral shows that the absolute value of an error of observation $\varepsilon$ does not exceed the bound $\Delta = \lambda\sigma$ with the probability $P$ shown in the following table.

Error bound $\Delta = \lambda\sigma$ and probability $P$

    λ        P
    0.670    0.500
    1.000    0.683
    1.960    0.950
    2.000    0.954
    2.576    0.990
    3.000    0.997

If for a measuring process the standard deviation $\sigma$ of the underlying error distribution is known, then bounds $\Delta = \lambda\sigma$ can be given that will not be exceeded with a certain probability by the error of observation. Unfortunately, $\sigma$ is not usually known in practice.
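The probabilities for given bounds can be computed from the error integral, because for the Gaussian law $P(|\varepsilon| \le \lambda\sigma) = \operatorname{erf}(\lambda/\sqrt{2})$. The following Python sketch uses the standard-library function `math.erf`; the function name `prob_within` is an assumption made for illustration.

```python
import math

def prob_within(lam: float) -> float:
    """Probability that a normally distributed error does not exceed lam * sigma in magnitude."""
    return math.erf(lam / math.sqrt(2.0))

for lam in (0.6745, 1.0, 1.96, 2.0, 2.576, 3.0):
    print(f"{lam:6.4f}  {prob_within(lam):.4f}")
```

The value $\lambda = 0.6745$ is used in the loop instead of the rounded 0.670, since it is the bound that corresponds to a probability of exactly one half.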


The process of adjustment of data shows how one can estimate the value of $\sigma$ from a number of measurements of a quantity $x$, and draw conclusions about the error of observation by means of the Gaussian law of errors.


28.2. Adjustment of data 


The process of adjustment was developed essentially by Gauss and applied to the calculation 
of the orbits of comets and to triangulations, for which he himself had carried out the measurements. 
Even today these methods are indispensable in the treatment of astronomical and geodetic measurements and, moreover, are applied with advantage in all fields in which exact calculations are to
be made with the results of observation and measurement. With the help of adjustment it is possible 
from measurements containing errors to determine estimations (approximations) for the quantities 
to be measured, and to state their accuracy. 




The method of least squares 


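Likelihood function. If $n$ independent measurements $a_1, a_2, \ldots, a_n$ are made to determine the $n$ quantities $y_1, y_2, \ldots, y_n$, then the Gaussian law of errors applies to each error of observation $\varepsilon_1 = a_1 - y_1$, $\varepsilon_2 = a_2 - y_2$, ..., $\varepsilon_n = a_n - y_n$. With $d\varepsilon_i = da_i$,

\[ \varphi(a_i - y_i)\,da_i = \frac{1}{\sigma_i\sqrt{2\pi}} \exp\left[-\frac{(a_i - y_i)^2}{2\sigma_i^2}\right] da_i \quad (i = 1, \ldots, n) \]

gives the probability that the observed value lies in the differential interval $(a_i, a_i + da_i)$ or, for short, that the measured value of $y_i$ is $a_i$. Each standard deviation $\sigma_i$ ($i = 1, \ldots, n$) depends on the precision of the corresponding measurement. By the multiplication law of the probability calculus, one calculates that the probability for the measured value of $y_1$ to be $a_1$, and at the same time the measured value of $y_2$ to be $a_2$, ..., and the measured value of $y_n$ to be $a_n$, is given by

\[ P = \left(\frac{1}{\sqrt{2\pi}}\right)^n \frac{1}{\sigma_1 \sigma_2 \cdots \sigma_n} \exp\left\{-\frac{1}{2}\left[\left(\frac{a_1 - y_1}{\sigma_1}\right)^2 + \left(\frac{a_2 - y_2}{\sigma_2}\right)^2 + \cdots + \left(\frac{a_n - y_n}{\sigma_n}\right)^2\right]\right\} da_1\,da_2 \cdots da_n. \]

This equation may be written in the simpler form $P = L\,da_1\,da_2 \cdots da_n$ if $L$ denotes the likelihood function; the expression $S$ in it is usually called the sum of the squares of the errors, or simply the sum of the squares:

\[ L = \left[\frac{1}{\sqrt{2\pi}}\right]^n \frac{1}{\sigma_1} \cdot \frac{1}{\sigma_2} \cdots \frac{1}{\sigma_n} \exp[-S/2] \quad \text{with} \quad S = \sum_{i=1}^{n} \left(\frac{a_i - y_i}{\sigma_i}\right)^2 = \sum_{i=1}^{n} \left(\frac{\varepsilon_i}{\sigma_i}\right)^2. \]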


Gauss's principle of least squares, the maximum likelihood principle. If the quantities $y_1, \ldots, y_n$ are measured, then the measured values $a_1, \ldots, a_n$ are known. The exact values of the quantities $y_1, \ldots, y_n$ remain unknown. According to Gauss, estimations for the values of $y_1, \ldots, y_n$ appear plausible if the measurements $a_1, \ldots, a_n$ for them arise with the greatest probability. Hence $y_1, \ldots, y_n$ are determined in such a way that the probability $P$ is a maximum if the measured values obtained are substituted for $a_1, \ldots, a_n$. The values for $y_1, \ldots, y_n$ obtained in this way are therefore also called the most probable estimations for the quantities to be measured. If the probability $P$ assumes a maximum, then the likelihood function $L$ must also be a maximum. This principle of estimation is therefore called the maximum likelihood principle and the estimated values for $y_1, \ldots, y_n$ given by it are called maximum likelihood estimations. The likelihood function $L$ assumes a maximum precisely if the sum $S$ assumes a minimum. Thus, by the maximum likelihood principle the estimated values for the quantities $y_1, \ldots, y_n$ to be measured are determined in such a way that the sum $S$ of the squares of the errors is made a minimum. This is the method of least squares, which was developed by Gauss for the estimation of exact values from a set of observations containing errors. More precisely, one should say the method of the least sum of the squares of the errors. This method forms the basis of the entire calculus of data adjustment (or smoothing). By its application the errors of observation are more or less smoothed out.


The practice of the method of least squares. If the quantities $y_1, \ldots, y_n$ are all different and if no relations exist between them, then the method of least squares leads to the solution $y_i = a_i$ ($i = 1, \ldots, n$), that is, each quantity is estimated by its single observed value and the sum $S$ is then exactly equal to zero. There can be no adjustment of the observations. This case hardly ever occurs in practice. As a rule, either the quantities $y_1, \ldots, y_n$ have the same value that is measured repeatedly, or there exist relations between them. In the latter case (which includes the former) there are fewer unknowns $t_1, t_2, \ldots, t_k$ ($k < n$) in terms of which the quantities $y_1, \ldots, y_n$ can be expressed, $y_i = f_i(t_1, \ldots, t_k)$, for example, $y_i = c_{i1}t_1 + \cdots + c_{ik}t_k$ ($i = 1, \ldots, n$). A representation is usually possible in the form of linear equations in which the coefficients $c_{ij}$ ($j = 1, \ldots, k$) are known; for example, if a quantity is measured $n$ times, then one has only a single unknown $t$, and the equations become $y_1 = t$, $y_2 = t$, ..., $y_n = t$.


Normal equations. If the number $k$ of unknowns is smaller than the number of measurements $n$, then there are surplus measurements. In the sum $S$ of the squares of the errors the quantities $y_i$ ($i = 1, \ldots, n$) are also expressed in terms of the unknowns $t_1, \ldots, t_k$, and the partial derivatives of $S$ with respect to these unknowns are formed. To determine the minimum of $S$ these derivatives are put equal to zero. This produces a system of equations for $t_1, \ldots, t_k$, the so-called normal equations, and they are solved for the unknowns $t_1, \ldots, t_k$. With these solutions the estimations for the measured quantities $y_1, \ldots, y_n$ are calculated. One usually writes the maximum likelihood estimations for the measured quantities in the form $\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n$, to distinguish them from the unknown values $y_1, y_2, \ldots, y_n$. Likewise the values for $t_1, \ldots, t_k$ given by the normal equations are denoted by $\hat{t}_1, \hat{t}_2, \ldots, \hat{t}_k$. For the unknowns $t_1, \ldots, t_k$ other more suitable letters are often chosen for the particular problem. If the functions $f_i(t_1, \ldots, t_k)$ are linear, then the normal equations also form a system of linear equations for $t_1, \ldots, t_k$. If, however, the functions $f_i(t_1, \ldots, t_k)$ are not linear, then the solution of the normal equations can present considerable difficulties. It is then useful to linearize the problem. One first assumes rough approximate values $N_{t_1}, N_{t_2}, \ldots, N_{t_k}$ for $t_1, t_2, \ldots, t_k$. Let $t_1 = N_{t_1} + \delta t_1$, $t_2 = N_{t_2} + \delta t_2$, ..., $t_k = N_{t_k} + \delta t_k$. One expands the functions $f_i(t_1, \ldots, t_k)$ in Taylor series and breaks them off after the linear terms:

\[ f_i(t_1, \ldots, t_k) = f_i(N_{t_1}, \ldots, N_{t_k}) + \frac{\partial f_i}{\partial t_1}\,\delta t_1 + \frac{\partial f_i}{\partial t_2}\,\delta t_2 + \cdots + \frac{\partial f_i}{\partial t_k}\,\delta t_k \quad \text{for } i = 1, \ldots, n. \]

One now has only to determine the unknown corrections $\delta t_1, \ldots, \delta t_k$ by means of the method of
least squares.
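For a linear model the normal equations can be formed and solved directly. The Python sketch below treats the case of $k = 2$ unknowns with $y_i = t_1 + t_2 x_i$, where the $x_i$ are known coefficients. The model, the sample data and the function name `fit_line` are assumptions chosen for illustration and are not taken from the text.

```python
from typing import Sequence, Tuple

def fit_line(xs: Sequence[float],
             a: Sequence[float],
             sigmas: Sequence[float]) -> Tuple[float, float]:
    """Minimize S = sum(((a_i - t1 - t2*x_i) / sigma_i)**2) via the normal equations."""
    w = [1.0 / s ** 2 for s in sigmas]                 # each error is divided by sigma_i
    sw = sum(w)
    swx = sum(wi * x for wi, x in zip(w, xs))
    swxx = sum(wi * x * x for wi, x in zip(w, xs))
    swa = sum(wi * ai for wi, ai in zip(w, a))
    swxa = sum(wi * x * ai for wi, x, ai in zip(w, xs, a))
    # Normal equations (dS/dt1 = 0, dS/dt2 = 0):
    #   sw  * t1 + swx  * t2 = swa
    #   swx * t1 + swxx * t2 = swxa
    det = sw * swxx - swx * swx
    t1 = (swa * swxx - swx * swxa) / det
    t2 = (sw * swxa - swx * swa) / det
    return t1, t2

# Hypothetical observations lying exactly on y = 1 + 2x, all of equal precision.
t1, t2 = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0], [1.0, 1.0, 1.0, 1.0])
print(t1, t2)
```

With four observations and two unknowns there are two surplus measurements; for data containing errors the same function returns the values that make $S$ a minimum.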


Mean error and law of propagation of errors 


Individual measurement. The precision of a measurement is given by the standard deviation $\sigma$ appearing in the error distribution law. Instead of $\sigma$ the quantity $h = 1/(\sigma\sqrt{2})$ is often introduced as a measure of precision. It was found from the Gaussian law of errors that the magnitude of the exact error of observation lies within an error bound of $0.674\,\sigma$ with a probability of 50%, and is less than $1 \cdot \sigma$ with a probability of 68.3%. In the adjustment of data $\sigma$ is called the mean error and $0.67\,\sigma$, or more precisely $0.674\,\sigma$, the probable error.


A number of measurements. If $a_1, a_2, \ldots, a_n$ are the measured values of the quantities $y_1, y_2, \ldots, y_n$ and $h_i = 1/(\sigma_i\sqrt{2})$ is the measure of the precision of each measurement, then for the sum of the squares of the errors one has

\[ S = \sum_{i=1}^{n} \left(\frac{a_i - y_i}{\sigma_i}\right)^2 = 2\sum_{i=1}^{n} h_i^2 (a_i - y_i)^2 = 2\sum_{i=1}^{n} h_i^2 \varepsilon_i^2. \]


Weights of measurements. The individual errors $\varepsilon_i$ do not contribute equally to the formation of the sum. In the formation of $S$ a greater weight is attached to the square of the error if the measurement was made with greater precision $h_i$ than to the square of the error of a less accurate measurement with smaller precision. One can attach directly to each individual measurement a weight $p_i$, which states to what extent, in relation to the other measurements, its error of observation $\varepsilon_i$ enters into the calculation of the sum of squares of the errors. These weights must be in the ratio of the squares of their corresponding measures of precision, that is, $p_1 : p_2 : \cdots : p_n = h_1^2 : h_2^2 : \cdots : h_n^2$. As ratios, they are pure numbers and can be determined from the precision measures $h_i$ only up to an arbitrary constant factor. If one chooses this factor in such a way that the weight $p = 1$ is attached to the precision $h$, then $p_i : 1 = h_i^2 : h^2$, or $h_i^2 = p_i h^2$ for $i = 1, \ldots, n$. For measurements of the same accuracy it is convenient to attach to each measurement the weight $p_i = 1$ ($i = 1, \ldots, n$). Because the mean error $\sigma_i$ of a measurement can be calculated from the precision $h_i$ by the formula $\sigma_i = 1/(h_i\sqrt{2})$, $\sigma = 1/(h\sqrt{2})$ gives the mean error of an individual measurement with weight $p = 1$. Writing $h_i = h\sqrt{p_i}$ one obtains $\sigma_i = \sigma/\sqrt{p_i}$, $i = 1, \ldots, n$, and consequently from the mean error $\sigma$ of an individual measurement of weight 1 one can calculate the mean error of an individual measurement with arbitrary weight $p_i$. Using the weight $p_i$ and the mean error $\sigma$ of an individual measurement of weight $p = 1$, the sum $S$ of the squares of the errors takes the form

\[ S = \frac{1}{\sigma^2}\sum_{i=1}^{n} p_i (a_i - y_i)^2 = \frac{1}{\sigma^2}\sum_{i=1}^{n} p_i \varepsilon_i^2. \]


The mean square error method with weighting coefficients $h_i$ is often preferable if different error conditions exist in different parts of a data series to be fitted. Sometimes one cannot assume an additive superposition of the errors on the true values. If the perturbations act multiplicatively instead, it is useful to choose for the weights the expressions $h_i = 1/y_i$.


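Standard deviation of a linear combination of absolute errors. For the Gaussian law of errors

\[ \varphi(\varepsilon) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\varepsilon}{\sigma}\right)^2\right] \]

the following three integrals hold:

\[ (1)\ \int_{-\infty}^{+\infty} \varphi(\varepsilon)\,d\varepsilon = 1; \qquad (2)\ \int_{-\infty}^{+\infty} \varepsilon\,\varphi(\varepsilon)\,d\varepsilon = 0; \qquad (3)\ \int_{-\infty}^{+\infty} \varepsilon^2\,\varphi(\varepsilon)\,d\varepsilon = \sigma^2. \]

(1) and (3) are dealt with in the theory of probability, and (2) follows because the integrand is an odd function. If an error of observation $\varepsilon$ is a linear combination of two independent individual errors $\varepsilon_1$ and $\varepsilon_2$ of the form $\varepsilon = c_1\varepsilon_1 + c_2\varepsilon_2$, where $c_1$ and $c_2$ are constants, then using these three integrals one can show that the relation

\[ \sigma^2 = c_1^2\sigma_1^2 + c_2^2\sigma_2^2 \]

holds between the standard deviation $\sigma$ of $\varepsilon$ and the standard deviations $\sigma_1$ and $\sigma_2$ of $\varepsilon_1$ and $\varepsilon_2$, respectively.

This result can be generalized:

If an error of observation $\varepsilon$ can be expressed as a linear combination of $n$ independent individual errors $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ with standard deviations $\sigma_1, \sigma_2, \ldots, \sigma_n$ in the form $\varepsilon = c_1\varepsilon_1 + c_2\varepsilon_2 + \cdots + c_n\varepsilon_n$ (where the $c_i$ are constants), then the standard deviation $\sigma$ of $\varepsilon$ is given by

\[ \sigma^2 = c_1^2\sigma_1^2 + c_2^2\sigma_2^2 + \cdots + c_n^2\sigma_n^2. \]

Law of propagation of errors. If a function $y = f(x_1, \ldots, x_n)$ of the quantities $x_1, \ldots, x_n$ is to be calculated, and if only measured values $a_1, \ldots, a_n$ with exact errors $\varepsilon_1, \ldots, \varepsilon_n$ are available for $x_1, \ldots, x_n$, then the exact error $\varepsilon$ of $y$ is given, up to quantities of higher order in the $\varepsilon_i$, by the expression

\[ \varepsilon = \frac{\partial f}{\partial x_1}\,\varepsilon_1 + \frac{\partial f}{\partial x_2}\,\varepsilon_2 + \cdots + \frac{\partial f}{\partial x_n}\,\varepsilon_n \]

(see Calculus of errors). The exact error $\varepsilon$ of the result can thus be expressed as a linear form in the $\varepsilon_i$. It follows from this, by the above considerations, that the standard deviation $\sigma$ of $\varepsilon$ can be calculated from the standard deviations $\sigma_1, \sigma_2, \ldots, \sigma_n$ of the exact errors $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ by Gauss's law of propagation of errors:

\[ \sigma^2 = \left(\frac{\partial f}{\partial x_1}\right)^2 \sigma_1^2 + \left(\frac{\partial f}{\partial x_2}\right)^2 \sigma_2^2 + \cdots + \left(\frac{\partial f}{\partial x_n}\right)^2 \sigma_n^2. \]

Because the standard deviation $\sigma$ of $\varepsilon$ corresponds to the mean error of the result and the standard deviations $\sigma_1, \sigma_2, \ldots, \sigma_n$ are the mean errors of the measured values $a_1, a_2, \ldots, a_n$, Gauss's law of propagation of errors gives the mean error of the result in terms of the mean errors of the initial data.

This propagation law of the mean statistical error does not depend on the assumption that the input variables are normally distributed. It also holds if the input variables possess arbitrary random distributions, provided that they are statistically independent of each other.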
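Gauss's law of propagation of errors can be sketched in Python as follows. As an assumption of this sketch, the partial derivatives are approximated by central differences; the function name `propagated_mean_error` and the sample data are hypothetical.

```python
import math
from typing import Callable, Sequence

def propagated_mean_error(f: Callable[..., float],
                          a: Sequence[float],
                          sigmas: Sequence[float],
                          step: float = 1e-6) -> float:
    """sigma = sqrt(sum_i (df/dx_i)^2 * sigma_i^2), derivatives taken at the measured values a."""
    variance = 0.0
    for i in range(len(a)):
        up, down = list(a), list(a)
        up[i] += step
        down[i] -= step
        partial = (f(*up) - f(*down)) / (2 * step)   # central difference
        variance += (partial * sigmas[i]) ** 2
    return math.sqrt(variance)

# Hypothetical product of two measured values 3 and 4 with mean errors 0.1 and 0.2.
sigma_product = propagated_mean_error(lambda x, y: x * y, [3.0, 4.0], [0.1, 0.2])

# Mean of four measurements of equal precision (mean error 0.2 each).
sigma_mean = propagated_mean_error(lambda *x: sum(x) / len(x), [1.0, 1.1, 0.9, 1.0], [0.2] * 4)

print(sigma_product, sigma_mean)
```

For the product the result is $\sqrt{0.52} \approx 0.72$, noticeably smaller than the limiting error $0.1 \cdot 4 + 0.2 \cdot 3 = 1.0$ obtained by adding the magnitudes; for the mean of four measurements the mean error is halved.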

Mean error of mean value. The law of propagation of errors assumes a particularly simple form in the case when the function is the mean value of the quantities $x_1, \ldots, x_n$, $y = \bar{x} = (x_1 + x_2 + \cdots + x_n)/n$.

Because $\dfrac{\partial f}{\partial x_i} = \dfrac{1}{n}$, one obtains

\[ \sigma_{\bar{x}}^2 = \frac{1}{n^2}\left(\sigma_1^2 + \sigma_2^2 + \cdots + \sigma_n^2\right). \]

If the measured values $a_1, a_2, \ldots, a_n$ of the quantities $x_1, x_2, \ldots, x_n$ have the same precision, then $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_n^2 = \sigma_a^2$ and one obtains

\[ \sigma_{\bar{x}}^2 = \sigma_a^2/n, \qquad \sigma_{\bar{x}} = \sigma_a/\sqrt{n}. \]

The mean error of the mean of $n$ measurements made with equal precision is equal to the mean error of an individual measurement divided by $\sqrt{n}$.

Estimation of the mean error from the observations. In general, the mean errors $\sigma_i$ of the individual measurements in a measuring process are not known, but only the weights $p_i$ to be ascribed to the measurements $a_1, \ldots, a_n$ of the quantities $y_1, \ldots, y_n$. One is then faced with the problem of estimating the mean errors of the individual measurements from the available observed values $a_1, \ldots, a_n$. Because one can determine the mean error $\sigma_i$ of an individual measurement of weight $p_i$ from the mean error $\sigma$ of a measurement of weight $p = 1$ by means of the formula $\sigma_i = \sigma/\sqrt{p_i}$, it is sufficient to estimate the mean error $\sigma$. This estimation is denoted by $m$.

If the quantities $y_1, \ldots, y_n$ to be measured can be expressed in terms of exactly $k < n$ unknowns $t_1, t_2, \ldots, t_k$, then an estimate $m$ for $\sigma$ can be derived by the methods of mathematical statistics:

\[ m = \sqrt{\frac{1}{n - k}\sum_{i=1}^{n} p_i (a_i - \hat{y}_i)^2}. \]

The quantities $\hat{y}_i$ ($i = 1, \ldots, n$) are the maximum likelihood estimations of the quantities $y_1, \ldots, y_n$ to be measured.


In the adjustment it is customary to denote the mean error of the individual measurement of weight $p = 1$ by $m$ itself, and the probable error by $0.674\,m$, although $m$ is only an estimation for $\sigma$ that can itself be subject to random variations. These variations can be considerable, especially for a small number $n$ of observations. The statements about the bounds of the exact error of observation resulting from the Gaussian law of errors are therefore only approximately correct if one uses the estimate $m$ for the mean error in place of $\sigma$. Mathematical statistics shows how one can arrive at exact bounds for the exact error of observations with the help of $m$. To characterize the accuracy of measurements one must always state the (estimated) mean error $m$ of the individual observation of weight $p = 1$. From $m$ one can obtain the mean error of an individual measurement of weight $p_i$, using the formula $m_i = m/\sqrt{p_i}$, which corresponds to the formula $\sigma_i = \sigma/\sqrt{p_i}$. With the help of Gauss's law of propagation of errors one is in a position to calculate also the mean error of each quantity formed from the observed values $a_1, \ldots, a_n$. This arises, in particular, for the maximum likelihood estimation of the quantities $y_1, \ldots, y_n$ to be measured. If required, bounds for the exact error of observation, that will not be exceeded with prescribed probabilities, can be obtained from these mean errors.


Adjustment of direct measurements 


A quantity $y$ is measured directly $n$ times with the same precision. Let the measured values be
$a_1, a_2, \ldots, a_n$. The single unknown in this measuring process is $y$ ($k = 1$). Hence the equations
$y_1 = y,\ y_2 = y,\ \ldots,\ y_n = y$ hold. For individual measurements of equal precision these measurements
have the same weight $p_1 = p_2 = \cdots = p_n = 1$. The sum of squares of the errors and its derivative
with respect to the unknown $y$ are given by

$$S = \frac{1}{\sigma^2}\sum_{i=1}^{n}(a_i - y)^2, \qquad \frac{dS}{dy} = -\frac{2}{\sigma^2}\sum_{i=1}^{n}(a_i - y).$$

Equating this derivative to zero one obtains the normal equation $\sum_{i=1}^{n}(a_i - y) = 0$, whose solution
for $y$ gives the estimation $\hat{y}$.

The mean value of the individual measurements serves as an estimation for the quantity to be
measured: $\hat{y} = \frac{1}{n}\sum_{i=1}^{n} a_i$.

The approximation for the quantity $y$ to be measured is given in the form $y \approx \hat{y}\ (\pm m_{\hat{y}})$
or $y = \hat{y} \pm m_{\hat{y}}$; the mean error $m_{\hat{y}}$ of the estimation is calculated from the mean
error $m$ of the individual measurement by the relation $m_{\hat{y}} = m/\sqrt{n}$.


Example: Each of five school children measures the length $y$ of
the edge of a model cube. The estimation obtained from the results
of the measurements (see table) is

$$\hat{y} = [(12.2 + 12.1 + 12.5 + 12.3 + 12.4)/5]\ \text{in.} = 12.30\ \text{in.}$$

The mean error of the individual measurement is

$$m = \sqrt{[(12.2 - 12.3)^2 + (12.1 - 12.3)^2 + (12.5 - 12.3)^2 + (12.3 - 12.3)^2 + (12.4 - 12.3)^2]/4}\ \text{in.} = 0.158\ \text{in.}$$

This gives for $\hat{y}$ a mean error $m_{\hat{y}} = 0.158/\sqrt{5}$ in. $= 0.071$ in., and consequently the
approximation for the length of the edge of the cube is $y \approx 12.3$ in. ($\pm 0.071$ in.), or
$y = (12.30 \pm 0.07)$ in.

Results of measurement
$a_1 = 12.2$ in.
$a_2 = 12.1$ in.
$a_3 = 12.5$ in.
$a_4 = 12.3$ in.
$a_5 = 12.4$ in.

Adjustment of direct measurements of unequal precision. A quantity $y$ is measured directly $n$ times.
Let the measured values be $a_1, a_2, \ldots, a_n$. Because of the unequal precision of the measurements
they must be taken with their weights $p_1, p_2, \ldots, p_n$. The single unknown in this measuring process
is $y$ ($k = 1$) and the relations $y_1 = y,\ y_2 = y,\ \ldots,\ y_n = y$ hold. If one forms the sum $S$ of the
squares of the errors, differentiates it with respect to $y$, and equates the derivative to zero, one obtains
the normal equation

$$S = \frac{1}{\sigma^2}\sum_{i=1}^{n} p_i (a_i - y)^2 \;\longrightarrow\; \frac{dS}{dy} = -\frac{2}{\sigma^2}\sum_{i=1}^{n} p_i (a_i - y) = 0 \;\longrightarrow\; \sum_{i=1}^{n} p_i (a_i - y) = 0,$$

from which one finds the estimation $\hat{y}$.

The weighted mean of the individual measurements with weights $p_i$ serves as an estimation for the
quantity to be measured: $\hat{y} = \sum_i p_i a_i \big/ \sum_i p_i$.

The mean error $m$ of an individual measurement of weight $p = 1$ is needed to calculate easily
the mean error of other relevant quantities. From it, for example, the mean error $m_i = m/\sqrt{p_i}$
of the individual measurement $a_i$ of weight $p_i$ can be found. Further, it can be used to calculate the
mean error $m_{\hat{y}}$ for the estimation $\hat{y}$ according to the law of propagation of errors. Because
$\partial\hat{y}/\partial a_i = p_i \big/ \sum_j p_j$, it follows that

$$m_{\hat{y}} = \sqrt{\sum_{i=1}^{n} m_i^2 \left(\frac{p_i}{\sum_j p_j}\right)^2} = \frac{m}{\sqrt{\sum_{i=1}^{n} p_i}}.$$

The approximate value of the quantity to be measured can
be given in the form $y \approx \hat{y}\ (\pm m_{\hat{y}})$ or $y = \hat{y} \pm m_{\hat{y}}$.


Example: A length $l$ is measured five times, and afterwards three more
times with greater precision. Because of the more accurate measuring procedure
the measurements $a_6, a_7, a_8$ (see table) of the second group must be
given five times the weight of the first group. Hence $p_1 = p_2 = p_3 = p_4 = p_5 = 1$
and $p_6 = p_7 = p_8 = 5$. The estimation $\hat{l}$ for $l$ is given by

$$\hat{l} = \tfrac{1}{20}\,[1 \cdot (12.35 + 12.40 + 12.25 + 12.30 + 12.35) + 5 \cdot (12.37 + 12.32 + 12.34)]\ \text{in.} = 12.34\ \text{in.}$$

Since $[0.01^2 + 0.06^2 + 0.09^2 + 0.04^2 + 0.01^2 + 5 \cdot (0.03^2 + 0.02^2 + 0^2)] = 0.0200$
and $\sqrt{0.0200/7} = 0.0535$, the mean error $m$ of the individual
measurement of weight $p = 1$ has the calculated value $m = 0.0535$ in. Because
$p_1 = p_2 = p_3 = p_4 = p_5 = 1$, this is at the same time the mean error of the measurements of the first
group, $m_1 = 0.0535$ in. The mean error of the measurements of the second group is given by
$m_2 = m/\sqrt{5} = 0.0239$ in.

The mean error of the estimated value is $m_{\hat{l}} = m/\sqrt{20} = 0.0120$ in. Hence from the
measurements the approximate value of the required length $l$ is calculated to be $l \approx 12.34$ in. ($\pm 0.01$ in.).
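The weighted adjustment in this example can be reproduced with a few lines of code. This is a sketch under the assumptions of the example (weights 1 and 5, one unknown); `adjust_weighted` is a hypothetical name chosen for illustration.

```python
import math

def adjust_weighted(values, weights):
    """Weighted mean of direct measurements of unequal precision (hypothetical helper).

    Returns the weighted mean, the mean error m of a measurement of
    weight 1 (divisor n - 1), and the mean error m / sqrt(sum of weights)
    of the weighted mean.
    """
    n = len(values)
    p_sum = sum(weights)
    y_hat = sum(p * a for p, a in zip(weights, values)) / p_sum
    m = math.sqrt(sum(p * (a - y_hat) ** 2 for p, a in zip(weights, values)) / (n - 1))
    m_y = m / math.sqrt(p_sum)
    return y_hat, m, m_y

values = [12.35, 12.40, 12.25, 12.30, 12.35, 12.37, 12.32, 12.34]  # inches
weights = [1, 1, 1, 1, 1, 5, 5, 5]
l_hat, m, m_l = adjust_weighted(values, weights)
m_second_group = m / math.sqrt(5)  # mean error of a measurement of weight 5
print(f"l = {l_hat:.2f} in., m = {m:.4f} in., m_l = {m_l:.4f} in.")
```

The results agree with the example up to rounding in the last printed digit.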


If the mean errors $\sigma_i$ for measurements $a_1, a_2, \ldots, a_n$ are themselves known instead of the
weights $p_i$, then the weights $p_i$ can be determined up to a constant factor by the ratios

$$p_1 : p_2 : \cdots : p_n = (1/\sigma_1^2) : (1/\sigma_2^2) : \cdots : (1/\sigma_n^2).$$

This factor is chosen so that either the $p_1, \ldots, p_n$ are easily manageable numbers, or that
$p_1 + p_2 + \cdots + p_n = 1$. Denoting the arbitrary factor by $\lambda^2$, so that $p_i = \lambda^2/\sigma_i^2$,
the mean error of an observation of weight $p = 1$ is $\sigma = \sigma_i \sqrt{p_i} = \lambda$. Hence if $m$ is
calculated from the observed values with the prescribed weights, the result must be approximately equal
to $\lambda$, since $m$ is an estimation for $\sigma$. If there is a very large difference between $m$ and
$\lambda$, then it can be concluded that systematic errors occur in some measurements.


Results of measurement
$a_1 = 12.35$ in.
$a_2 = 12.40$ in.
$a_3 = 12.25$ in.
$a_4 = 12.30$ in.
$a_5 = 12.35$ in.
$a_6 = 12.37$ in.
$a_7 = 12.32$ in.
$a_8 = 12.34$ in.


Adjustment of observations 


Conditional observations. The angles $\alpha, \beta, \gamma$ of a triangle are to be determined by measurement.
Each angle is measured repeatedly. The measurements of the angle $\alpha$ give the values $a_1, a_2, \ldots, a_{n_1}$
($n_1$ measurements). For the angle $\beta$ the measured values are $b_1, b_2, \ldots, b_{n_2}$ ($n_2$ measurements) and
for the angle $\gamma$ they are $c_1, c_2, \ldots, c_{n_3}$ ($n_3$ measurements). Altogether $n = n_1 + n_2 + n_3$
measurements are carried out. The individual measurements are made with equal precision. Again denoting the
exact values of the quantities to be determined by measurement by $y_1, y_2, \ldots, y_{n_1}$ (measurements
of $\alpha$), $y_{n_1+1}, y_{n_1+2}, \ldots, y_{n_1+n_2}$ (measurements of $\beta$), and
$y_{n_1+n_2+1}, y_{n_1+n_2+2}, \ldots, y_{n_1+n_2+n_3}$ (measurements of $\gamma$),
these quantities can be expressed in terms of the unknowns $\alpha, \beta, \gamma$ by

$$y_1 = y_2 = \cdots = y_{n_1} = \alpha,$$
$$y_{n_1+1} = y_{n_1+2} = \cdots = y_{n_1+n_2} = \beta,$$
$$y_{n_1+n_2+1} = y_{n_1+n_2+2} = \cdots = y_{n_1+n_2+n_3} = \gamma.$$

The sum of the squares of the errors is

$$S = \frac{1}{\sigma^2}\left[\sum_{i=1}^{n_1}(a_i - \alpha)^2 + \sum_{j=1}^{n_2}(b_j - \beta)^2 + \sum_{k=1}^{n_3}(c_k - \gamma)^2\right].$$
However, the method of least squares cannot be applied immediately, because the unknowns
$\alpha, \beta, \gamma$ are subject to a condition. The sum of the angles of a triangle is 180°. Hence
$\alpha, \beta, \gamma$ must satisfy the equation of condition
$\varphi(\alpha, \beta, \gamma) = \alpha + \beta + \gamma - 180° = 0$. The adjustment of the observations
must be achieved, taking this condition into account. In such a case one speaks of the adjustment
of conditional observations. Strictly speaking, not three, but only two unknowns occur here. For if
$\alpha$ and $\beta$ have been determined, the value of $\gamma$ follows from the conditional equation. One can
deal with such a case in two ways. One can either use the conditional equation
$\varphi(\alpha, \beta, \gamma) = 0$ to express one angle, say $\gamma$, in terms of the other two, substitute this
expression for $\gamma$ in the sum $S$ and then apply the method of least squares, or one can determine directly
the minimum of $S$ subject to the side condition $\varphi(\alpha, \beta, \gamma) = 0$. To do this one makes use of
the method of Lagrangian multipliers (see Chapter 19.4. — Extreme values of functions of several variables);
that is, one determines $\alpha, \beta, \gamma$ and $\lambda$ so that the expression

$$T = S + \lambda\,\varphi(\alpha, \beta, \gamma) = \frac{1}{\sigma^2}\left[\sum_{i=1}^{n_1}(a_i - \alpha)^2 + \sum_{j=1}^{n_2}(b_j - \beta)^2 + \sum_{k=1}^{n_3}(c_k - \gamma)^2\right] + \lambda(\alpha + \beta + \gamma - 180°)$$

assumes a minimum. The quantity $\lambda$ is the Lagrangian multiplier belonging to the side condition.
The normal equations are then

$$\frac{\partial T}{\partial \alpha} = -\frac{2}{\sigma^2}\sum_{i=1}^{n_1}(a_i - \alpha) + \lambda = 0, \qquad \frac{\partial T}{\partial \beta} = -\frac{2}{\sigma^2}\sum_{j=1}^{n_2}(b_j - \beta) + \lambda = 0,$$
$$\frac{\partial T}{\partial \gamma} = -\frac{2}{\sigma^2}\sum_{k=1}^{n_3}(c_k - \gamma) + \lambda = 0, \qquad \frac{\partial T}{\partial \lambda} = \alpha + \beta + \gamma - 180° = 0.$$


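The solutions of these normal equations are

$$\hat{\alpha} = \bar{a} - K/n_1; \quad \hat{\beta} = \bar{b} - K/n_2; \quad \hat{\gamma} = \bar{c} - K/n_3; \quad \hat{\lambda} = 2K/\sigma^2$$

with the correction $K = (\bar{a} + \bar{b} + \bar{c} - 180°)/(1/n_1 + 1/n_2 + 1/n_3)$, where
$\bar{a}, \bar{b}, \bar{c}$ are the mean values of the measurements of the three angles. The mean error of the
individual measurements is calculated, as usual, from

$$m^2 = \left[\sum_{i=1}^{n_1}(a_i - \hat{\alpha})^2 + \sum_{j=1}^{n_2}(b_j - \hat{\beta})^2 + \sum_{k=1}^{n_3}(c_k - \hat{\gamma})^2\right] \Big/ (n - 2).$$

The mean errors of the estimations $\hat{\alpha}, \hat{\beta}, \hat{\gamma}$ are given by the law of propagation
of errors by

$$m_{\hat{\alpha}} = (m/n_1)\sqrt{n_1 - 1/(1/n_1 + 1/n_2 + 1/n_3)};$$
$$m_{\hat{\beta}} = (m/n_2)\sqrt{n_2 - 1/(1/n_1 + 1/n_2 + 1/n_3)};$$
$$m_{\hat{\gamma}} = (m/n_3)\sqrt{n_3 - 1/(1/n_1 + 1/n_2 + 1/n_3)}.$$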


Example: The angles $\alpha, \beta, \gamma$ are measured four, three, and four times, respectively, so that
$n_1 = 4$, $n_2 = 3$, $n_3 = 4$ (see table), and the mean values $\bar{a}$, $\bar{b}$, and $\bar{c}$ are calculated.
Because $1/(1/n_1 + 1/n_2 + 1/n_3) = 1.2$ and $\bar{a} + \bar{b} + \bar{c} - 180° = 3''$, the correction $K$ is given
by $K = 3'' \cdot 1.2 = 3.6''$, and hence the estimations $\hat{\alpha} = \bar{a} - K/4$,
$\hat{\beta} = \bar{b} - K/3$ and $\hat{\gamma} = \bar{c} - K/4$ can be calculated.

Angle α                      Angle β                      Angle γ
a_1 = 62°17'14"              b_1 = 73°20'25"              c_1 = 44°22'25"
a_2 = 62°17'11"              b_2 = 73°20'27"              c_2 = 44°22'26"
a_3 = 62°17'16"              b_3 = 73°20'23"              c_3 = 44°22'22"
a_4 = 62°17'15"                                           c_4 = 44°22'23"
mean a = 62°17'14"           mean b = 73°20'25"           mean c = 44°22'24"
estimate α = 62°17'13.1"     estimate β = 73°20'23.8"     estimate γ = 44°22'23.1"

The mean error of an individual measurement is

$$m = \sqrt{[(0.9^2 + 2.1^2 + 2.9^2 + 1.9^2) + (1.2^2 + 3.2^2 + 0.8^2) + (1.9^2 + 2.9^2 + 1.1^2 + 0.1^2)]/9}\,'' = 2.181''.$$

From this the mean errors

$$m_{\hat{\alpha}} = m \cdot 0.418 = 0.912''; \quad m_{\hat{\beta}} = m \cdot 0.447 = 0.975''; \quad m_{\hat{\gamma}} = m \cdot 0.418 = 0.912''$$

are obtained. The approximate values for the three angles calculated from the measurements are
$\alpha \approx 62°17'13.1''\ (\pm 0.91'')$; $\beta \approx 73°20'23.8''\ (\pm 0.97'')$;
$\gamma \approx 44°22'23.1''\ (\pm 0.91'')$.
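The triangle adjustment can be sketched in code as follows. This is an illustration only: the function names are hypothetical, angles are handled in seconds of arc, and the individual readings for $\beta$ and the first reading for $\gamma$ are inferred here from the residuals quoted in the example, since they are assumptions rather than values read directly from the table.

```python
import math

def dms(degrees, minutes, seconds):
    """Convert degrees/minutes/seconds to seconds of arc."""
    return degrees * 3600 + minutes * 60 + seconds

def adjust_triangle(a, b, c, total):
    """Adjust repeated measurements of three angles whose sum must equal `total`."""
    groups = [a, b, c]
    means = [sum(g) / len(g) for g in groups]
    inv = sum(1 / len(g) for g in groups)           # 1/n1 + 1/n2 + 1/n3
    K = (sum(means) - total) / inv                  # correction K
    est = [mean - K / len(g) for mean, g in zip(means, groups)]
    n = sum(len(g) for g in groups)
    ss = sum((v - e) ** 2 for g, e in zip(groups, est) for v in g)
    m = math.sqrt(ss / (n - 2))                     # two effective unknowns
    errs = [m / len(g) * math.sqrt(len(g) - 1 / inv) for g in groups]
    return est, K, m, errs

a = [dms(62, 17, s) for s in (14, 11, 16, 15)]
b = [dms(73, 20, s) for s in (25, 27, 23)]          # inferred from the residuals
c = [dms(44, 22, s) for s in (25, 26, 22, 23)]      # first value inferred
est, K, m, errs = adjust_triangle(a, b, c, dms(180, 0, 0))
print(f"K = {K:.1f}, m = {m:.3f}, mean errors = {[round(e, 3) for e in errs]}")
```

The adjusted angles in `est` sum to 180° by construction, which is a useful check on any implementation of the side condition.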


The method described in the example of the measurement of the angles of a triangle can be applied
quite generally for the adjustment of conditional observations. If the quantities $y_1, y_2, \ldots, y_n$ to
be measured can be expressed in terms of the $k$ unknowns $t_1, \ldots, t_k$, and if these unknowns satisfy $r$
independent conditional equations

$$\varphi_1(t_1, \ldots, t_k) = 0; \quad \varphi_2(t_1, \ldots, t_k) = 0; \quad \ldots; \quad \varphi_r(t_1, \ldots, t_k) = 0,$$

then the unknowns $t_1, \ldots, t_k$ and the Lagrangian multipliers $\lambda_1, \ldots, \lambda_r$ are determined
so that the expression

$$T = S + \lambda_1 \varphi_1(t_1, \ldots, t_k) + \cdots + \lambda_r \varphi_r(t_1, \ldots, t_k)$$

has a minimum. The remaining steps in the calculation again correspond completely to the method
of least squares. Because of the $r$ conditional equations there are actually not $k$ unknowns, but
only $k - r$ unknowns to determine from the observations. It is this effective number of unknowns that
must be used in the calculation of the mean error $m$ of the individual measurement of weight $p = 1$.


Adjustment of indirect observations. Often the quantities to be determined are not accessible for
direct measurement; for example, to determine the density of a body its weight $G$ and its volume $V$
are measured; $G$ and $V$ are indirect observations for the determination of the density. In the sense
of the adjustment of data, the interest in indirect observations lies not so much in finding the exact
values of the measured quantities $y_1, y_2, \ldots, y_n$, but in the determination of the unknowns
$t_1, t_2, \ldots, t_k$ in terms of which $y_1, \ldots, y_n$ can be represented. The method of least squares again
yields the estimations $\hat{t}_1, \hat{t}_2, \ldots, \hat{t}_k$ for the unknowns. From the mean error $m$ of the
individual observation, the mean error of the estimations $\hat{t}_1, \hat{t}_2, \ldots, \hat{t}_k$ can be
calculated, using the law of propagation of errors.


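Example: A ring consists of a silver-gold alloy. To determine
the weights of gold and silver the ring is weighed a number of
times with a Jolly spring balance, in air ($G$) and in water ($W$) (see
table). Letting $g_1$ denote the weight of gold, $g_2$ that of silver, and
$\varrho_1$ and $\varrho_2$ the densities of gold and silver, respectively:

$$G = g_1 + g_2; \qquad W = (1 - 1/\varrho_1)\,g_1 + (1 - 1/\varrho_2)\,g_2.$$

The sum of the squares of the errors is

$$S = \frac{1}{\sigma^2}\left[\sum_{i=1}^{4}(a_i - g_1 - g_2)^2 + \sum_{i=1}^{3}\bigl(b_i - (\varrho_1 - 1)g_1/\varrho_1 - (\varrho_2 - 1)g_2/\varrho_2\bigr)^2\right].$$

This leads to the normal equations:

$$\frac{\partial S}{\partial g_1} = -\frac{2}{\sigma^2}\left[\sum_{i=1}^{4}(a_i - g_1 - g_2) + \sum_{i=1}^{3}\bigl(b_i - (\varrho_1 - 1)g_1/\varrho_1 - (\varrho_2 - 1)g_2/\varrho_2\bigr)(\varrho_1 - 1)/\varrho_1\right] = 0,$$
$$\frac{\partial S}{\partial g_2} = -\frac{2}{\sigma^2}\left[\sum_{i=1}^{4}(a_i - g_1 - g_2) + \sum_{i=1}^{3}\bigl(b_i - (\varrho_1 - 1)g_1/\varrho_1 - (\varrho_2 - 1)g_2/\varrho_2\bigr)(\varrho_2 - 1)/\varrho_2\right] = 0.$$

The estimations given by these are

$$\hat{g}_1 = -\bar{a}\,\varrho_1(\varrho_2 - 1)/(\varrho_1 - \varrho_2) + \bar{b}\,\varrho_1\varrho_2/(\varrho_1 - \varrho_2); \qquad \bar{a} = \tfrac14\sum_{i=1}^{4} a_i; \quad \bar{b} = \tfrac13\sum_{i=1}^{3} b_i;$$
$$\hat{g}_2 = +\bar{a}\,\varrho_2(\varrho_1 - 1)/(\varrho_1 - \varrho_2) - \bar{b}\,\varrho_1\varrho_2/(\varrho_1 - \varrho_2); \qquad \hat{G} = \hat{g}_1 + \hat{g}_2;$$
$$\hat{W} = (1 - 1/\varrho_1)\,\hat{g}_1 + (1 - 1/\varrho_2)\,\hat{g}_2.$$

The mean error of the individual measurement is

$$m = \sqrt{\left[\sum_{i=1}^{4}(a_i - \bar{a})^2 + \sum_{i=1}^{3}(b_i - \bar{b})^2\right] \Big/ (7 - 2)}.$$

Because $\partial\hat{g}_1/\partial a_i = -\tfrac14\,\varrho_1(\varrho_2 - 1)/(\varrho_1 - \varrho_2)$ and
$\partial\hat{g}_1/\partial b_i = \tfrac13\,\varrho_1\varrho_2/(\varrho_1 - \varrho_2)$, the law of
propagation of errors gives

$$m_{\hat{g}_1} = m\sqrt{\tfrac14\,\varrho_1^2(\varrho_2 - 1)^2/(\varrho_1 - \varrho_2)^2 + \tfrac13\,\varrho_1^2\varrho_2^2/(\varrho_1 - \varrho_2)^2};$$
$$m_{\hat{g}_2} = m\sqrt{\tfrac14\,\varrho_2^2(\varrho_1 - 1)^2/(\varrho_1 - \varrho_2)^2 + \tfrac13\,\varrho_1^2\varrho_2^2/(\varrho_1 - \varrho_2)^2}.$$

The numerical calculation with $\varrho_1 = 19.3$ g/cm³ and $\varrho_2 = 10.5$ g/cm³ gives:

$\bar{a} = 4.01$ g; $\bar{b} = 3.71$ g; $\hat{g}_1 = 1.89$ g; $\hat{g}_2 = 2.12$ g;
$m = 0.02$ g; $m_{\hat{g}_1} = 16.89\,m = 0.34$ g; $m_{\hat{g}_2} = 17.20\,m = 0.34$ g.

The weights given by the measurements are
$g_1 = 1.89$ g ($\pm 0.34$ g) and $g_2 = 2.12$ g ($\pm 0.34$ g).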
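The closed-form estimations of the ring example lend themselves to a short numerical check. The sketch below is hypothetical code, not part of the text; it assumes four weighings in air and three in water (matching the factors 1/4 and 1/3 in the error formulas) and starts from the mean values quoted in the example, since the individual weighings are not reproduced here.

```python
import math

def ring_weights(a_mean, b_mean, rho1, rho2, n_air=4, n_water=3):
    """Estimate gold weight g1 and silver weight g2 from mean weighings.

    a_mean: mean weight in air, b_mean: mean weight in water,
    rho1, rho2: densities of gold and silver relative to water.
    Also returns the factors f1, f2 with m_g1 = f1 * m and m_g2 = f2 * m.
    """
    d = rho1 - rho2
    g1 = -a_mean * rho1 * (rho2 - 1) / d + b_mean * rho1 * rho2 / d
    g2 = a_mean * rho2 * (rho1 - 1) / d - b_mean * rho1 * rho2 / d
    f1 = math.sqrt((rho1 * (rho2 - 1) / d) ** 2 / n_air + (rho1 * rho2 / d) ** 2 / n_water)
    f2 = math.sqrt((rho2 * (rho1 - 1) / d) ** 2 / n_air + (rho1 * rho2 / d) ** 2 / n_water)
    return g1, g2, f1, f2

g1, g2, f1, f2 = ring_weights(4.01, 3.71, 19.3, 10.5)
m = 0.02  # mean error of a single weighing, as quoted in the example (grams)
print(f"g1 = {g1:.2f} g, g2 = {g2:.2f} g, m_g1 = {f1 * m:.2f} g, m_g2 = {f2 * m:.2f} g")
```

The large factors of about 17 show why the two component weights are determined far less precisely than the individual weighings: the small difference between the two weighings is amplified by $\varrho_1\varrho_2/(\varrho_1 - \varrho_2)$.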


Adjustment of relations 


The relation between a quantity $y$ and an independent variable $x$ on which it depends is often
given in the form of a linear equation $y = \alpha + \beta x$. The quantity $y$ is observed for different values
$x_1, x_2, \ldots, x_n$ of the variable $x$. Hence the exact values of the quantities to be determined by the
individual measurements are

$$y_i = \alpha + \beta x_i \quad (i = 1, \ldots, n).$$

Let the measured values again be $a_1, a_2, \ldots, a_n$, and let $p_1, p_2, \ldots, p_n$ be the weights associated
with them. Because the values $x_1, \ldots, x_n$ of the variable $x$ are given, $\alpha$ and $\beta$ are the only
unknowns in this measuring process. The estimation for $\alpha$ and $\beta$ can be made by adjustment of the data.
As in the case of indirect observations, the interest lies not so much in finding the exact values $y_i$,
but in determining the unknowns $\alpha$ and $\beta$. Because the unknowns occur as constants in a linear
equation, one speaks of an estimation of constants. The quantity $\beta$ is called a coefficient of regression.
In applying the method of least squares one again begins with the sum of squares of the errors,

$$S = \frac{1}{\sigma^2}\sum_{i=1}^{n} p_i (a_i - y_i)^2 = \frac{1}{\sigma^2}\sum_{i=1}^{n} p_i (a_i - \alpha - \beta x_i)^2.$$


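Equating to zero the partial derivatives of $S$ with respect to the unknowns $\alpha$ and $\beta$ leads to the
normal equations

$$\frac{\partial S}{\partial \alpha} = -\frac{2}{\sigma^2}\sum_{i=1}^{n} p_i (a_i - \alpha - \beta x_i) = 0; \qquad \frac{\partial S}{\partial \beta} = -\frac{2}{\sigma^2}\sum_{i=1}^{n} p_i (a_i - \alpha - \beta x_i)\,x_i = 0.$$

In a notation due to GAUSS, in which $[f] = \sum_{i=1}^{n} f_i$, the normal equations assume the form:

$$\alpha[p] + \beta[px] = [pa], \qquad \alpha[px] + \beta[px^2] = [pax].$$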


From these equations the estimations $\hat{\alpha}$ and $\hat{\beta}$ for the unknowns $\alpha$ and $\beta$ can be
found. From these one can also calculate the estimations $\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i$
$(i = 1, \ldots, n)$ for the exact values of the quantities $y_1, y_2, \ldots, y_n$. The mean error for the
individual measurements of weight $p = 1$ is given by

$$m = \sqrt{\sum_{i=1}^{n} p_i (a_i - \hat{y}_i)^2 \Big/ (n - 2)},$$

because two unknowns are to be determined from the observations. Then $m_i = m/\sqrt{p_i}$ is the mean
error of a measurement of weight $p_i$. With the help of the law of propagation of errors one calculates

$$m_{\hat{\alpha}} = m\sqrt{\frac{[px^2]}{[p][px^2] - [px]^2}}, \qquad m_{\hat{\beta}} = m\sqrt{\frac{[p]}{[p][px^2] - [px]^2}}.$$


Example: For ten plantations of pine trees of different ages the mean diameter $x$ of the
tree trunks and the mean height $y$ are determined. The measured values ($x$ in centimetres and
$y$ in metres), arranged in order of magnitude with respect to $x$ (see table), have the same
precision ($p_i = 1$; $i = 1, 2, \ldots, 10$). A graphical plotting of the connection between diameter
and height gives the representation of the dependence of $y$ upon $x$ (Fig. 28.2-1).
This relation is to be described by means of a linear equation $y = \alpha + \beta x$.
From the measured values one calculates

$[p] = 10$; $[px] = 257.0$; $[pa] = 164.0$; $[px^2] = 7430.24$; $[pxa] = 4618.26$.

It follows that

$$\hat{\beta} = (4618.26 \cdot 10 - 164.0 \cdot 257.0)/(7430.24 \cdot 10 - 257.0^2) = 0.4888;$$
$$\hat{\alpha} = 164.0/10 - \hat{\beta} \cdot 257.0/10 = 3.837.$$

The line of regression $\hat{y} = 3.837 + 0.4888x$ is drawn in the figure. Further calculations give
$\sum p_i (a_i - \hat{y}_i)^2 = 5.1522$, and the mean error of an individual measurement is $m = 0.8025$.
From this the mean errors for $\hat{\alpha}$ and $\hat{\beta}$ are found to be $m_{\hat{\alpha}} = 0.7614$ and
$m_{\hat{\beta}} = 0.0279$.

Fig. 28.2-1 The mean height plotted against the mean diameter for ten pine plantations. Line of
regression $\hat{y} = 3.837 + 0.4888x$
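The regression example can be recomputed from the Gauss sums alone. The sketch below is an illustration with hypothetical function names; it takes the five sums and the residual sum of squares quoted in the example as given inputs, because the ten measured pairs themselves are not reproduced here.

```python
import math

def fit_line_from_sums(P, PX, PA, PXX, PXA):
    """Solve alpha*[p] + beta*[px] = [pa], alpha*[px] + beta*[px^2] = [pax]."""
    D = P * PXX - PX ** 2                 # determinant of the normal equations
    beta = (P * PXA - PX * PA) / D
    alpha = (PA - beta * PX) / P
    return alpha, beta, D

def mean_errors(residual_sum, n, P, PXX, D):
    """Mean error m of a weight-1 measurement and mean errors of alpha and beta."""
    m = math.sqrt(residual_sum / (n - 2))  # two unknowns are estimated
    return m, m * math.sqrt(PXX / D), m * math.sqrt(P / D)

# Gauss sums from the pine plantation example (all weights equal to 1).
alpha, beta, D = fit_line_from_sums(10, 257.0, 164.0, 7430.24, 4618.26)
m, m_alpha, m_beta = mean_errors(5.1522, 10, 10, 7430.24, D)
print(f"y = {alpha:.3f} + {beta:.4f} x, m = {m:.4f}")
print(f"m_alpha = {m_alpha:.4f}, m_beta = {m_beta:.4f}")
```

Working from the sums rather than the raw data mirrors the hand calculation in the text and makes it easy to confirm the quoted mean errors.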


Adjustment of non-linear relations. A quantity $y$ may depend linearly on several variables
$z_0, z_1, z_2, \ldots, z_k$. The representation for $y$ then has the form

$$y = \beta_0 z_0 + \beta_1 z_1 + \beta_2 z_2 + \cdots + \beta_k z_k.$$

To determine the regression coefficients $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$, the quantity $y$ is measured
for different values $z_{0i}, z_{1i}, z_{2i}, \ldots, z_{ki}$ $(i = 1, \ldots, n)$ of the variables on which it depends.
The exact values of the quantities to be determined by the individual measurements are then

$$y_i = \beta_0 z_{0i} + \beta_1 z_{1i} + \beta_2 z_{2i} + \cdots + \beta_k z_{ki} \quad (i = 1, \ldots, n).$$

Let the measured values be $a_1, a_2, \ldots, a_n$, and let $p_1, p_2, \ldots, p_n$ be the weights associated with them.
If the number of measurements $n$ is greater than the number of unknowns $k + 1$, then there are
again surplus measurements available. The unknowns $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$ can be determined by the
method of adjustment of data. Starting from the sum of the squares of the errors

$$S = \frac{1}{\sigma^2}\sum_{i=1}^{n} p_i (a_i - \beta_0 z_{0i} - \beta_1 z_{1i} - \cdots - \beta_k z_{ki})^2$$

one obtains the normal equations

$$\begin{aligned}
\beta_0[p z_0^2] + \beta_1[p z_0 z_1] + \beta_2[p z_0 z_2] + \cdots + \beta_k[p z_0 z_k] &= [p a z_0] \\
\beta_0[p z_1 z_0] + \beta_1[p z_1^2] + \beta_2[p z_1 z_2] + \cdots + \beta_k[p z_1 z_k] &= [p a z_1] \\
&\;\;\vdots \\
\beta_0[p z_k z_0] + \beta_1[p z_k z_1] + \beta_2[p z_k z_2] + \cdots + \beta_k[p z_k^2] &= [p a z_k].
\end{aligned}$$

One solves these equations for the unknowns and finds the estimations $\hat{\beta}_0, \hat{\beta}_1, \ldots, \hat{\beta}_k$.
With these one can calculate the estimations $\hat{y}_i = \hat{\beta}_0 z_{0i} + \cdots + \hat{\beta}_k z_{ki}$
$(i = 1, \ldots, n)$ for the exact values of the measured quantities $y_1, y_2, \ldots, y_n$. The mean error of the
individual measurement of weight $p = 1$ is given by

$$m = \sqrt{\sum_{i=1}^{n} p_i (a_i - \hat{y}_i)^2 \Big/ (n - k - 1)}.$$

With the help of $m$ one can find in the usual way the mean error of an individual measurement of
weight $p_i$ and, by the law of propagation of errors, the mean error of the estimations
$\hat{\beta}_0, \ldots, \hat{\beta}_k$.
If the quantity $y$ can be expressed as a polynomial in $x$,

$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_k x^k,$$

then a non-linear relation exists between $y$ and the variable $x$ on which it depends. This is a special
case of the linear dependence on several variables dealt with above. One need only put

$$z_0 = 1, \quad z_1 = x, \quad z_2 = x^2, \quad \ldots, \quad z_k = x^k,$$
$$z_{0i} = 1, \quad z_{1i} = x_i, \quad z_{2i} = x_i^2, \quad \ldots, \quad z_{ki} = x_i^k \quad (i = 1, \ldots, n)$$

and the determination of $\beta_0, \ldots, \beta_k$ and of the mean error can proceed according to the formulae
already given.


Example: In pine plantations with different mean trunk diameters $d$ at a height of 1.30 m above
the ground the average amount of wood $V$ in each trunk is observed (see table). The determination
of the amounts of wood $V_i$ is carried out with equal precision. The relation between $V$ and $d$ is
to be adjusted by means of a cubical parabola. To simplify the calculation the variable
$x = (d - 30)/5$ is introduced as a new independent variable. The quantity $V$ is to be represented
by the cubic equation $V = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3$, whose coefficients
$\beta_0, \beta_1, \beta_2, \beta_3$ are to be determined by adjustment of the observed data. The sum of the
squares of the errors is

$$S = \frac{1}{\sigma^2}\sum_{i=1}^{9}(V_i - \beta_0 - \beta_1 x_i - \beta_2 x_i^2 - \beta_3 x_i^3)^2.$$

The normal equations have the form

$$\begin{aligned}
9\beta_0 + [x]\beta_1 + [x^2]\beta_2 + [x^3]\beta_3 &= [V], \\
[x]\beta_0 + [x^2]\beta_1 + [x^3]\beta_2 + [x^4]\beta_3 &= [xV], \\
[x^2]\beta_0 + [x^3]\beta_1 + [x^4]\beta_2 + [x^5]\beta_3 &= [x^2V], \\
[x^3]\beta_0 + [x^4]\beta_1 + [x^5]\beta_2 + [x^6]\beta_3 &= [x^3V].
\end{aligned}$$

From the observed data one calculates

$[x] = 0$, $[x^2] = 60$, $[x^3] = 0$, $[x^4] = 708$, $[x^5] = 0$, $[x^6] = 9780$,
$[V] = 8.382$, $[xV] = 19.058$, $[x^2V] = 69.090$, $[x^3V] = 227.456$.

The solutions of the normal equations are given by

$\hat{\beta}_0 = 0.64540$, $\hat{\beta}_1 = 0.296348$, $\hat{\beta}_2 = 0.042890$, $\hat{\beta}_3 = 0.0018039$.

Consequently the parabolic regression curve has the equation

$$\hat{V} = 0.64540 + 0.296348\,[(d - 30)/5] + 0.042890\,[(d - 30)/5]^2 + 0.0018039\,[(d - 30)/5]^3.$$

The adjusted values $\hat{V}_i$ are shown in the table.
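The normal equations of the cubic example can be set up and solved mechanically. The following sketch is illustrative only: `solve_linear` is a hypothetical helper, and the nine abscissae $x_i = -4, \ldots, 4$ are an assumption (they reproduce the power sums $[x^2] = 60$ and $[x^4] = 708$); the right-hand sides are the sums quoted in the example.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

xs = range(-4, 5)                                    # assumed abscissae x_i
power = [sum(x ** k for x in xs) for k in range(7)]  # [x^0], [x^1], ..., [x^6]
A = [[float(power[r + c]) for c in range(4)] for r in range(4)]
rhs = [8.382, 19.058, 69.090, 227.456]               # [V], [xV], [x^2 V], [x^3 V]
beta = solve_linear(A, rhs)
print([round(b, 6) for b in beta])
```

Because the odd power sums vanish for symmetric abscissae, the system decouples into two 2×2 systems, one for $\beta_0, \beta_2$ and one for $\beta_1, \beta_3$; this is the simplification that the substitution $x = (d - 30)/5$ is designed to achieve.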


Representation of a function with the help of simpler functions 


The method of least squares is applied in mathematics not only for the adjustment of observations.
If one wishes, for example, to find an approximation to a complicated function $y = f(x)$
in terms of simpler functions $\varphi_0(x), \varphi_1(x), \ldots, \varphi_k(x)$, one can determine the coefficients
$\beta_0, \beta_1, \ldots, \beta_k$ in the linear expression
$y = f(x) \approx \beta_0\varphi_0(x) + \beta_1\varphi_1(x) + \cdots + \beta_k\varphi_k(x)$ by the method of least
squares.

If the function $y = f(x)$ is known only at the discrete points $x_j$ $(j = 1, \ldots, n;\ n > k)$, $y_j = f(x_j)$,
then one begins with the sum of the squares of the deviations

$$S = \sum_{j=1}^{n}\,[y_j - \beta_0\varphi_0(x_j) - \beta_1\varphi_1(x_j) - \cdots - \beta_k\varphi_k(x_j)]^2.$$

If, on the other hand, the whole course of the function $y = f(x)$ is known in an interval $a \le x \le b$,
then one begins with the integral of the square of the deviation

$$S = \int_a^b [f(x) - \beta_0\varphi_0(x) - \beta_1\varphi_1(x) - \cdots - \beta_k\varphi_k(x)]^2\,dx.$$

In both cases the coefficients $\beta_0, \ldots, \beta_k$ are determined in such a way that $S$ assumes a minimum.
Denoting the sum $\sum_j \varphi_i(x_j)\varphi_l(x_j)$, or the integral $\int_a^b \varphi_i(x)\varphi_l(x)\,dx$, as the
case may be, by $[\varphi_i\varphi_l]$ and the sum $\sum_j y_j\varphi_i(x_j)$ or the integral
$\int_a^b f(x)\varphi_i(x)\,dx$ by $[y\varphi_i]$, one obtains the normal equations in the form

$$\begin{aligned}
\beta_0[\varphi_0^2] + \beta_1[\varphi_0\varphi_1] + \beta_2[\varphi_0\varphi_2] + \cdots + \beta_k[\varphi_0\varphi_k] &= [y\varphi_0], \\
\beta_0[\varphi_1\varphi_0] + \beta_1[\varphi_1^2] + \beta_2[\varphi_1\varphi_2] + \cdots + \beta_k[\varphi_1\varphi_k] &= [y\varphi_1], \\
&\;\;\vdots \\
\beta_0[\varphi_k\varphi_0] + \beta_1[\varphi_k\varphi_1] + \beta_2[\varphi_k\varphi_2] + \cdots + \beta_k[\varphi_k^2] &= [y\varphi_k].
\end{aligned}$$

From this system of linear equations $\beta_0, \ldots, \beta_k$ are to be calculated. The solution of the normal
equations takes a particularly simple form if $[\varphi_i\varphi_l]$ has the value 0 for $i \ne l$, and 1 for
$i = l$. The solutions are then $\beta_i = [y\varphi_i]$. This case occurs if the functions $\varphi_i(x)$ form an
orthonormal (normed orthogonal) system of functions. The best known systems of orthogonal functions are the
trigonometric functions $\sin nx$, $\cos nx$ $(n = 0, 1, 2, 3, \ldots)$ and the Legendre polynomials.


28.3. Approximation theory 


Every calculation with approximate values can be said to belong to approximation theory. In
the stricter sense, however, one understands by approximation theory certain mathematical procedures
that make it possible to replace complicated calculations by simpler ones. One must accept the
fact that in this way one obtains not exact solutions, but only approximate ones. On the one hand,
these approximation methods serve to save work in calculation. On the other hand, very many
mathematical problems can be solved numerically only by approximate methods; to obtain, for
example, the numerical value of a given integral that cannot be expressed in a closed form, one must
resort to approximate methods of integration. Approximation procedures have been worked out for
widely different types of mathematical problems. Each approximation method must allow an estimation
of the error incurred by its application.


Approximation methods for calculating the values of a function 


The whole of numerical calculation takes place exclusively in the field of the four basic operations 
addition, subtraction, multiplication and division. If a more complicated mathematical function 
f(x) is to be calculated, one must express it in such a way that only these four basic operations have 
to be applied. This is usually achieved by means of power series expansions (see Chapter 21.). 


Asymptotic representations for large values of the argument. If one has to calculate values of a
function $F(x)$ for very large values of the argument, one possibility is to obtain an approximation
formula by substituting $x = 1/z$ in the function, and expanding the function $f(z) = F(1/z)$ in a
Taylor series in $z = 1/x$. Because $z$ is very small, the expansion can, in general, be restricted to a
few terms. Another possibility is to determine for $F(x)$ an asymptotic representation or asymptotic
expansion. A function $\varphi(x)$ is said to be an asymptotic representation of $F(x)$ if
$\lim_{x \to \infty}\,[F(x) - \varphi(x)] = 0$. One then writes $F(x) \sim \varphi(x)$. If one sets
$F(x) = \varphi(x) + R(x)$, the remainder term $R(x)$ must satisfy $\lim_{x \to \infty} R(x) = 0$. A function
$\varphi(x)$ is often also called an asymptotic representation of $F(x)$, $F(x) \sim \varphi(x)$,
if $F(x) = \varphi(x) \cdot [1 + r(x)]$, where $\lim_{x \to \infty} r(x) = 0$, that is, if the quotient
$F(x)/\varphi(x)$ tends to 1 for large values of $x$.


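Asymptotic representation of the Gaussian error integral. By two partial integrations one obtains
for the Gaussian error integral the representation

$$\begin{aligned}
\Phi(x) &= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt = 1 - \frac{1}{\sqrt{2\pi}}\int_{x}^{\infty} e^{-t^2/2}\,dt \\
&= 1 - \frac{1}{\sqrt{2\pi}}\left\{\frac{1}{x}\,e^{-x^2/2} - \int_{x}^{\infty}\frac{1}{t^2}\,e^{-t^2/2}\,dt\right\} \\
&= 1 - \frac{1}{\sqrt{2\pi}}\left\{\frac{1}{x}\,e^{-x^2/2} - \frac{1}{x^3}\,e^{-x^2/2} + \int_{x}^{\infty}\frac{3}{t^4}\,e^{-t^2/2}\,dt\right\}.
\end{aligned}$$

With the first three terms one has already found a good asymptotic representation
$\Phi(x) \sim 1 - \frac{1}{\sqrt{2\pi}}\left\{\frac{1}{x}\,e^{-x^2/2} - \frac{1}{x^3}\,e^{-x^2/2}\right\}$.
The remainder term

$$R_3(x) = -\frac{1}{\sqrt{2\pi}}\int_{x}^{\infty}\frac{3}{t^4}\,e^{-t^2/2}\,dt$$

can be estimated from

$$|R_3(x)| < \frac{1}{\sqrt{2\pi}} \cdot \frac{3}{x^5}\int_{x}^{\infty} t\,e^{-t^2/2}\,dt = \frac{1}{\sqrt{2\pi}} \cdot \frac{3}{x^5}\,e^{-x^2/2}.$$

It tends to zero as $x \to \infty$. Even for $x = 2$ the error of the asymptotic formula is $< 5 \cdot 10^{-3}$.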
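The quality of this asymptotic representation is easy to check numerically. The sketch below is a hypothetical illustration that uses the standard-library error function `math.erf` as the reference value for $\Phi(x)$; the function names are chosen here and do not come from the text.

```python
import math

def phi_exact(x):
    """Gaussian error integral via the standard-library error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def phi_asymptotic(x):
    """Three-term asymptotic representation 1 - density * (1/x - 1/x^3)."""
    density = math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
    return 1.0 - density * (1.0 / x - 1.0 / x ** 3)

def remainder_bound(x):
    """Upper bound (1/sqrt(2 pi)) * (3/x^5) * exp(-x^2/2) for the remainder."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi) * 3.0 / x ** 5

for x in (2.0, 3.0, 4.0):
    error = abs(phi_asymptotic(x) - phi_exact(x))
    print(f"x = {x}: error = {error:.2e}, bound = {remainder_bound(x):.2e}")
```

The printed errors stay below the remainder bound and fall off rapidly as $x$ grows, which is the characteristic behaviour of an asymptotic formula.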


The Euler sum formula. If the function $F(x)$ for which an asymptotic representation is required
can be represented as a sum $F(x) = f(1) + f(2) + \cdots + f(x - 1) + f(x)$, where $f(z)$ is a given
function and $x$ a positive integer, then an asymptotic representation can be obtained from the Euler
sum formula

$$F(x) = \int_1^x f(t)\,dt + \tfrac12\,[f(x) + f(1)] + \sum_{k=1}^{n}\frac{B_{2k}}{(2k)!}\,[f^{(2k-1)}(x) - f^{(2k-1)}(1)] + R_n(x).$$

In this formula the $B_{2k}$ are the Bernoulli numbers, of which the first few are $B_2 = 1/6$, $B_4 = -1/30$,
$B_6 = 1/42$, $B_8 = -1/30$, $B_{10} = 5/66$, $B_{12} = -691/2730$ (see Chapter 21.). The remainder $R_n(x)$
can be estimated by

$$|R_n(x)| < \frac{4}{(2\pi)^{2n}}\int_1^x |f^{(2n+1)}(t)|\,dt.$$

If $R_n(x)$ tends to zero as $x \to \infty$, then by neglecting the remainder in the Euler sum formula one has
already found an asymptotic representation for $F(x)$. If, however, $R_n(x)$ tends to a limit $C_n$ as
$x \to \infty$, then an asymptotic representation is obtained by replacing $R_n(x)$ by the limit $C_n$ in the
Euler sum formula.


Asymptotic representation for the factorial function $x!$. Taking the natural logarithm of $x!$ one has
$F(x) = \ln x! = \ln 1 + \ln 2 + \cdots + \ln x$. Retaining only the terms up to the first derivative ($n = 1$)
in the Euler sum formula one obtains

$$\begin{aligned}
F(x) &= \int_1^x \ln t\,dt + \tfrac12(\ln x + \ln 1) + \tfrac16 \cdot \tfrac12\,(1/x - 1) + R_1(x) \\
&= x\ln x - x + \tfrac12\ln x + \frac{1}{12x} + 1 - \frac{1}{12} + R_1(x).
\end{aligned}$$

The remainder $R_1(x)$ can be estimated by $|R_1(x)| < (1/\pi^2)(1 - 1/x^2)$. A more precise investigation
gives $\lim_{x \to \infty} R_1(x) = C_1 = 1/12 - 1 + \ln\sqrt{2\pi}$. Replacing $R_1(x)$ by this limit in the
Euler sum formula, one obtains the asymptotic representation

$$F(x) \approx (x + 1/2)\ln x - x + \frac{1}{12x} + \ln\sqrt{2\pi}.$$

From this it follows that

$$x! = e^{F(x)} \approx \sqrt{2\pi x}\;x^x \exp[-x + 1/(12x)].$$

For very large values of $x$ the term $1/(12x)$ in the exponential can be neglected, and then Stirling's
formula is obtained:

Stirling's formula: $x! \approx \sqrt{2\pi x}\;x^x e^{-x}$
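A short numerical comparison shows how much the term $1/(12x)$ contributes. This is a sketch only; the function `stirling` and its `corrected` flag are hypothetical names introduced for illustration.

```python
import math

def stirling(x, corrected=False):
    """Stirling's approximation to x!, optionally with the 1/(12x) term."""
    exponent = -x + (1.0 / (12.0 * x) if corrected else 0.0)
    return math.sqrt(2.0 * math.pi * x) * x ** x * math.exp(exponent)

for x in (5, 10, 20):
    exact = math.factorial(x)
    rel_plain = abs(stirling(x) - exact) / exact
    rel_corr = abs(stirling(x, corrected=True) - exact) / exact
    print(f"x = {x}: relative error {rel_plain:.2e} (plain), {rel_corr:.2e} (with 1/(12x))")
```

The relative error of the plain formula behaves roughly like $1/(12x)$, so it is below one per cent from $x = 10$ onwards, while the corrected form is accurate to several more digits.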


Approximation of functions by means of polynomials 


If for a function $y = f(x)$ the function values at the arguments $x = x_0, x = x_1, \ldots, x = x_n$ are
known, these points are called basic points and the corresponding function values $y_0 = f(x_0)$,
$y_1 = f(x_1), \ldots, y_n = f(x_n)$ are called basic values. The problem consists in calculating the function
value $y = f(x)$ for an arbitrary value of $x$ lying between two adjacent basic points. If the exact
determination of $y = f(x)$ involves very extensive calculations, then one tries to calculate the required
function value $y$ approximately from the known function values $y_0, y_1, \ldots, y_n$ or, as one says, to
determine it by interpolation.


Linear interpolation. One becomes acquainted with the simplest interpolation procedure in working
with trigonometric functions or logarithms, when values are to be determined that lie between those
given in the table. The values given in the table are then the basic values and the required value is
usually found by linear interpolation. For this one needs only two basic points $x_0$ and $x_1$ with the
basic values $y_0 = f(x_0)$, $y_1 = f(x_1)$. The value $y = f(x)$ is required, where $x_0 < x < x_1$.
By linear interpolation one finds from the ratio equation
$(y - y_0)/(x - x_0) = (y_1 - y_0)/(x_1 - x_0)$ the value
$y = y_0 + (x - x_0)(y_1 - y_0)/(x_1 - x_0)$. Thus, one replaces the arc of the function $y = f(x)$ in
the interval $(x_0, x_1)$ by a straight line approximation that passes through the points $(x_0, y_0)$
and $(x_1, y_1)$ (Fig. 28.3-1).
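As a small illustration of the ratio equation, the sketch below interpolates linearly in a two-entry table of square roots. The function name and the choice of the basic points $(1.21, 1.1)$ and $(1.44, 1.2)$ are assumptions made for this example.

```python
import math

def interpolate_linear(x0, y0, x1, y1, x):
    """Value at x of the straight line through (x0, y0) and (x1, y1)."""
    return y0 + (x - x0) * (y1 - y0) / (x1 - x0)

# Two basic points of y = sqrt(x); the value at x = 1.3 is required.
approx = interpolate_linear(1.21, 1.1, 1.44, 1.2, 1.3)
exact = math.sqrt(1.3)
print(f"linear interpolation: {approx:.5f}, exact: {exact:.5f}")
```

The straight line lies below the concave curve $y = \sqrt{x}$ between the basic points, so the interpolated value is slightly too small, here by about $10^{-3}$.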


Interpolation in the wider sense. In general, all interpolation methods consist in replacing the
function $y = f(x)$ in a neighbourhood of the basic points $x_0, x_1, \ldots, x_n$ by simpler functions that
are best possible approximations to the function $y = f(x)$ in this neighbourhood. One method of
determining such approximation functions is the method of least squares. Fourier analysis and
smoothing of data, among other procedures, rest upon this. They can be applied if the number of
parameters to be determined for the specification of the approximation function is smaller than the
number of available basic points. The approximation functions determined in this way do not, in
general, pass exactly through the known points $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$.


Interpolation in the stricter sense. In what follows interpolation methods are considered in which
the approximation functions for $y = f(x)$ assume at the basic points $x_0, x_1, \ldots, x_n$ exactly the basic
values $y_0, y_1, \ldots, y_n$. Because polynomials are the simplest functions available, one tries to find
a polynomial approximation to the function $y = f(x)$ in a neighbourhood of $x_0, x_1, \ldots, x_n$. From
algebra it is well known that exactly one polynomial $P_n(x)$ of degree at most $n$ passes through $n + 1$
given points $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$ with distinct abscissae. This polynomial is chosen as an
approximation function for $y = f(x)$. There are various methods for determining it. They all lead to the
same polynomial $P_n(x)$; its external form varies, but this is the only difference between the different
interpolation methods.

Polynomial form. If the polynomial is written in the form
$P_n(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ with undetermined coefficients $a_0, a_1, \ldots, a_n$ and
with the condition that it passes through the points $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$, then the
following equations must be satisfied:

$$\begin{aligned}
y_0 &= a_0 + a_1 x_0 + a_2 x_0^2 + \cdots + a_n x_0^n \\
y_1 &= a_0 + a_1 x_1 + a_2 x_1^2 + \cdots + a_n x_1^n \\
&\;\;\vdots \\
y_n &= a_0 + a_1 x_n + a_2 x_n^2 + \cdots + a_n x_n^n
\end{aligned}$$

These are $n + 1$ equations for the determination of $a_0, a_1, \ldots, a_n$. They have a unique solution if the
basic points $x_0, x_1, \ldots, x_n$ are all distinct.

Fig. 28.3-1 Linear interpolation between two basic points

Example: The function $y = \sqrt{x}$ is to be approximated by a polynomial of degree 2 that passes
through the points $(x_0 = 1,\ y_0 = 1)$, $(x_1 = 1.21,\ y_1 = 1.1)$ and $(x_2 = 1.44,\ y_2 = 1.2)$.
Let $P_2(x) = a_0 + a_1 x + a_2 x^2$. From the equations

$$\begin{aligned}
1.0 &= a_0 + a_1 + a_2 \\
1.1 &= a_0 + 1.21\,a_1 + 1.4641\,a_2 \\
1.2 &= a_0 + 1.44\,a_1 + 2.0736\,a_2
\end{aligned}$$

one finds that $a_0 = 0.4099$, $a_1 = 0.6842$, $a_2 = -0.0941$. Substitution of the value $x = 1.3$ in the
interpolation polynomial $y = \sqrt{x} \approx 0.4099 + 0.6842x - 0.0941x^2$ gives
$\sqrt{1.3} \approx 1.1403$. The exact value for $\sqrt{1.3}$ is 1.140175.

This is a simple example of an inverse interpolation in a table of squares. Although the form
assumed in this method is very simple, the final determination of the interpolation polynomial
requires a considerable amount of calculation, especially if a large number of basic points is to be
taken into account. For this reason LAGRANGE and NEWTON chose the form of the polynomial
$P_n(x)$ rather differently and hence arrived at formulae that are simpler for calculation.
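The polynomial form amounts to solving a small system of linear equations. The following sketch is a hypothetical illustration for the three basic points of $y = \sqrt{x}$ used in the example; `fit_polynomial` is a name chosen here, and plain Gaussian elimination stands in for whatever hand method one prefers.

```python
def fit_polynomial(xs, ys):
    """Coefficients a_0..a_n of the polynomial through the points (xs[i], ys[i])."""
    n = len(xs)
    # Augmented matrix of the equations y_i = a_0 + a_1 x_i + ... + a_n x_i^n.
    M = [[x ** k for k in range(n)] + [y] for x, y in zip(xs, ys)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    a = [0.0] * n
    for i in reversed(range(n)):                      # back substitution
        s = sum(M[i][j] * a[j] for j in range(i + 1, n))
        a[i] = (M[i][n] - s) / M[i][i]
    return a

coeffs = fit_polynomial([1.0, 1.21, 1.44], [1.0, 1.1, 1.2])
value = sum(a * 1.3 ** k for k, a in enumerate(coeffs))
print([round(a, 4) for a in coeffs], round(value, 4))
```

For many basic points this system becomes badly conditioned, which is one practical reason for preferring the Lagrange or Newton forms.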


Lagrange's interpolation polynomial. LAGRANGE begins by assuming the form

$$P_n(x) = L_0(x)\,y_0 + L_1(x)\,y_1 + \cdots + L_n(x)\,y_n$$

for the approximating polynomial, in which the coefficients $L_i(x)$ of the basic values $y_i$ are
polynomials in $x$ of degree $n$. They are calculated from the basic points $x_j$ $(j = 0, 1, \ldots, n)$ alone
and take the values $L_i(x_j)$ at these points. The polynomial approximation $P_n(x)$ certainly passes exactly
through the points $(x_0, y_0), (x_1, y_1), \ldots, (x_n, y_n)$ if the polynomials $L_i(x)$ can be determined in
such a way that $L_i(x_j)$ has the value 1 for $i = j$, and 0 for $i \ne j$. The polynomials defined by
Lagrange satisfy this condition:

$$L_i(x) = \frac{(x - x_0)(x - x_1)\cdots(x - x_{i-1})(x - x_{i+1})\cdots(x - x_n)}{(x_i - x_0)(x_i - x_1)\cdots(x_i - x_{i-1})(x_i - x_{i+1})\cdots(x_i - x_n)}.$$

If one of the values $x_0, x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$ is substituted for $x$, then there is always
just one factor of the numerator that vanishes; but for $x = x_i$ the numerator is equal to the denominator.
Introducing these polynomials into the form assumed for $P_n(x)$ one obtains Lagrange's interpolation
polynomial.


Example: The parabolic approximation for the function $y = \sqrt{x}$, passing through the points $(x_0 = 1, y_0 = 1)$, $(x_1 = 1.21, y_1 = 1.1)$, $(x_2 = 1.44, y_2 = 1.2)$, is to be determined with the help of Lagrange’s interpolation method. The Lagrange polynomials are

$L_0(x) = \dfrac{(x - x_1)(x - x_2)}{(x_0 - x_1)(x_0 - x_2)} = \dfrac{(x - 1.21)(x - 1.44)}{0.0924}$

$L_1(x) = \dfrac{(x - x_0)(x - x_2)}{(x_1 - x_0)(x_1 - x_2)} = -\dfrac{(x - 1.0)(x - 1.44)}{0.0483}$

$L_2(x) = \dfrac{(x - x_0)(x - x_1)}{(x_2 - x_0)(x_2 - x_1)} = \dfrac{(x - 1.0)(x - 1.21)}{0.1012}$

From the adjacent Lagrange polynomials the polynomial approximation is

$P_2(x) = (x - 1.21)(x - 1.44)(1.0/0.0924) - (x - 1.0)(x - 1.44)(1.1/0.0483) + (x - 1.0)(x - 1.21)(1.2/0.1012)$;

$P_2(x) = (x - 1.21)(x - 1.44) \cdot 10.8225 - (x - 1.0)(x - 1.44) \cdot 22.7743 + (x - 1.0)(x - 1.21) \cdot 11.8577$.


In this form the polynomial can already be used for numerical calculations. Multiplying out the
brackets and collecting together terms in like powers of $x$ one finds again the polynomial determined
earlier, $P_2(x) = 0.4099 + 0.6842x - 0.0941x^2$.
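
As a rough sketch of Lagrange’s form in Python (the function names `lagrange_basis` and `lagrange_interpolate` are chosen here for illustration only), each $L_i$ is evaluated as a product over the other basic points and the weighted sum is formed directly:

```python
def lagrange_basis(xs, i, x):
    """Value at x of L_i, which is 1 at xs[i] and 0 at every other basic point."""
    value = 1.0
    for j, xj in enumerate(xs):
        if j != i:
            value *= (x - xj) / (xs[i] - xj)
    return value

def lagrange_interpolate(xs, ys, x):
    """P_n(x) = sum of y_i * L_i(x)."""
    return sum(y * lagrange_basis(xs, i, x) for i, y in enumerate(ys))

xs = [1.0, 1.21, 1.44]
ys = [1.0, 1.1, 1.2]
print(round(lagrange_interpolate(xs, ys, 1.3), 4))
```

Evaluating in this form avoids multiplying out the brackets, at the price of recomputing every basis polynomial when a point is added.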


Newton’s interpolation polynomial. If a polynomial of degree $n$ passing through the basic points $(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)$ has already been determined with the help of Lagrange’s formula, and a further basic point $(x_{n+1}, y_{n+1})$ is added, then if one wishes to determine a polynomial approximation of degree $n + 1$ passing through all $n + 2$ basic points, in applying the Lagrange formula the whole calculation begins anew. All the Lagrange polynomials $L_0(x), L_1(x), \dots, L_n(x)$ must be calculated again. For Newton’s interpolation method, on the other hand, in this case only one extra term has to be added. The method starts by assuming the form


$P_n(x) = b_0 + b_1(x - x_0) + b_2(x - x_0)(x - x_1) + \cdots + b_n(x - x_0)(x - x_1) \cdots (x - x_{n-1})$




for the polynomial approximation. The coefficients $b_0, b_1, \dots, b_n$ are again determined in such a way that the polynomial passes through the points $(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)$. Substituting the values $x_0, x_1, \dots, x_n$ for $x$ in Newton’s formula one obtains the stepped system of equations

$y_0 = b_0$
$y_1 = b_0 + b_1(x_1 - x_0)$
$y_2 = b_0 + b_1(x_2 - x_0) + b_2(x_2 - x_0)(x_2 - x_1)$
$\dots$
$y_n = b_0 + b_1(x_n - x_0) + b_2(x_n - x_0)(x_n - x_1) + \cdots + b_n(x_n - x_0)(x_n - x_1) \cdots (x_n - x_{n-1})$.

This system can be solved step-by-step for $b_0, b_1, \dots, b_n$. Using the so-called divided differences (see Chapter 29.2.) one can, beginning with $b_0 = y_0$, give a formula for each coefficient $b_k$. From the second, third, and following equations one finds step-by-step:

$b_1 = (y_1 - y_0)/(x_1 - x_0) = [x_1 x_0]$;

$y_2 = y_0 + [x_1 x_0](x_2 - x_0) + b_2(x_2 - x_0)(x_2 - x_1)$

or

$[x_2 x_0] = [x_1 x_0] + b_2(x_2 - x_1)$

and finally

$b_2 = [x_2 x_1 x_0], \dots, b_k = [x_k x_{k-1} \dots x_1 x_0]$.

Substituting these coefficients in the assumed formula one obtains Newton’s interpolation polynomial.


If a further basic point $(x_{n+1}, y_{n+1})$ is to be taken into account, then one simply introduces into the polynomial already calculated the term $[x_{n+1} x_n x_{n-1} \dots x_2 x_1 x_0](x - x_0)(x - x_1) \cdots (x - x_n)$ and thus obtains a polynomial of degree $n + 1$ that passes through all the points $(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n), (x_{n+1}, y_{n+1})$.

The decreasing divided differences (singly underlined in 29.2.) are used in Newton’s formula. In deriving it, however, it is not essential that the basic points are taken in the order $x_0, x_1, x_2, \dots, x_n$. If they are arranged in an arbitrary order $x_{i_0}, x_{i_1}, \dots, x_{i_n}$ and the procedure described is applied, then Newton’s interpolation polynomial in the general form is obtained.


Arranging the basic points in the order $x_n, x_{n-1}, \dots, x_1, x_0$, one obtains


$y = f(x) \approx P_n(x) = y_n + (x - x_n)[x_{n-1} x_n] + (x - x_n)(x - x_{n-1})[x_{n-2} x_{n-1} x_n] + \cdots + (x - x_n)(x - x_{n-1}) \cdots (x - x_1)[x_0 x_1 \dots x_n]$.


This formula uses the increasing divided differences (doubly underlined in the scheme). The formula
can also be rearranged in such a way that the central divided differences (in the middle of the scheme)
are used to form the polynomial approximation. Whatever form one chooses for the representation,
one always obtains the uniquely determined polynomial of degree $n$ that passes through the points
$(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)$.


Example: To determine the parabolic approximation for the function $y = \sqrt{x}$ passing through
the points $(x_0 = 1, y_0 = 1)$, $(x_1 = 1.21, y_1 = 1.1)$, $(x_2 = 1.44, y_2 = 1.2)$ by means of Newton’s
interpolation method, one first calculates the divided differences: $[x_1 x_0] = 0.476190$, $[x_2 x_1] = 0.434782$, $[x_2 x_1 x_0] = -0.0941$.


When decreasing divided differences are used, Newton’s interpolation polynomial is
$P_2(x) = 1 + (x - 1) \cdot 0.476190 - (x - 1)(x - 1.21) \cdot 0.0941$;
when increasing divided differences are used, it is
$P_2(x) = 1.2 + (x - 1.44) \cdot 0.434782 - (x - 1.44)(x - 1.21) \cdot 0.0941$
or with central divided differences
$P_2(x) = 1.1 + (x - 1.21) \cdot 0.434782 - (x - 1.21)(x - 1.44) \cdot 0.0941$.
Collecting together terms containing like powers of $x$, all three cases lead to the previously determined polynomial
$P_2(x) = 0.4099 + 0.6842x - 0.0941x^2$.
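
A minimal sketch of Newton’s method in Python follows; the names `newton_coefficients` and `newton_evaluate` are illustrative assumptions, and the extra basic point (1.69, 1.3) is added only to show that the earlier coefficients are left untouched.

```python
def newton_coefficients(xs, ys):
    """Return b_0..b_n, where b_k is the divided difference [x_k ... x_1 x_0]."""
    coeffs = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Work from the bottom up so lower-order values are still available.
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs

def newton_evaluate(xs, coeffs, x):
    """Evaluate b_0 + b_1 (x - x_0) + ... by nested multiplication."""
    result = coeffs[-1]
    for k in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[k]) + coeffs[k]
    return result

xs, ys = [1.0, 1.21, 1.44], [1.0, 1.1, 1.2]
b = newton_coefficients(xs, ys)
print(round(newton_evaluate(xs, b, 1.3), 4))

# Taking the basic points in the reverse order gives the same polynomial.
b_rev = newton_coefficients(xs[::-1], ys[::-1])
print(round(newton_evaluate(xs[::-1], b_rev, 1.3), 4))

# A further basic point only appends one coefficient.
b_ext = newton_coefficients(xs + [1.69], ys + [1.3])
print(len(b_ext), all(abs(p - q) < 1e-12 for p, q in zip(b, b_ext)))
```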


Equidistant basic points. If the basic points xo, x;, ..., X, are equidistant (with spacing h), then 
in Newton’s interpolation polynomial 


$P_n(x) = y_0 + (x - x_0)[x_1 x_0] + \cdots + (x - x_0)(x - x_1) \cdots (x - x_{n-1})[x_n x_{n-1} \dots x_1 x_0]$,


which passes through the points $(x_0, y_0), \dots, (x_n, y_n)$, the divided differences can be expressed in
terms of the simple differences if one puts $x = x_0 + th$ (see Chapter 29.2.). In the scheme in-


If one also takes account of basic points $x_{-1} = x_0 - h$, $x_{-2} = x_0 - 2h$, ..., $x_{-n} = x_0 - nh$ and
determines the Newton interpolation polynomial through the points $(x_0, y_0), (x_{-1}, y_{-1}), \dots, (x_{-n}, y_{-n})$,
one obtains


$P_n(x) = y_0 + (x - x_0)[x_0 x_{-1}] + (x - x_0)(x - x_{-1})[x_0 x_{-1} x_{-2}] + \cdots + (x - x_0)(x - x_{-1}) \cdots (x - x_{-n+1})[x_0 x_{-1} \dots x_{-n}]$.


In this polynomial too the divided differences can be replaced by simple differences. 


Finally, the basic points can be arranged in the alternating sequence $x_0, x_1, x_{-1}, x_2, x_{-2}, x_3, \dots$
The corresponding Newton interpolation polynomial is


$P_n(x) = y_0 + (x - x_0)[x_1 x_0] + (x - x_0)(x - x_1)[x_1 x_0 x_{-1}] + (x - x_0)(x - x_1)(x - x_{-1})[x_2 x_1 x_0 x_{-1}] + \cdots$.


According as the number of basic points available is even ($n = 2k$) or odd ($n = 2k + 1$), the polynomial ends with the term


$(x - x_0)(x - x_1)(x - x_{-1}) \cdots (x - x_k)[x_k x_{k-1} \dots x_0 \dots x_{-k}]$
or $(x - x_0)(x - x_1)(x - x_{-1}) \cdots (x - x_{-k})[x_{k+1} x_k \dots x_0 \dots x_{-k}]$.


If the divided differences are replaced by the ordinary differences, one obtains Gauss’s interpolation 
formula. 


This uses the differences that stand near the middle of the difference scheme. There is a second 
Gaussian formula in which the basic points are arranged in the order $x_0, x_{-1}, x_1, x_{-2}, x_2, \dots$.
STIRLING, LAPLACE, BESSEL, EVERETT and others have given further interpolation formulae, obtained 
by various choices of the basic points and by suitable combinations of the formulae of Newton 
and Gauss. 




Interpolation in tables 


If the basic points $x_0, x_1, \dots, x_n$ are arranged in order of magnitude $x_0 < x_1 < \cdots < x_n$, and if the function $y = f(x)$ is replaced in the interval $x_0 \dots x_n$ by an interpolation polynomial $P_n(x)$ of degree $n$ that passes through the points $(x_0, y_0), (x_1, y_1), \dots, (x_n, y_n)$, then the approximate error is determined by the remainder term

$R_{n+1}(x) = (x - x_0)(x - x_1) \cdots (x - x_n)\, f^{(n+1)}(\xi)/(n + 1)!$.

The quantity $\xi$ is, in general, an unknown value in the interval $(x_0, x_n)$. For example, if one determines by linear interpolation the function value $y = f(x)$ for an argument $x$ lying between two tabulated values $x_0$ and $x_0 + h$ ($h$ the spacing of the table), then the interpolation error is given by $R_2(x) = (x - x_0)(x - x_0 - h)\, y''(\xi)/2$. The product $(x - x_0)(x - x_0 - h)$ has its greatest absolute value for $x = x_0 + h/2$. The interpolation error can therefore be estimated by $|R_2(x)| \leq (h^2/8)\,|y''(x)|$, in which the maximum value of $|y''(x)|$ in the interval $(x_0, x_1)$ is to be substituted. By interpolation in a $k$-figure table the interpolation error should not exceed a half unit of the $k$th place. Hence:


In a $k$-figure table of spacing $h$ a linear interpolation is permissible if $(h^2/8)\,|y''(x)| < 0.5 \cdot 10^{-k}$.


Example: A five-figure table for $\lg \sin x$ ($0° < x < 45°$) has spacing $h = 0.01°$. Because
$y''(x) = -M/\sin^2 x$, where $M = \lg e$, $x$ must satisfy the condition $(h^2/8)(M/\sin^2 x) < 0.5 \cdot 10^{-5}$,
where $h$ is in radians. From this it follows that $\sin x > 0.01819$. This condition is satisfied for
$x > 1.04°$. Consequently one can interpolate linearly in this table for $x > 1°$.
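
The numbers in this example can be checked with a short Python sketch. The function name `linear_interpolation_ok` is a hypothetical helper introduced here; it simply encodes the criterion $(h^2/8)\,|y''| < 0.5 \cdot 10^{-k}$.

```python
import math

def linear_interpolation_ok(second_derivative_bound, spacing, figures):
    """Check (h^2/8) * max|y''| < 0.5 * 10**(-k) for a k-figure table."""
    return spacing ** 2 / 8 * second_derivative_bound < 0.5 * 10 ** (-figures)

h = math.radians(0.01)      # table spacing converted to radians
M = math.log10(math.e)      # M = lg e

# Smallest sin x for which the criterion holds: sin^2 x > h^2 * M / (8 * 0.5e-5).
sin_min = math.sqrt(h ** 2 * M / (8 * 0.5e-5))
x_min_deg = math.degrees(math.asin(sin_min))
print(round(sin_min, 5), round(x_min_deg, 2))

# |y''| = M / sin^2 x decreases on (0, 45 degrees), so the left end of a
# table interval gives the bound to be substituted.
print(linear_interpolation_ok(M / math.sin(math.radians(2.0)) ** 2, h, 5))
print(linear_interpolation_ok(M / math.sin(math.radians(0.5)) ** 2, h, 5))
```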


If linear interpolation in a table is not permissible, then interpolation formulae of higher order 
must be applied. In these one can keep the interpolation error determined by means of the remainder 
term small if one chooses the basic points in such a way that the interpolation takes place somewhere 
near the middle of the region encompassed by the basic points. This is achieved with interpolation 
formulae that work with central differences, such as Gauss’s interpolation formulae. Only if one 
has to interpolate at the beginning or at the end of a table, where the central differences are not 
available, then one falls back on Newton’s interpolation formula with decreasing or increasing 
differences. 


29. Numerical analysis 


29.1. Introduction
29.2. Interpolation and calculus of differences
29.3. Numerical models for integration and differentiation
      Numerical solution of ordinary differential equations
      Determination of roots
29.4. Search for extreme values
      One-dimensional processes
      Multi-dimensional search processes and systems of non-linear equations
29.5. Numerical methods for the solution of linear equations and inequalities
      Linear equations
      Iteration process for the solution of systems of linear equations
      Linear inequalities
29.6. Nomographical procedures
      Nomograms for two variables
      Nomograms for three variables
      Nomograms for more than three variables
29.7. Monte-Carlo methods


Mathematics in the strict sense uses the continuum of the real and complex numbers for its 
quantitative statements and also relations between numbers or objects such as vectors or matrices 
that are based on this number system. In numerical mathematics all statements must be arrived at
with the aid of rational numbers and as a rule, for example, when a digital computer is used, with 
only finitely many. In numerical analysis the formulation of a procedure for the solution of a 
mathematical problem therefore requires the setting up of a model, and this gives rise to rounding 
errors and errors of procedure. 


29.1. Introduction 


Rounding errors arise in the mapping of the real numbers that occur in a given procedure 
into the domain of permissible rational numbers. In doing this, not only the initial data of the 
problem, but also the intermediate results after every step in the calculation are falsified. Errors
of procedure occur because every transcendental operation must be replaced by a finite chain of
realizable operations, such as addition, subtraction, multiplication and division. The model procedure 
is numerically stable if its quantitative result differs from that of the exact procedure by only a 
specified small amount. 


Estimation of accuracy of numerical procedures. The estimation of the accuracy of numerical 
procedure is a very important practical problem, but is not easy to solve. Its solution depends very 
much on the type of problem in question. In applications, above all two classes of problems can 
be distinguished: approximation of a mathematical object that can be described by means of a 
formula, and approximation of a mathematical object that is known to exist, but can be determined 
only approximately by means of measurements. 


The problem of accuracy for mathematical objects given by formulae. Numbers, vectors, functions,
functionals and operators are mathematical objects given by formulae. Numbers, vectors or
functions are regarded as points of a space and approximated by sequences or, what amounts
to the same thing, by series $x = \sum_{i=0}^{\infty} c_i e_i$ with coordinate elements $e_i$ and components $c_i$.
In numerical analysis every expansion of this type must be broken off after finitely many approximating steps, that is, instead of $x$ one is satisfied with the approximation $x^* = \sum_{i=0}^{n} c_i e_i$. To this there
belong two quality criteria $Q_1$ and $Q_2$: $Q_1 = n$ is a measure of the extent of the expansion, and $Q_2$
a measure of the closeness of the approximation, for example, $|x - x^*|$ for numbers or $\sup |x(t) - x^*(t)|$
for functions.

One is interested in keeping $Q_1$ as well as $Q_2$ small. The smallness of $Q_1$, however, contradicts
that of $Q_2$, and vice versa. If the coefficients $c_i$ are calculated by means of a definite rule, for example,
by the Taylor series expansion, then for a given $Q_1 = n$ there is no room to play with for $Q_2$, but
$Q_2$ is fixed by the element $x$ to be approximated. $Q_2$, however, is not known, and one must therefore
be satisfied in practice with a more or less accurate estimate for $Q_2$, for example, with estimates
for the remainder term in series expansions.

More favourable is the situation in which $c_i$ is not obtained by a definite rule, but is determined
for a given $Q_1 = n$ in such a way that $Q_2$ becomes minimal. A rule for this determination represents
an optimal expansion process. In this case one usually finds the rule that $Q_{2\,\min}(n)$ is a monotonic
decreasing function.

If one considers, for example, orthogonal series expansions, whose coordinate elements $e_i$ with
a suitable moment operation $M$ satisfy the orthogonality requirement $M(e_i e_j) = 0$ for $i \neq j$, and
if one chooses $Q_2 = M[(x - x^*)(x - x^*)]$ as a measure of the approximation, then it is required
to find values of $c_i$ that guarantee a minimum of $Q_2$ for a given $n$. For this one obtains
$c_i = M(x e_i)/M(e_i e_i)$ and the minimal value is given by

$Q_{2\,\min} = M(xx) - \sum_{i=1}^{n} M(x e_i)^2 / M(e_i e_i)$.

Obviously $Q_{2\,\min}$ is a monotonic decreasing function of $n$. For numerical series one takes the moment
operation $M(xy) = x \cdot y$, for vector series the scalar product $M(xy) = (x, y)$, for random variables the
expectation operation, and for functions the scalar product $M(xy) = \int_a^b x(t)\,y(t)\,p(t)\,dt$ in the function
space.

In the case of functionals or operations given by formulae, the estimation of accuracy is
made difficult by the fact that an approximation to the given operation is required that produces
almost the same effect as that operation for a comprehensive set of original elements $y$.
If for the given operation (functional or operator) the relation $x = Fy$ holds, then one obtains
for the approximating operation $F^*$ the corresponding relation $x^* = F^*y$. The usual procedure
is the approximation of $F$ by a linear combination $F^* = \sum_{i=1}^{n} c_i f_i$ of certain linear basic operations $f_i$.
One can proceed here in a similar way as for the expansion of elements $x$; the difficulty lies solely
in the fact that one has to eliminate the dependence on $y$ in a suitable manner. This can be achieved
by an additional averaging over $y$ or by the method of moments, where now $\sup M((x - x^*)^2)$
is made a minimum. However, one comes up against the same difficulties here as in the Chebyshev
approximation, that is, the approximation of functions by uniformly convergent series.
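
The optimal expansion rule for orthogonal coordinate elements described above can be illustrated for vectors, taking the scalar product as the moment operation $M$. The following Python sketch uses hypothetical names (`dot`, `optimal_expansion`) and an arbitrarily chosen orthogonal basis; it shows $Q_{2\,\min}$ decreasing as $n$ grows.

```python
def dot(u, v):
    """Scalar product, used here as the moment operation M."""
    return sum(a * b for a, b in zip(u, v))

def optimal_expansion(x, basis):
    """Coefficients c_i = M(x e_i)/M(e_i e_i) and the resulting Q_2,min."""
    coeffs = [dot(x, e) / dot(e, e) for e in basis]
    q2_min = dot(x, x) - sum(dot(x, e) ** 2 / dot(e, e) for e in basis)
    return coeffs, q2_min

x = [3.0, 1.0, 2.0]
# Mutually orthogonal coordinate elements (an assumption of this example).
basis = [[1.0, 1.0, 0.0], [1.0, -1.0, 0.0], [0.0, 0.0, 1.0]]

q2_values = []
for n in range(1, 4):
    coeffs, q2 = optimal_expansion(x, basis[:n])
    q2_values.append(q2)
    print(n, coeffs, q2)
```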


The accuracy problem for given mathematical objects that have to be approximated by measurements.
Measurements necessarily always provide incomplete information about the objects to be determined. One must attempt to produce from the measurements the best possible approximation of
the mathematical object. This is often the case for problems involving a search for extreme values,
or for the problem of approximating an operation $F$ over the measurements $x_i$ for a certain set of
original elements $y_i$.




In these problems two phases must be distinguished, the learning phase and the execution phase. 
In the learning phase measurements, which are subject to a criterion of effort, are carried out to a 
certain extent. From the results of these measurements one has to decide, in the best possible manner, 
an approximation to the mathematical object, usually by means of an estimation of parameters. 
The expectation operation $M$ then converts the measured values obtained into estimates $c_i^*$ of the
parameters $c_i$. In addition to the measure of effort $Q_1$ and the measure of the approximation $Q_2$,
formulated with the operation $M$, there now enters a measure $Q_3$, which estimates the randomness
of the $c_i^*$ on the basis of the choice of tests that have led to the measured values.

In this connection one speaks of the design of experiments if one endeavours to carry out the
trial tests in such a way that the random errors introduced by the estimation of $c_i^*$ are as small as
possible. The execution phase has the character of an extrapolation. Here the model of the mathematical object with the estimated values $c_i^*$ is applied under arbitrary admissible conditions. If strong
deviations occur from the $Q_{2\,\min}$ that was obtained in the learning phase, then one can try by subsequent learning phases to improve step-by-step the model so far achieved. Such a procedure is of
special significance for statistical methods of numerical analysis, for example, for regression analysis. 


Representation of numbers. In a positional system with base $q > 1$ a real number $z$ is represented
in the form

$z = \pm(a_k q^k + a_{k-1} q^{k-1} + \cdots + a_0 q^0 + a_{-1} q^{-1} + a_{-2} q^{-2} + \cdots)$,

where each of the numbers $a_i$ is one of the non-negative integers $0, 1, \dots, q - 1$, and in which the
integer part with exponents $l \geq 0$ of $q$ is distinguished from the fractional part with $l < 0$. In digital
computers only numbers of word length $L$ can be represented, with negative powers up to the maximum
order $L$. For floating point representation the normal form is used,

$z = \pm q^e (b_{-1} q^{-1} + b_{-2} q^{-2} + \cdots + b_{-L} q^{-L})$ with $b_{-1} \neq 0$.

With $q = 10$, for example, the number $-36.12$ is represented as

$-10^2 (3 \cdot 10^{-1} + 6 \cdot 10^{-2} + 1 \cdot 10^{-3} + 2 \cdot 10^{-4}) = -10^2 \cdot 0.3612$.

In this the exponent $e$ varies, as a rule, between $-L$ and $+L$ and is also represented in the positional system with the basis $q$. With automatic calculating machines negative exponents can be
avoided by using, instead of the external number in normed form, the internal number
$z' = \pm q^{e+L} (b_{-1} q^{-1} + \cdots + b_{-L} q^{-L})$, whose exponent is too high by an amount $L$. For a given $e$,
an equidistant grid of $2 \cdot (q^L - 1)$ numbers can thus be realized; its grid distance is $q^{e-L}$. The
totality of all realizable numbers is given by $-L \leq e \leq +L$. The basic arithmetical operations may
lead to a departure from the region of admissible numbers. Once the basis $q$ has been agreed, it is
sufficient to state the sequence of numbers $z = \pm a_k a_{k-1} \dots a_0 . a_{-1} a_{-2} \dots a_{-L}$ or the sequence
of numbers consisting of the exponent $e$ and those of the normed fractional part.


Example: The decimal number 132[10] is converted by division by 2[10]: 132/2 = 66 + 0/2
with $b_0 = 0$; 66/2 = 33 + 0/2 with $b_1 = 0$; 33/2 = 16 + 1/2 with $b_2 = 1$; 16/2 = 8 + 0/2 with
$b_3 = 0$; 8/2 = 4 + 0/2 with $b_4 = 0$; 4/2 = 2 + 0/2 with $b_5 = 0$; 2/2 = 1 + 0/2 with $b_6 = 0$
and 1/2 = 0 + 1/2 with $b_7 = 1$, so that for 132 one obtains the binary number 10000100. To
obtain the decimal number again from this by conversion one divides by 10[10] = 1010[2]. This
gives

10000100/1010 = 1101 + 10/1010 with $b_0$ = 10[2] = 2[10];
1101/1010 = 1 + 11/1010 with $b_1$ = 11[2] = 3[10];
1/1010 = 0 + 1/1010 with $b_2$ = 1[2] = 1[10],

that is, $b_2 \cdot 10^2 + b_1 \cdot 10 + b_0 \cdot 10^0 = 132$.


For a real number $z$ simple recursive procedures give the binary representation of its integer
part int($z$) and of its fractional part frac($z$). Assume that
int($z$) $= r_k 2^k + r_{k-1} 2^{k-1} + \cdots + r_1 2 + r_0$
with $r_i = 0$ or $r_i = 1$. Write $w$ for int($z$). Then one obtains $r_0 = w - 2 \cdot$ int($w/2$). Now write $w$ for
int($w/2$). With this “new” value $w$ one obtains $r_1 = w - 2 \cdot$ int($w/2$). Repeating this renaming and
calculation procedure recursively gives the sequence $r_0, r_1, \dots, r_k$.
The fractional part can be written as
frac($z$) $= s_1/2 + s_2/2^2 + \cdots + s_k/2^k + \mathrm{rem}_{k+1}/2^{k+1}$
with $s_i = 0$ or $s_i = 1$ and a $(k + 1)$st remainder $\mathrm{rem}_{k+1}$.
Again write $u$ for frac($z$). Then one obtains $s_1 =$ int($2 \cdot u$). Now write $u$ for $2 \cdot u -$ int($2 \cdot u$).
With this “new” value $u$ one obtains $s_2 =$ int($2 \cdot u$). Repeating this procedure recursively gives
the sequence $s_1, s_2, \dots, s_k$.
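
The two renaming procedures translate almost line for line into code. The sketch below is only an illustration; the function names `integer_bits` and `fraction_bits` are assumptions, and the fraction is held as an exact rational so that the digits are not disturbed by rounding.

```python
from fractions import Fraction

def integer_bits(w):
    """Binary digits r_0, r_1, ... of a non-negative integer, lowest first."""
    bits = []
    while True:
        bits.append(w - 2 * (w // 2))   # r = w - 2 * int(w/2)
        w //= 2                         # rename: w <- int(w/2)
        if w == 0:
            return bits

def fraction_bits(u, k):
    """First k binary digits s_1..s_k of a fraction with 0 <= u < 1."""
    bits = []
    for _ in range(k):
        s = int(2 * u)                  # s = int(2u)
        bits.append(s)
        u = 2 * u - s                   # rename: u <- 2u - int(2u)
    return bits

print(integer_bits(132)[::-1])              # highest digit first
print(fraction_bits(Fraction(5, 8), 4))
```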




Besides the decimal system with $q = 10$, other positional systems are used. The binary system
with $q = 2$ has the advantage for computers that only two physical states, which are denoted by 0
and 1, are required for the representation. By conversion it is possible to change a number
$z[q] = a_k q^k + a_{k-1} q^{k-1} + \cdots + a_0 q^0$ from the $q$-representation into the $p$-representation
$z[p] = b_l p^l + b_{l-1} p^{l-1} + \cdots + b_0 p^0$, on dividing $z[q]$ by $p[q]$. One can see from $z[p]$ that this gives
an integer $g_1$ and a fractional part $r_1/p[q]$, to which the value $b_0 p^0$ corresponds in the representation
$z[p]$, so that $r_1 = b_0$. Similarly from $g_1/p[q]$ one obtains the coefficient $b_1$, and then $b_2, b_3, \dots, b_l$.
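
Conversion by repeated division can be sketched as follows. The digit lists, the names `divide_digits` and `convert_base`, and the use of ordinary machine integers for the single-digit steps are assumptions of this illustration; the long division itself is carried out on the base-$q$ digits, as in the text.

```python
def divide_digits(digits, q, p):
    """Divide a base-q digit list (highest digit first) by p.

    Returns the quotient as a base-q digit list and the remainder as an integer.
    """
    quotient, remainder = [], 0
    for d in digits:
        remainder = remainder * q + d
        quotient.append(remainder // p)
        remainder %= p
    # Strip leading zeros, keeping at least one digit.
    while len(quotient) > 1 and quotient[0] == 0:
        quotient.pop(0)
    return quotient, remainder

def convert_base(digits, q, p):
    """Base-p digits (highest first) of the number given by its base-q digits."""
    out = []
    while digits != [0]:
        digits, r = divide_digits(digits, q, p)   # remainder r is the next b_i
        out.append(r)
    return out[::-1] or [0]

print(convert_base([1, 3, 2], 10, 2))
print(convert_base([1, 0, 0, 0, 0, 1, 0, 0], 2, 10))
```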


Interval calculus. In order to determine the inaccuracies in the results of the basic arithmetical operations when the initial data are rounded off, Moore developed the interval calculus; every number $z$ is replaced by the smallest closed rational interval $[a, b]$ in which it must lie. As a generalization of the operations on numbers, which in every possible case can be visualized on the number line, one obtains the arithmetical operations on intervals. If $z_1 = [a, b]$ and $z_2 = [c, d]$, then $z_1 + z_2 = [a + c, b + d]$ and, since $-[c, d] = [-d, -c]$, $z_1 - z_2 = [a - d, b - c]$ (Fig.).

29.1-1 Intervals for $z_1 + z_2$, for $-z_2$ and for $z_1 - z_2$

For operations of the second kind one obtains

$z_1 \cdot z_2 = [a, b] \cdot [c, d] = [\min(ac, ad, bc, bd), \max(ac, ad, bc, bd)]$

and $z_1/z_2 = [a, b]/[c, d] = [\min(a/c, a/d, b/c, b/d), \max(a/c, a/d, b/c, b/d)]$. In this way interval
functions can be defined, in particular, rational interval functions, which must replace other

... function $f(x) = x^k$ with positive integral exponent $k$ is defined as the ... itself; thus, one obtains for $x = [x_1, x_2]$ as interval for the power function $[\min(x_1^k, x_1^{k-1} x_2, \dots, x_2^k), \max(x_1^k, x_1^{k-1} x_2, \dots, x_2^k)]$.
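
The interval operations above fit naturally into a small class. The following Python sketch is illustrative only: the class name `Interval` is an assumption, exact rationals stand in for the "rational intervals" of the text, and the check that the divisor interval does not contain 0 is an added safeguard that the text does not state explicitly.

```python
from fractions import Fraction

class Interval:
    """Closed interval [lo, hi] with the four basic operations."""

    def __init__(self, lo, hi):
        self.lo, self.hi = Fraction(lo), Fraction(hi)

    def __add__(self, other):
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __neg__(self):
        return Interval(-self.hi, -self.lo)

    def __sub__(self, other):
        return self + (-other)

    def __mul__(self, other):
        products = [a * b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(min(products), max(products))

    def __truediv__(self, other):
        if other.lo <= 0 <= other.hi:
            raise ZeroDivisionError("divisor interval contains 0")
        quotients = [a / b for a in (self.lo, self.hi) for b in (other.lo, other.hi)]
        return Interval(min(quotients), max(quotients))

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

z1, z2 = Interval(1, 2), Interval(-3, 4)
print(z1 + z2, z1 - z2, z1 * z2, z1 / Interval(2, 4))
```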


29.2. Interpolation and calculus of differences 


The basic idea of interpolation is to replace a function $f(x)$, of which the values $y_i = f(x_i)$ are
given at finitely many points $x_1, x_2, \dots, x_n$ or possibly the derivatives $y_i^{(j)} = f^{(j)}(x_i)$ up to the order
$m_i \geq j$, by an approximation consisting of a superposition $\sum_j A_j \varphi_j(x) = f^*(x) \approx f(x)$ of standard
functions $\varphi_j(x)$. The form of the functions $\varphi_j(x)$ for a given class of functions and the superposition
coefficients $A_j$ must be uniquely determined from the given values in such a way that the replacement
function $f^*(x)$ assumes the values $y_i$ or $y_i^{(j)}$, $i = 1, \dots, n$, at the given points $\{x_i\}$. For points other
than those of the set $\{x_i\}$ the goodness of the interpolation formula is given by an estimation of the
remainder $R = f(x) - f^*(x)$. If $x$ lies in the interior of the smallest interval that contains the set
$\{x_i\}$, $i = 1, \dots, n$, then one speaks of a proper interpolation, and if $x$ lies outside this interval,
of an extrapolation.

Many numerical procedures can be carried out more readily by the superposition of standard 
functions, for example, the determination of roots, integration, differentiation, or integration of 
differential equations. As standard functions polynomials are often used. The Taylor and Lagrange 
interpolations are two limiting cases of practical importance. 


Taylor interpolation. The value of the function $f(x_1)$ and the values of the derivatives $f^{(j)}(x_1)$
with $m_1 \geq j$ are given at only one point $x_1$. The standard polynomials are $\varphi_j(x) = (x - x_1)^j/j!$
for $j = 0, 1, \dots, m_1$ and the superposition coefficients are $A_j = f^{(j)}(x_1)$. The remainder is
$R = f^{(m_1+1)}(\xi) \cdot (x - x_1)^{m_1+1}/(m_1 + 1)!$, where $\xi$ is a point between $x_1$ and $x$ (see Chapter 21.).
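
As a hedged illustration of Taylor interpolation, the following Python sketch takes $f(x) = \sin x$ and $x_1 = 0$ (both chosen here, not prescribed by the text) and compares the actual error with the remainder estimate, using the fact that every derivative of the sine is bounded by 1.

```python
import math

def taylor_sin(x, m):
    """Taylor polynomial of sin about x_1 = 0 using derivatives up to order m."""
    # The derivatives of sin at 0 cycle through 0, 1, 0, -1.
    derivs = [0.0, 1.0, 0.0, -1.0]
    return sum(derivs[j % 4] * x ** j / math.factorial(j) for j in range(m + 1))

def remainder_bound(x, m):
    """|R| <= |x|^(m+1) / (m+1)!, since |f^(m+1)| <= 1 for the sine."""
    return abs(x) ** (m + 1) / math.factorial(m + 1)

x = 0.5
for m in (1, 3, 5):
    approx = taylor_sin(x, m)
    error = abs(math.sin(x) - approx)
    print(m, approx, error <= remainder_bound(x, m))
```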


Example: approximation of $\sin x$ by finitely many terms of its Taylor series expansion (the remainder of this example is illegible in the source).

Lagrangian interpolation. At the points $x_1, x_2, \dots, x_n$ only the values of the function $y_i = f(x_i)$
are given. The standard polynomials are $\varphi_j(x) = \dfrac{\Phi_n(x)}{(x - x_j)\,\Phi_n'(x_j)}$ with $\Phi_n(x) = \prod_{i=1}^{n} (x - x_i)$, and
the superposition coefficients are $A_j = y_j = f(x_j)$. The remainder is $R = \dfrac{f^{(n)}(\xi)}{n!}\,\Phi_n(x)$, where $\xi$
is a point in the smallest interval that contains the set $\{x_i\}$, $i = 1, \dots, n$.


Divided differences. If for a function $y = f(x)$ the values $y_0 = f(x_0), \dots, y_n = f(x_n)$ at the $n + 1$ basic points $x_0, x_1, \dots, x_n$ are given, then divided differences (also called gradients) of orders 0 to $n$ can be determined:


0. $[x_0] = y_0$, $[x_1] = y_1$, ..., $[x_n] = y_n$;

1. $[x_i x_k] = (y_i - y_k)/(x_i - x_k)$, for example, $[x_1 x_0] = (y_1 - y_0)/(x_1 - x_0)$,
$[x_n x_{n-1}] = (y_n - y_{n-1})/(x_n - x_{n-1})$;

2. $[x_i x_j x_k] = ([x_i x_j] - [x_j x_k])/(x_i - x_k)$, for example,
$[x_2 x_1 x_0] = ([x_2 x_1] - [x_1 x_0])/(x_2 - x_0)$;

$(r + 1)$. $[x_i x_j x_{k_1} \dots x_{k_r}] = ([x_i x_j x_{k_1} \dots x_{k_{r-1}}] - [x_j x_{k_1} \dots x_{k_r}])/(x_i - x_{k_r})$.


All divided differences are symmetric in their arguments; for example,
$[x_i x_k] = (y_i - y_k)/(x_i - x_k) = (y_k - y_i)/(x_k - x_i) = [x_k x_i]$,
$[x_i x_j x_k] = ([x_k x_j] - [x_j x_i])/(x_k - x_i) = [x_k x_j x_i]$,
and similarly it can be shown that
$[x_i x_j x_k] = [x_i x_k x_j] = [x_j x_i x_k] = [x_j x_k x_i] = [x_k x_i x_j]$.
One can therefore rearrange the arguments in divided differences in an arbitrary way. For calculating
divided differences the following gradient scheme is used.


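Scheme for the calculation of divided differences

$$\begin{array}{c|c|ccc}
x_0 & y_0 & & & \\
 & & [x_1 x_0] & & \\
x_1 & y_1 & & [x_2 x_1] - [x_1 x_0] & [x_2 x_1 x_0] \\
 & & [x_2 x_1] & & \\
x_2 & y_2 & & [x_3 x_2] - [x_2 x_1] & [x_3 x_2 x_1] \\
 & & [x_3 x_2] & & \\
x_3 & y_3 & & & \\
\vdots & \vdots & & \vdots & \vdots \\
 & & & [x_n x_{n-1}] - [x_{n-1} x_{n-2}] & [x_n x_{n-1} x_{n-2}]
\end{array}$$

Single underlining indicates the value of decreasing divided differences, double underlining those of
increasing divided differences, and in the middle of the scheme lie the central divided differences.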


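Example: $y = x^3$; basic points: $x_0 = 1$, $x_1 = 3$, $x_2 = 4$. Gradient scheme:

$$\begin{array}{c|c|cc}
x_0 = 1 & y_0 = 1 & & \\
 & & [x_1 x_0] = 13 & \\
x_1 = 3 & y_1 = 27 & & [x_2 x_1 x_0] = 8 \\
 & & [x_2 x_1] = 37 & \\
x_2 = 4 & y_2 = 64 & &
\end{array}$$

Result: $[x_0] = 1$, $[x_1 x_0] = 13$, $[x_2 x_1 x_0] = 8$.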
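
The gradient scheme can be built column by column. In the following Python sketch the function name `divided_difference_scheme` is an assumption; column $r$ holds the divided differences of order $r$, so the first entry of each column is the decreasing divided difference used in Newton’s formula.

```python
def divided_difference_scheme(xs, ys):
    """Return columns of the gradient scheme: columns[r][i] = [x_{i+r} ... x_i]."""
    columns = [list(ys)]
    for r in range(1, len(xs)):
        prev = columns[-1]
        columns.append([(prev[i + 1] - prev[i]) / (xs[i + r] - xs[i])
                        for i in range(len(xs) - r)])
    return columns

xs = [1, 3, 4]
ys = [x ** 3 for x in xs]
scheme = divided_difference_scheme(xs, ys)
for order, column in enumerate(scheme):
    print(order, column)
```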


Properties of the divided differences. If $f(x) = f_1(x) + f_2(x)$ and one denotes the divided differences
of the functions $f_1(x)$ and $f_2(x)$ by attaching the suffixes 1 and 2, then

$[x_n x_{n-1} \dots x_0] = [x_n x_{n-1} \dots x_0]_1 + [x_n x_{n-1} \dots x_0]_2$.

For a function $f(x) = c f_1(x)$, where $c$ is a constant,

$[x_n x_{n-1} \dots x_0] = c\,[x_n x_{n-1} \dots x_0]_1$.

For divided differences an independent expression can be given that does not presuppose the formation of divided differences of lower order:

$[x_n x_{n-1} \dots x_0] = \sum_{i=0}^{n} \dfrac{f(x_i)}{(x_i - x_0) \cdots (x_i - x_{i-1})(x_i - x_{i+1}) \cdots (x_i - x_n)}$.

If the function $f(x)$ is $n$ times continuously differentiable in an interval containing the basic points
$x_0, x_1, \dots, x_n$, then its divided differences can be expressed in terms of the derivatives of the function: $[x_n x_{n-1} \dots x_0] = f^{(n)}(\xi)/n!$. Here $\xi$ denotes a suitable point in the interval containing the basic
points. It follows from this that all divided differences of the $n$th order of a polynomial function
of the $n$th degree are equal.




Difference table for equidistant basic points. The scheme for the calculation of divided differences
becomes particularly simple if the basic points $x_0, x_1, \dots, x_n$, arranged in order of magnitude, are
chosen to be equidistant. Then for a given spacing of width $h$, $x_1 = x_0 + h$, $x_2 = x_0 + 2h$, ...,
$x_n = x_0 + nh$. For the differences of the arguments in the left-hand part of the scheme one obtains
$x_{i+k} - x_i = kh$. Further, if one introduces

the first difference $y_{i+1} - y_i = \Delta^1 y_{i+1/2}$,
the second difference $\Delta^1 y_{i+1} - \Delta^1 y_i = \Delta^2 y_{i+1/2}$,
...
the $n$th difference $\Delta^{n-1} y_{i+1} - \Delta^{n-1} y_i = \Delta^n y_{i+1/2}$

(the index $i$ running over integers or half-integers as appropriate), then a simple relation exists between these ordinary differences and the divided differences. The adjacent formula holds.

$[x_{i+n} x_{i+n-1} \dots x_i] = \dfrac{\Delta^n y_{i+n/2}}{n!\,h^n}$

It follows from this that not only all the $n$th divided differences, but also all the $n$th ordinary differences of a polynomial function of degree $n$
are equal among themselves.

If an auxiliary variable $t$ is introduced by the equation $x = x_0 + th$ and if the basic points
$x_{-1} = x_0 - h$, $x_{-2} = x_0 - 2h$, ... preceding the basic point $x_0$ are also taken into account, then one
obtains the following difference table.


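Difference table

$$\begin{array}{r|c|c|cccccc}
t & x = x_0 + th & y & \Delta^1 & \Delta^2 & \Delta^3 & \Delta^4 & \Delta^5 & \Delta^6 \\
\hline
-2 & x_{-2} & y_{-2} & & \Delta^2 y_{-2} & & \cdot & & \\
 & & & \Delta^1 y_{-3/2} & & \Delta^3 y_{-3/2} & & & \\
-1 & x_{-1} & y_{-1} & & \Delta^2 y_{-1} & & \Delta^4 y_{-1} & & \cdot \\
 & & & \Delta^1 y_{-1/2} & & \Delta^3 y_{-1/2} & & \Delta^5 y_{-1/2} & \\
0 & x_0 & y_0 & & \Delta^2 y_0 & & \Delta^4 y_0 & & \Delta^6 y_0 \\
 & & & \Delta^1 y_{1/2} & & \Delta^3 y_{1/2} & & \Delta^5 y_{1/2} & \\
1 & x_1 & y_1 & & \Delta^2 y_1 & & \Delta^4 y_1 & & \cdot \\
 & & & \Delta^1 y_{3/2} & & \Delta^3 y_{3/2} & & & \\
2 & x_2 & y_2 & & \Delta^2 y_2 & & & &
\end{array}$$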

In calculating the difference table one begins with a table of the basic values, forms the difference 
series for the function values y,;, and thus obtains the first differences. For these one again forms 
the difference series to obtain the second differences, and so on. 
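
The construction of the difference table is easily sketched in Python. The function name `difference_table` and the sample function $y = x^3$ with $h = 1$ are choices made for this illustration; the third differences come out equal, as stated for a polynomial of degree 3.

```python
def difference_table(ys):
    """Return [y values, first differences, second differences, ...]."""
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([b - a for a, b in zip(prev, prev[1:])])
    return table

ys = [t ** 3 for t in range(-2, 4)]   # y = x^3 at x = -2, -1, ..., 3 (h = 1)
table = difference_table(ys)
for order, row in enumerate(table):
    print(order, row)
```

Here each third difference equals $3!\,h^3 = 6$ times the (constant) third divided difference 1.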

Aitken’s interpolation. The recursive interpolation process due to AITKEN can be applied with
advantage if the point $x$ at which the function $f(x)$ is to be interpolated is given. By linear interpolation (Fig.) one first determines $A_{i,i+1}$ in such a way that $A_{i,i+1}(x_{i+1} - x_i) = y_i(x_{i+1} - x) + y_{i+1}(x - x_i)$, that is, by

$A_{i,i+1} = \dfrac{1}{x_{i+1} - x_i} \det \begin{pmatrix} y_i & x_i - x \\ y_{i+1} & x_{i+1} - x \end{pmatrix}$.

By adding a further point $x_{i+2}$ one arrives at a polynomial $A_{i,i+1,i+2}$ of
the second degree for the interpolation and hence to a higher accuracy:

$A_{i,i+1,i+2} = \dfrac{1}{x_{i+2} - x_i} \det \begin{pmatrix} A_{i,i+1} & x_i - x \\ A_{i+1,i+2} & x_{i+2} - x \end{pmatrix}$.


One proceeds in this way as required until the values of successive 
approximations differ only by an amount lying within the limits of ac- 
curacy that are in any case to be expected. 
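
A minimal sketch of the recursive process in Python, assuming the scheme is built over neighbouring points exactly as in the two determinant formulae (the function name `aitken` and the tolerance parameter `tol` are assumptions of this example):

```python
def aitken(xs, ys, x, tol=1e-6):
    """Successive interpolation values at x; stops when two agree within tol."""
    # row[i] holds the polynomial through points i .. i+order, evaluated at x.
    row = list(ys)
    estimates = [row[0]]
    for order in range(1, len(xs)):
        row = [((xs[i + order] - x) * row[i] - (xs[i] - x) * row[i + 1])
               / (xs[i + order] - xs[i])
               for i in range(len(xs) - order)]
        estimates.append(row[0])
        if abs(estimates[-1] - estimates[-2]) < tol:
            break
    return estimates

# Square-root table values used earlier in 28.3, interpolated at x = 1.3.
estimates = aitken([1.0, 1.21, 1.44], [1.0, 1.1, 1.2], 1.3)
print([round(e, 6) for e in estimates])
```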


29.2-1 Aitken’s interpolation; $A_{i,i+1}(x_{i+1} - x_i) = y_i(x_{i+1} - x) + y_{i+1}(x - x_i)$


29.3. Numerical models for integration and differentiation 


Numerical integration. By a quadrature formula one understands a model with values of the given function $f(x)$ or its derivatives at basic points $x_m$, in which the superposition coefficients $A_{mj}$ or the basic points $x_m$
are to be determined by the demands on the goodness of the model. In a quadrature formula of
amplitude type, the basic points are given and the $A_{mj}$ are to be determined; in one of argument type,
on the other hand, the $A_{mj}$ are given and suitable basic points are to be determined. The basic
points need not belong to the interval of integration.


Example: A quadrature formula of the third order that provides exact values for polynomials
up to the third degree is $\int_a^b f(x)\,dx \approx [(b - a)/2]\,\{f(a) + f(b) - [(b - a)/6]\,[f'(b) - f'(a)]\}$.
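
The exactness of this formula for cubics can be checked numerically. In the Python sketch below the function name `corrected_trapezoid` and the particular cubic are chosen for illustration only.

```python
def corrected_trapezoid(f, df, a, b):
    """(b-a)/2 * { f(a) + f(b) - (b-a)/6 * [f'(b) - f'(a)] }."""
    h = b - a
    return h / 2 * (f(a) + f(b) - h / 6 * (df(b) - df(a)))

f = lambda x: 2 * x ** 3 - x ** 2 + 4 * x - 1
df = lambda x: 6 * x ** 2 - 2 * x + 4
antiderivative = lambda x: x ** 4 / 2 - x ** 3 / 3 + 2 * x ** 2 - x

approx = corrected_trapezoid(f, df, 0.0, 2.0)
exact = antiderivative(2.0) - antiderivative(0.0)
print(approx, exact)
```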


Interpolation quadrature formulae are obtained by replacing the function to be integrated in
the interval of integration $[a, b]$ by a Lagrangian interpolation polynomial of the $n$th degree with
the basic points $x_i = a + ih$, $i = 0, 1, 2, \dots, n$ (so that $h = (b - a)/n$). By the goodness requirement that the quadrature formula shall be exact for every polynomial of the $n$th degree, one obtains the model

$\int_a^b f(x)\,dx \approx \sum_{i=0}^{n} A_i f(a + ih)$ with $A_i = \dfrac{b - a}{n} \cdot \dfrac{(-1)^{n-i}}{i!\,(n - i)!} \int_0^n \dfrac{t^{[n+1]}}{t - i}\,dt$,

where $t^{[n+1]} = t(t - 1)(t - 2) \cdots (t - n)$. If $n = 2m - 1 - d$, with $d = 0$ for odd values of $n$ and
$d = 1$ for even values of $n$, then the error of the model amounts to

$R_n = -M_n (b - a)^{2m+1} f^{(2m)}(\xi)$, where $a < \xi < b$

and

$M_1 \approx 8.333 \cdot 10^{-2}$, $M_2 \approx 3.472 \cdot 10^{-4}$, $M_3 \approx 1.543 \cdot 10^{-4}$,
$M_4 \approx 5.167 \cdot 10^{-7}$, $M_5 \approx 2.910 \cdot 10^{-7}$, $M_6 \approx 6.379 \cdot 10^{-10}$,
$M_7 \approx 3.912 \cdot 10^{-10}$, $M_8 \approx 5.133 \cdot 10^{-13}$.

The trapezoidal rule and Simpson’s rule are often used. 
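
As an illustration, the coefficients of the interpolation quadrature formula can be computed exactly with rational arithmetic. The following is a minimal sketch, not a definitive implementation; the function names `newton_cotes_weights` and `quadrature` are assumptions made for this example. For n = 1 it reproduces the trapezoidal rule and for n = 2 Simpson's rule.

```python
from fractions import Fraction
from math import factorial


def newton_cotes_weights(n):
    """Return the coefficients A_i / (b - a), i = 0..n, as exact fractions."""
    weights = []
    for i in range(n + 1):
        # Coefficients (lowest power first) of prod_{j != i} (t - j) = t^[n+1] / (t - i)
        poly = [Fraction(1)]
        for j in range(n + 1):
            if j == i:
                continue
            new = [Fraction(0)] * (len(poly) + 1)
            for k, c in enumerate(poly):
                new[k + 1] += c   # contribution of c * t^k * t
                new[k] -= j * c   # contribution of c * t^k * (-j)
            poly = new
        # Exact integral of the polynomial from 0 to n
        integral = sum(c * Fraction(n) ** (k + 1) / (k + 1) for k, c in enumerate(poly))
        sign = -1 if (n - i) % 2 else 1
        weights.append(Fraction(sign, factorial(i) * factorial(n - i)) * integral / n)
    return weights


def quadrature(f, a, b, n):
    """Apply the n-th degree interpolation quadrature formula once on [a, b]."""
    h = (b - a) / n
    return (b - a) * sum(
        float(w) * f(a + i * h) for i, w in enumerate(newton_cotes_weights(n))
    )


print(newton_cotes_weights(1))  # trapezoidal rule
print(newton_cotes_weights(2))  # Simpson's rule
```

For f(x) = x² on [0, 1] the trapezoidal value differs from the exact integral by −1/6, in agreement with $R_1 = -M_1 (b - a)^3 f''(\xi)$ and $M_1 = 1/12$.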


Numerical differentiation. As for numerical integration, one interpolates the function f(x) in a neighbourhood of the point $x_0$ at which $f'(x_0)$ is to be formed by an interpolation polynomial $P_n(x)$ of degree n, and uses the model $f'(x_0) \approx P_n'(x_0)$. For linear interpolation between the points $x_0 - h$ and $x_0 + h$, for example, one obtains

$$f'(x_0) \approx [f(x_0 + h) - f(x_0 - h)]/(2h).$$

Through the Taylor expansion

$$f(x_0 + h) = f(x_0) + f'(x_0)\,h + f''(x_0)\,(h^2/2!) + \cdots + f^{(n)}(x_0)\,(h^n/n!) + \cdots = e^{hD} f(x_0)$$

of the function f(x) about the point $x_0$ one obtains the following universally applicable model of the differential operator

$$D = \frac{d}{dx} = (1/h) \ln(1 + h\Delta) = (1/h)\,[(h\Delta) - (h\Delta)^2/2 + (h\Delta)^3/3 - (h\Delta)^4/4 + \cdots],$$

where $\Delta$ is the difference operator with $\Delta f(x) = [f(x + h) - f(x)]/h$. Truncating this series after the nth power of the operator $h\Delta$ results in a model which for polynomials up to and including degree n gives exact values for the derivative; for it turns out to be identical with the model obtained by the substitution $f'(x_0) \to P_n'(x_0)$.




Example: For $P(x) = a_0 + a_1 x + a_2 x^2$, $(h\Delta) P(x) = P(x + h) - P(x) = a_1 h + a_2(2xh + h^2)$ and $(h\Delta)^2 P(x) = a_1 h + a_2[2(x + h)h + h^2] - a_1 h - a_2(2xh + h^2) = 2a_2 h^2$. Hence the differentiation operator becomes

$$(1/h)\,[(h\Delta) - (h\Delta)^2/2]\,P(x) = a_1 + 2a_2 x + a_2 h - a_2 h = a_1 + 2a_2 x = P'(x).$$
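
Both models of the derivative can be tried out numerically. The sketch below is an illustration only; the function names are hypothetical. The second function truncates the logarithmic series after the nth power of the forward-difference operator.

```python
def central_difference(f, x0, h):
    """Model f'(x0) ~ [f(x0 + h) - f(x0 - h)] / (2h)."""
    return (f(x0 + h) - f(x0 - h)) / (2 * h)


def forward_series_derivative(f, x0, h, n):
    """Model (1/h) * sum_{k=1..n} (-1)^(k+1) * (h*Delta)^k f(x0) / k."""
    diffs = [f(x0 + j * h) for j in range(n + 1)]  # f(x0), f(x0 + h), ..., f(x0 + n*h)
    total = 0.0
    for k in range(1, n + 1):
        # k-th forward differences; diffs[0] is (h*Delta)^k f(x0)
        diffs = [diffs[j + 1] - diffs[j] for j in range(len(diffs) - 1)]
        total += (-1) ** (k + 1) * diffs[0] / k
    return total / h


def P(x):
    return 1 + 2 * x + 3 * x ** 2  # P'(x) = 2 + 6x


print(forward_series_derivative(P, 0.7, 0.1, 2), central_difference(P, 0.7, 0.1))
```

For a polynomial of degree 2 both calls reproduce P'(0.7) = 6.2 up to rounding, as the example in the text predicts for the series truncated after the second power.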

Numerical solution of ordinary differential equations


Many practical problems require the integration of a differential equation $y' = f(x, y)$ with a continuous function f(x, y), starting from the initial value $y(x_0) = y_0$ and proceeding in the direction of x increasing (or the integration of a system of such equations). With $x_i = x_0 + ih$ one determines approximately the values $y_i = y(x_i)$. In many numerical methods for the solution the integral equation $y(x) = y_0 + \int_{x_0}^{x} f(t, y(t))\,dt$ equivalent to the differential equation is used.

Adams’ method. For the sequence of solutions $\{y_i\}$ the integral equation gives the increment $y_{i+1} - y_i = \int_{x_i}^{x_{i+1}} f(t, y(t))\,dt$; in this the function f(t, y(t)) is replaced by a Newton’s interpolation polynomial of degree n and the required integration is performed exactly. With $f_i = f(x_i, y_i)$ and $\Delta f_i = f_{i+1} - f_i$ one obtains two formulae.

The Adams’ extrapolation formula uses the basic points $x_i, x_{i-1}, \ldots, x_{i-n}$ and gives

$$y_{i+1} - y_i = h \sum_{\nu=0}^{n} B_\nu\, \Delta^\nu f_{i-\nu}, \quad\text{where } B_0 = 1, \quad B_\nu = \frac{1}{\nu!} \int_0^1 u(u + 1)(u + 2) \cdots (u + \nu - 1)\,du;$$

for example, $B_1 = 0.5$; $B_2 = 0.41\dot{6}$; $B_3 = 0.375$; $B_4 = 0.3486$; $B_5 = 0.329861$.

The Adams’ interpolation formula uses the basic points $x_{i+1}, x_i, \ldots, x_{i-n+1}$ and gives

$$y_{i+1} - y_i = h \sum_{\nu=0}^{n} B_\nu^*\, \Delta^\nu f_{i-\nu+1}, \quad\text{where } B_0^* = 1 \text{ and } B_\nu^* = B_\nu - B_{\nu-1}.$$

To start this process one has to work first of all with interpolation polynomials of lower order, or the starting values must already be known with sufficient accuracy: the values $y_0, y_1, \ldots, y_n$ for the extrapolation formula and the values $y_0, y_1, \ldots, y_{n-1}$ for the interpolation formula. In the interpolation formula the subsequent value is given by a transcendental equation which can be solved by iteration.
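
A minimal sketch of the explicit formula, the one that uses only the points $x_i, x_{i-1}, \ldots, x_{i-n}$, might look as follows. The function name, the argument layout and the choice n = 3 are assumptions made for illustration; the starting values are taken from the exact solution of the test problem y' = y.

```python
import math
from fractions import Fraction

# Coefficients B_0 .. B_4 of the explicit formula
B = [Fraction(1), Fraction(1, 2), Fraction(5, 12), Fraction(3, 8), Fraction(251, 720)]


def adams_extrapolation(f, xs, ys, h, steps, n=3):
    """Continue the solution of y' = f(x, y) from n + 1 equally spaced starting points."""
    xs, ys = list(xs), list(ys)
    fs = [f(x, y) for x, y in zip(xs, ys)]
    for _ in range(steps):
        column = fs[-(n + 1):]  # f_{i-n}, ..., f_i
        increment = 0.0
        for nu in range(n + 1):
            # column[-1] is the nu-th difference that ends at the latest point x_i
            increment += float(B[nu]) * column[-1]
            column = [column[j + 1] - column[j] for j in range(len(column) - 1)]
        ys.append(ys[-1] + h * increment)
        xs.append(xs[-1] + h)
        fs.append(f(xs[-1], ys[-1]))
    return xs, ys


h = 0.1
start_x = [i * h for i in range(4)]
start_y = [math.exp(x) for x in start_x]  # exact starting values for y' = y
xs, ys = adams_extrapolation(lambda x, y: y, start_x, start_y, h, steps=7)
print(xs[-1], ys[-1], math.e)
```

In practice the starting values would come from a one-step method such as the Runge-Kutta method rather than from a known solution.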


Runge-Kutta method. In the integration of the differential equation $y' = f(x, y)$ the interpolation theory or the difference calculus requires a finite set $\{y_i\}$ of stored backward values. The Runge-Kutta method, on the other hand, uses only the properties of the continuous function f(x, y); hence it requires a smaller store and moreover, apart from providing an independent solution, can also be used to calculate starting values for other methods. Furthermore, the spacing h can be changed during the course of the calculation, and numerical stability is more readily secured than in the case of methods based on the calculus of differences. In the Runge-Kutta method an approximate value $\bar{y}_{i+1}$ is calculated for

$$y_{i+1} = y_i + \int_{x_i}^{x_i + h} f(t, y(t))\,dt, \quad\text{where } \bar{y}_{i+1} = y_i + k$$

and the increment k is determined via the intermediate steps:

$$k_0 = 0, \quad k_j = h f\Big(x_i + a_j h,\; y_i + \sum_{m=0}^{j-1} b_{jm} k_m\Big) \quad\text{for } j = 1, 2, \ldots, r$$

and $k = \sum_{j=1}^{r} g_j k_j$. The parameters $a_j$, $b_{jm}$, and $g_j$ contained in these steps are determined by the goodness required of the method. It is customary to demand that $\varphi(h) = \int_{x_i}^{x_i + h} f(t, y(t))\,dt - \sum_{j=1}^{r} g_j k_j$ possesses for h = 0 vanishing derivatives $\varphi(0) = \varphi'(0) = \cdots = \varphi^{(l)}(0)$ of as high an order l as possible, and l is called the order of the Runge-Kutta method.

The Runge-Kutta method is an improvement of the Euler method $\bar{y}_{i+1} = y_i + h f(x_i, y_i)$, which has an unfavourable error propagation.




Examples of proven Runge-Kutta methods of order 4:

1. $\bar{y}_{i+1} = y_i + (1/6)\,(k_1 + 2k_2 + 2k_3 + k_4)$, where
$k_1 = h f(x_i, y_i)$, $k_2 = h f(x_i + h/2,\; y_i + k_1/2)$,
$k_3 = h f(x_i + h/2,\; y_i + k_2/2)$, $k_4 = h f(x_i + h,\; y_i + k_3)$.

2. $\bar{y}_{i+1} = y_i + (1/3)\,(k_1/2 + 3k_2/2 + k_3/2 + k_4/2)$, where
$k_1 = h f(x_i, y_i)$, $k_2 = h f(x_i + h/2,\; y_i + k_1/2)$,
$k_3 = h f(x_i + h/2,\; y_i - k_1/2 + k_2)$, $k_4 = h f(x_i + h,\; y_i + k_2/2 + k_3/2)$.
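
Both fourth-order schemes translate directly into code. The following sketch is illustrative; the function names and the test equation y' = y are assumptions chosen for the example.

```python
import math


def rk4_step_first(f, x, y, h):
    """One step of the first scheme (weights 1/6, 2/6, 2/6, 1/6)."""
    k1 = h * f(x, y)
    k2 = h * f(x + h / 2, y + k1 / 2)
    k3 = h * f(x + h / 2, y + k2 / 2)
    k4 = h * f(x + h, y + k3)
    return y + (k1 + 2 * k2 + 2 * k3 + k4) / 6


def rk4_step_second(f, x, y, h):
    """One step of the second scheme (weights 1/6, 3/6, 1/6, 1/6)."""
    k1 = h * f(x, y)
    k2 = h * f(x + h / 2, y + k1 / 2)
    k3 = h * f(x + h / 2, y - k1 / 2 + k2)
    k4 = h * f(x + h, y + k2 / 2 + k3 / 2)
    return y + (k1 / 2 + 3 * k2 / 2 + k3 / 2 + k4 / 2) / 3


def integrate(step, f, x0, y0, h, n):
    """Apply n steps of the given one-step method and return the final value."""
    x, y = x0, y0
    for _ in range(n):
        y = step(f, x, y, h)
        x += h
    return y


y_first = integrate(rk4_step_first, lambda x, y: y, 0.0, 1.0, 0.1, 10)
y_second = integrate(rk4_step_second, lambda x, y: y, 0.0, 1.0, 0.1, 10)
print(y_first, y_second, math.e)
```

For the linear test equation both schemes give the same value up to rounding, because each reproduces the Taylor series of the exponential function up to the fourth power of h.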


Determination of roots 


From the range of values of a function y = f(x) one obtains indications of those arguments x 
for which f(x) = 0. Accordingly if one determines an interpolation polynomial, then its roots are 
approximate values for the required roots. One tries to use these values as a step in the estimation 
by an iterative process. 


Method of false position. Two different points $y_1 = f(x_1)$ and $y_2 = f(x_2)$ are connected by a straight line, which serves as an interpolation polynomial of the first degree (Fig.). From $(x_2 - x)/(x_2 - x_1) = (y_2 - y)/(y_2 - y_1)$ one obtains
$P_1(x) = y = y_1 (x - x_2)/(x_1 - x_2) + y_2 (x - x_1)/(x_2 - x_1)$ with the root $x'$ for $P_1(x) = 0$.


29.3-1 Method of false position


29.3-2 Method of false position: fixed point method


If one determines $y' = f(x')$ by substitution, then iterative processes can be found. According to the fixed point method (Fig.) one fixes a pair of values $x_2 = x_f$ and $y_2 = y_f$ as the fixed point. Writing $x_1 = x_i$, $y_1 = y_i$ and $x' = x_{i+1}$, $y' = y_{i+1}$, one obtains $x_{i+1} = (x_f y_i - x_i y_f)/(y_i - y_f)$. According to the secant method one writes $x_2 \to x_{i-1}$, $y_2 \to y_{i-1}$, $x_1 \to x_i$, $y_1 \to y_i$ and $x' \to x_{i+1}$ and obtains $x_{i+1} = (y_i x_{i-1} - y_{i-1} x_i)/(y_i - y_{i-1}) = [x_{i-1} f(x_i) - x_i f(x_{i-1})]/[f(x_i) - f(x_{i-1})]$. This process (Fig.) converges very rapidly to $x^*$, where $f(x^*) = 0$, provided that the following conditions are satisfied for f(x): from the lower bound $m_1$ of $|f'(x)|$ and the upper bounds $M_1$ of $|f'(x)|$ and $M_2$ of $|f''(x)|$ one calculates $K = M_2 M_1^2/(2m_1^3)$, and the inequalities $K|x^* - x_0| < 1$ and $K|x^* - x_1| < 1$ must then hold.


29.3-3 Method of false position: secant method

Example: For the root $x^* = 2.0945514815423\ldots$ of the function $f(x) = x^3 - 2x - 5$ one obtains by the fixed point method with $x_f = 2$ and $x_1 = 3$ the following approximations:
$x_2 = 2.0588235294$; $x_3 = 2.0965586362$; $x_4 = 2.0944405193$;
$x_5 = 2.0945576218$; $x_6 = 2.0945511399$; $x_7 = 2.0945515006$.
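
The two iterative forms of the method of false position can be sketched as follows. The function names are hypothetical; the example function and starting points are those of the text, with the point x = 2 held fixed in the fixed point method.

```python
def fixed_point_false_position(f, x_fixed, x1, steps):
    """Iterate with one end of the chord held at (x_fixed, f(x_fixed))."""
    y_fixed = f(x_fixed)
    xs = [x1]
    for _ in range(steps):
        x_i = xs[-1]
        y_i = f(x_i)
        xs.append((x_fixed * y_i - x_i * y_fixed) / (y_i - y_fixed))
    return xs


def secant(f, x0, x1, steps):
    """Iterate with the chord through the two most recent points."""
    xs = [x0, x1]
    for _ in range(steps):
        y_prev, y_cur = f(xs[-2]), f(xs[-1])
        if y_cur == y_prev:  # chord is horizontal: no further improvement possible
            break
        xs.append((xs[-2] * y_cur - xs[-1] * y_prev) / (y_cur - y_prev))
    return xs


def f(x):
    return x ** 3 - 2 * x - 5


approximations = fixed_point_false_position(f, 2.0, 3.0, 6)  # x_1 .. x_7
print(approximations)
print(secant(f, 2.0, 3.0, 8)[-1])
```

The fixed point variant converges linearly, with alternating errors, whereas the secant variant needs noticeably fewer steps for the same accuracy.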


Newton’s method. As a Newton’s interpolation polynomial of the first degree a straight line is fixed by a point $y_0 = f(x_0)$ on the graph of the function f(x) and by the gradient $y_0'$ of the tangent at that point. From the equation of the tangent $y = P(x) = y_0 + y_0'(x - x_0)$ an estimation $x'$ of the zero can be made.




Iterative processes can be derived from this.

Newton’s method: $x' = x_0 - y_0/y_0'$

In the case of Newton’s method with fixed gradient, $y_0' = f'(x_0)$ is left unchanged and the point $x' \to x_{i+1}$ with $y_{i+1} = f(x_{i+1})$ is obtained from the point $x_0 \to x_i$ with $y_0 \to f(x_i)$; this step $x_{i+1} = x_i - f(x_i)/f'(x_0)$ is iteratively repeated (Fig.). In the case of Newton’s method with variable gradient, the derivative $f'(x_i)$ is calculated afresh at each point $(x_i, f(x_i))$, so that $x_{i+1} = x_i - f(x_i)/f'(x_i)$ (Fig.).


29.3-4 Newton’s method with fixed gradient


29.3-5 Newton’s method with variable gradient
Example: The kth root of the number a is obtained as a root of the function $f(x) = x^k - a$ by means of the iteration equation
$x_{i+1} = x_i - (x_i^k - a)/(k x_i^{k-1}) = x_i(1 - 1/k) + a/(k x_i^{k-1})$.
For the square root it becomes $x_{i+1} = (x_i + a/x_i)/2$.


Newton’s method converges rapidly if $K|x^* - x_0| < 1$ holds for the root $x^*$ and K has the value described in the method of false position.
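
The iteration equation for the kth root, and the variant with fixed gradient, can be sketched as follows. The function names and starting values are assumptions made for illustration.

```python
def newton_kth_root(a, k, x0, steps):
    """Newton's method with variable gradient applied to f(x) = x^k - a."""
    x = x0
    for _ in range(steps):
        x = x * (1 - 1 / k) + a / (k * x ** (k - 1))
    return x


def newton_fixed_gradient(f, gradient_at_x0, x0, steps):
    """Newton's method in which the gradient f'(x0) is computed once and reused."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / gradient_at_x0
    return x


print(newton_kth_root(2.0, 2, 1.0, 6))    # square root of 2
print(newton_kth_root(27.0, 3, 5.0, 10))  # cube root of 27
print(newton_fixed_gradient(lambda x: x * x - 2, 3.0, 1.5, 20))
```

With a fixed gradient each step is cheaper but the convergence is only linear, which is why more steps are used in the last call.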


Method of iteration. In general, for the determination of a root the equation f(x) = 0 is expressed in the form $x_{i+1} = F(x_i)$, which is capable of iteration (Fig.). If one chooses the function $F(x) = x - cf(x)$, where c > 0, then an optimal value for c can be chosen with reference to the lower bound $m_1$ and the upper bound $M_1$ for the derivative $f'(x)$, that is, $m_1 \le f'(x) \le M_1$. For the derivative $F'(x) = 1 - cf'(x)$ it then follows that $1 - cm_1 \ge 1 - cf'(x) \ge 1 - cM_1$, that is, $F'(x)$ lies within bounds that are narrower, the smaller the value of $\max(|1 - cm_1|, |1 - cM_1|)$. For $c = 2/(M_1 + m_1)$ one obtains $|1 - cm_1| = |1 - cM_1| = (M_1 - m_1)/(M_1 + m_1) = \alpha < 1$. By the mean value theorem of the differential calculus it then follows that

$$|F(x_{i+1}) - F(x_i)| \le |F'(\xi)|\,|x_{i+1} - x_i| \le \alpha\,|x_{i+1} - x_i|.$$


29.3-6 Start of the iteration process $x_{i+1} = F(x_i)$


29.3-7 Graphical illustration of a divergent iteration


29.3-8 Graphical illustration of a convergent iteration




The smaller the value of $\alpha = (M_1 - m_1)/(M_1 + m_1)$ the better the convergence of the iteration process. For the fixed point method of false position it is clear that $F(x) = [x_f f(x) - x f(x_f)]/[f(x) - f(x_f)]$ and for Newton’s method $F(x) = x - f(x)/f'(x)$. The convergence can be improved by the $\delta^2$-process of Aitken. In this case two normal iteration steps $x_{3i+1} = F(x_{3i})$ and $x_{3i+2} = F(x_{3i+1})$ are followed by an Aitken step $x_{3i+3} = x_{3i} - (x_{3i+1} - x_{3i})^2/(x_{3i+2} - 2x_{3i+1} + x_{3i})$. Graphical representations illustrate the difference between divergence and convergence of the iteration procedure (Fig.).


Example: The square root of the number a > 1 is obtained by an iteration process for the solution of the equation $f(x) = x^2 - a = 0$. In general, in the neighbourhood of the zero, $1 < \sqrt{a} = x < a$, and hence for $f'(x) = 2x$, $2 < f'(x) < 2a$. Because $m_1 = 2$, $M_1 = 2a$, one obtains $c = 2/(2a + 2) = 1/(a + 1)$ and $\alpha = (2a - 2)/(2a + 2) = (a - 1)/(a + 1)$. Consequently for a = 2, $x_{i+1} = x_i - (x_i^2 - 2)/3$ is a convergent iteration function. Starting from $x_0 = 1$ it gives $x_1 = 4/3 = 1.333$; $x_2 = 38/27 = 1.407$; $x_3 = 3092/2187 = 1.414$.
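
The plain iteration and its acceleration by Aitken's process can be compared in a short sketch. The function names and the starting value $x_0 = 1$ are assumptions for illustration; the iteration function is the one of the example, $F(x) = x - (x^2 - 2)/3$.

```python
def iterate(F, x0, steps):
    """Return x_0, x_1, ..., x_steps of the iteration x_{i+1} = F(x_i)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(F(xs[-1]))
    return xs


def aitken_accelerated(F, x0, cycles):
    """Repeat: two normal iteration steps followed by one Aitken step."""
    x = x0
    for _ in range(cycles):
        x1 = F(x)
        x2 = F(x1)
        denominator = x2 - 2 * x1 + x
        if denominator == 0:  # already converged within rounding
            return x2
        x = x - (x1 - x) ** 2 / denominator
    return x


def F(x):
    return x - (x * x - 2) / 3


plain = iterate(F, 1.0, 3)
print(plain)                           # 1, 4/3, 38/27, 3092/2187
print(aitken_accelerated(F, 1.0, 4))   # close to the square root of 2
```

Four accelerated cycles use eight evaluations of F and reach far higher accuracy than eight plain iteration steps would.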


Microcomputers with a graphical screen also offer other possibilities for determining the crossing points of planar curves. One computes a sufficiently dense sequence of points of the curves and displays them on the screen. This shows the crossing points at least roughly. Then, with the help of the cursor or a mouse, one initiates a step-by-step process of approaching one of the crossing points. When a crossing point has been reached, one can compute a refinement of the curves in the neighbourhood of this point and repeat the procedure up to the desired accuracy.


29.4. Search for extreme values 


One-dimensional processes 


If the mode of operation of a system depends on the parameters $x_1, x_2, \ldots, x_n$, then it is desirable to have a criterion $F(x_1, x_2, \ldots, x_n)$ for the goodness of the mode of operation, which is also called the objective or cost function. This function of several variables is not, however, always known by means of a formula. Its values then have to be determined by experiments with the system.

A simple case of such a mode of operation is the determination of the extreme values of a function f(x) of one variable. The parameters $x_1, \ldots, x_n$ now determine the points $x_i$ whose function values $f(x_i)$ provide information about the proximity of an extreme value of the function f(x). In the search for these points several strategies have proved practicable.


Overall strategies. If the function f(x) in question has precisely one minimum in the interval [a, b], then the function −f(x) has precisely one maximum; if it has several relative maxima, then every one of the strategies described leads to an approximation for one of these values. It may therefore be assumed that f(x) has exactly one maximum, and that after the transformation $u = (x - a)/(b - a)$, the interval is [0, 1]. An overall strategy $Z_n = (x_1, x_2, \ldots, x_n)$ then consists in the choice of n different points $x_i$ of the interval [0, 1], with $x_i < x_j$ for i < j. If $f(x_k)$ for $x_i = x_k$ is the greatest of the calculated values of the function, then the argument of the maximum lies in the interval $[x_{k-1}, x_{k+1}]$ (Fig.). The indeterminacy of the interval $L_k = x_{k+1} - x_{k-1}$ leads to the measure of indeterminacy of the strategy

$$L_n = \max_{1 \le i \le n} (x_{i+1} - x_{i-1}), \quad\text{where } x_0 = 0 \text{ and } x_{n+1} = 1.$$

The smaller the largest interval, the better the strategy. The optimum measure of indeterminacy

$$L_{n,\mathrm{opt}} = \min_{Z_n} \Big\{ \max_{1 \le i \le n} (x_{i+1} - x_{i-1}) \Big\}$$

is therefore the characteristic of the minimax strategy.


29.4-1 Measure of indeterminacy of a strategy; $f(x_k)$ is the greatest calculated value, max the maximum

29.4-2 Indeterminacy interval for n = 2 with possible positions of required maximum


For n = 1, [0, 1] is the maximum interval of indeterminacy. For n = 2 (Fig.), $[0, x_2]$ and $[x_1, 1]$ are narrower intervals of indeterminacy. If in accordance with the minimax strategy one attempts to make the interval lengths $x_2$ and $1 - x_1$ as small as possible, but, because $x_1 \ne x_2$, to avoid the unrealizable value $x_1 = x_2 = 0.5$, then the ε-optimal overall strategy for n = 2 gives the values $x_1 = 0.5 - \varepsilon/2$ and $x_2 = 0.5 + \varepsilon/2$ with a sufficiently small ε > 0. Here the choice of ε also depends on the error variation of the function values f(x), since for two values f(x) and f(x + ε) that differ by less than the width of the variation it is not possible to determine which one is the greater.

For n = 3 the new third point can at most increase the sharpness of the separation. A narrowing of the indeterminacy interval can only be achieved by a pair of new points. Optimal is an arrangement of equidistant pairs which, for even n, is given by the points $x_k = (1 + \varepsilon) \cdot [(k + 1)/2]/\{(n/2) + 1\} - \{[(k + 1)/2] - [k/2]\}\,\varepsilon$, where [x] denotes the largest integer less than or equal to x. The length of the optimal uncertainty interval is then $L_{n,\mathrm{opt}} = (1 + \varepsilon)/\{(n/2) + 1\}$.


Example: For n = 4 one obtains the partition points $x_1 = 1/3 - 2\varepsilon/3$; $x_2 = 1/3 + \varepsilon/3$; $x_3 = 2/3 - \varepsilon/3$; $x_4 = 2/3 + 2\varepsilon/3$ and the optimal uncertainty interval $L_{4,\mathrm{opt}} = 1/3 + \varepsilon/3$.

Sequential strategies. As the name implies, every new step in this strategy starts from the preceding one, so that the uncertainty interval obtained from that step is made the new interval to be examined. In this way one avoids too many partition points.

In the case of the dichotomic sequential search, the ε-optimal overall strategy for n = 2 is applied repeatedly. One generalizes $L_{2,\mathrm{opt}} = (1 + \varepsilon)/2$ to the recurrence relation

$$L_{2k,\mathrm{opt}} = (L_{2(k-1),\mathrm{opt}} + \varepsilon)/2$$

and obtains for n = 2k points, $L_{n,\mathrm{opt}} = 2^{-n/2} + \varepsilon(1 - 2^{-n/2})$. One can see that for the same n the length of the interval is less than for the optimal minimax overall strategy.


Example: The calculation of the first 12 partition points in the search for the minimum of the function $f(x) = |x^2 - 2|$ with ε = 10-* shows the effort required: $x_1 = 1 - \varepsilon/2$; $x_2 = 1 + \varepsilon/2$;
$x_3 = 1.5 - 3\varepsilon/4$; $x_4 = 1.5 + \varepsilon/4$; $x_5 = 1.25 - 5\varepsilon/8$; $x_6 = 1.25 + 3\varepsilon/8$; $x_7 = 1.375 - 11\varepsilon/16$;
$x_8 = 1.375 + 5\varepsilon/16$; $x_9 = 1.4375 - 23\varepsilon/32$; $x_{10} = 1.4375 + 9\varepsilon/32$; $x_{11} = 1.40625 - 45\varepsilon/64$;
$x_{12} = 1.40625 + 19\varepsilon/64$.
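
The dichotomic sequential search for a minimum can be sketched as follows. The function name, the initial interval [0, 2] and the value ε = 10⁻⁴ are assumptions chosen for illustration; they are not taken from the example.

```python
def dichotomic_search(f, a, b, eps, pairs):
    """Search for the minimum of a unimodal f on [a, b] with the given number of point pairs."""
    points = []
    for _ in range(pairs):
        mid = (a + b) / 2
        x_left, x_right = mid - eps / 2, mid + eps / 2
        points += [x_left, x_right]
        if f(x_left) < f(x_right):
            b = x_right  # the minimum cannot lie to the right of x_right
        else:
            a = x_left   # the minimum cannot lie to the left of x_left
    return a, b, points


eps = 1e-4  # assumed value, for illustration only
a, b, points = dichotomic_search(lambda x: abs(x * x - 2), 0.0, 2.0, eps, 6)
print(points)
print(a, b)
```

On the interval [0, 2] the twelve computed points follow the pattern listed in the example, for instance $x_3 = 1.5 - 3\varepsilon/4$ and $x_{11} = 1.40625 - 45\varepsilon/64$.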


The Fibonacci search procedure. The number of tests that are to be carried out during the search is fixed. Starting from the initial search interval $[a_1, b_1]$ the subsequent search intervals are fixed by means of a sequence of numbers $d_i$, which are determined from the Fibonacci numbers. The Fibonacci numbers $F_0 = 1$, $F_1 = 1$, $F_2 = 2$, $F_3 = 3$, $F_4 = 5$, $F_5 = 8$, $F_6 = 13$, $F_7 = 21$, $F_8 = 34$, $F_9 = 55$, $F_{10} = 89$, $F_{11} = 144$, $F_{12} = 233$, $F_{13} = 377$, $F_{14} = 610$ satisfy the recurrence relation $F_i = F_{i-1} + F_{i-2}$. This gives $1 = F_{i-1}/F_i + F_{i-2}/F_i$, and because $F_{i-1} \ge F_{i-2}$ it follows that $F_{i-2}/F_i \le 1/2$. One puts $L_1 = b_1 - a_1$, $d_1 = L_2 = L_1 F_{n-1}/F_n$, $d_2 = L_3 = L_1 F_{n-2}/F_n$, $d_3 = L_2 F_{n-3}/F_{n-1} = L_1 F_{n-3}/F_n$, $d_4 = L_3 F_{n-4}/F_{n-2} = L_1 F_{n-4}/F_n$, ..., $d_{n-1} = L_n = L_{n-2} F_1/F_3 = \cdots = L_1/F_n$. Since $F_n > 2^{n/2}$ for n ≥ 3, the length of the interval $L_n$ is less than that for the dichotomic search for the same n.

With the help of these values the $x_k$ are fixed. From $x_1 = a_1 + d_2$, $x_2 = b_1 - d_2$ it follows that $x_2 - x_1 = L_1 - 2d_2 > 0$ or $x_2 > x_1$, since $d_2 < \tfrac{1}{2} L_1$. For the search intervals one has $[a_1, b_1 - d_2]$ or $[a_1 + d_2, b_1]$ of the same length $L_1 - d_2 = L_1[(F_n - F_{n-2})/F_n] = L_1 F_{n-1}/F_n = L_2 = d_1$. The point $x_3$ and the new search interval depend on the function values $f(x_1)$ and $f(x_2)$:

For $f(x_1) \ge f(x_2)$ one puts $a_2 = a_1$, $b_2 = x_2$ and $x_3 = a_2 + d_3$, where $a_2 = a_1 < x_3 < x_1 < x_2 = b_2$. Comparison of the function values at the points $x_3$ and $x_1$ gives two possible new uncertainty intervals $[a_2, x_1]$, $[x_3, x_2]$ of length $L_3 = d_2$.

For $f(x_1) < f(x_2)$ one puts $a_2 = x_1$, $b_2 = b_1$ and $x_3 = b_2 - d_3$, where $a_2 = x_1 < x_2 < x_3 < b_2 = b_1$. Comparison of the function values at the points $x_2$ and $x_3$ gives two possible new uncertainty intervals $[x_1, x_3]$, $[x_2, b_1]$ of length $L_3 = d_2$.

For the determination of $x_4$ one applies to the interval $[a_2, b_2]$ the same arguments that have led to the point $x_3$. The subsequent points up to $x_n$ are found similarly.


Example: If the minimum of the function $f(x) = |x^2 - 2|$ is determined by the Fibonacci search procedure, one obtains for the lengths of the uncertainty intervals (to three decimal places) $L_1 = 2.000$, $L_2 = 1.236$, $L_3 = 0.764$, $L_4 = 0.472$, $L_5 = 0.292$, $L_6 = 0.180$, $L_7 = 0.113$, $L_8 = 0.068$, $L_9 = 0.045$, $L_{10} = 0.023$ and the points of subdivision $x_1 = 0.764$, $x_2 = 1.236$, $x_3 = 1.528$, $x_4 = 1.708$, $x_5 = 1.416$, $x_6 = 1.348$, $x_7 = 1.461$, $x_8 = 1.393$, $x_9 = 1.438$, $x_{10} = 1.415$ (Fig.).


29.4-3 Intervals of a Fibonacci search procedure for $y = |x^2 - 2|$
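
The placement of the points can be sketched for a minimum search as follows. The function names are hypothetical, and the comparison is written for a minimum (the interval containing the smaller function value is kept), as in the example. The sketch returns the search interval that remains after the last point has been placed.

```python
def fibonacci_numbers(n):
    """Return [F_0, ..., F_n] with F_0 = F_1 = 1."""
    fib = [1, 1]
    while len(fib) <= n:
        fib.append(fib[-1] + fib[-2])
    return fib


def fibonacci_search_min(f, a, b, n):
    """Place n points for the minimum search of a unimodal f on [a, b]."""
    fib = fibonacci_numbers(n)
    length = b - a                      # L_1
    d2 = length * fib[n - 2] / fib[n]
    x1, x2 = a + d2, b - d2
    f1, f2 = f(x1), f(x2)
    points = [x1, x2]
    for k in range(3, n + 1):
        d = length * fib[n - k] / fib[n]  # d_k = L_1 * F_{n-k} / F_n
        if f1 <= f2:                      # minimum lies in [a, x2]
            b = x2
            x2, f2 = x1, f1
            x1 = a + d
            f1 = f(x1)
            points.append(x1)
        else:                             # minimum lies in [x1, b]
            a = x1
            x1, f1 = x2, f2
            x2 = b - d
            f2 = f(x2)
            points.append(x2)
    return a, b, points


a, b, points = fibonacci_search_min(lambda x: abs(x * x - 2), 0.0, 2.0, 10)
print([round(p, 3) for p in points])
print(a, b)
```

For n = 10 on the interval [0, 2] the points agree with those of the example to within rounding in the third decimal place; the last point coincides with an earlier one, which is a known feature of the procedure.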




The golden section search procedure. This procedure is only a little less effective than the Fibonacci procedure, but it does not require the number of search steps to be fixed in advance.

In the search interval [a, b] two points x and x' are fixed by a parameter τ still to be determined. From $\tau = (b - a)/(b - x)$ one obtains $x = a/\tau + b(1 - 1/\tau)$ and hence from $\tau = (b - x)/(b - x')$, the value $x' = a/\tau^2 + b(1 - 1/\tau^2)$; for a = 0, b = 1 and τ = 3, for example, one obtains x = 2/3, x' = 8/9. A point configuration equivalent to (a, x, x', b) in a reduced search interval depends on the values of the function at the points x and x'; for f(x') > f(x) one chooses $a := x$, $x := x'$ and $x' := b(1 - 1/\tau^2) + a/\tau^2$, and for f(x') < f(x), on the other hand, $b := x'$, $x' := x$, $x := b(1 - 1/\tau) + a/\tau$. The length L of the uncertainty interval then changes for f(x') > f(x) by $L := L/\tau$ and for f(x') < f(x) by $L := L(1 - 1/\tau^2)$. Every point in an interval is regarded as being equally likely to be an extremum. The probability of an interval to contain an extremum is thus proportional to the length of the interval; for the two intervals it is in the ratio $(1 - 1/\tau^2) : 1/\tau$. But the most favourable chain of decision is the one in which one has to distinguish between two equally probable cases. For this one has $1 - 1/\tau^2 = 1/\tau$, or the optimal τ-value $\tau = 1/2 + (1/2)\sqrt{5} = 1.618033989\ldots$ (see Chapter 7.). For the uncertainty intervals the recurrence relation $L := L/\tau$ holds, so that after n search steps $L_n = L_1/\tau^n$. The ratio of the effectiveness of this procedure to that of the dichotomic search is as τ is to $\sqrt{2}$, that is, about 1.144, so that it is approximately 14% higher.

The connection with the Fibonacci procedure follows from the relation $F_i = (1/\sqrt{5})\,[\tau^{i+1} - 1/(-\tau)^{i+1}]$. For i = 1 and i = 2 one obtains by substitution $F_1 = 1$ and $F_2 = 2$. In general, however, the recursion formula $F_{i+1} = F_i + F_{i-1}$ holds also for the right-hand side of the relation; for if one multiplies $\tau^{i+2} - 1/(-\tau)^{i+2} = \tau^{i+1} - 1/(-\tau)^{i+1} + \tau^i - 1/(-\tau)^i$ by $(-\tau)^{i+2}$ on both sides, then, whether i is even or odd, one obtains the relation $(-1)^{i+1} \tau^{2i+2} (\tau^2 - \tau - 1) = \tau^2 - \tau - 1$, which is always correct because τ was determined from $\tau^2 - \tau - 1 = 0$.

The same result is obtained by the z-transformation method for linear recurrence relations. One introduces the auxiliary function $F(z) = \sum_{i=0}^{\infty} F_i z^i$ and takes into account the fact that from the recurrence relation $F_i = F_{i-1} + F_{i-2}$,

$$\sum_{i=2}^{\infty} F_i z^i = z \sum_{i=2}^{\infty} F_{i-1} z^{i-1} + z^2 \sum_{i=2}^{\infty} F_{i-2} z^{i-2}.$$

One then obtains

$$F(z) = F_0 + F_1 z + \sum_{i=2}^{\infty} F_i z^i = F_0 + F_1 z - zF_0 + z \sum_{i=0}^{\infty} F_i z^i + z^2 \sum_{i=0}^{\infty} F_i z^i$$

or $-F_0 + z(F_0 - F_1) = z^2 F(z) + zF(z) - F(z)$, giving

$$F(z) = \frac{(F_0 - F_1)\,z - F_0}{z^2 + z - 1} = \frac{-1}{z^2 + z - 1}.$$

For $z_{1,2} = -(1/2) \pm (1/2)\sqrt{5}$ the partial fraction expansion is

$$F(z) = \frac{1}{\sqrt{5}} \left[ \frac{1}{z_1 - z} - \frac{1}{z_2 - z} \right];$$

each term of the sum can be expanded as a geometric series in powers of z, and comparison of coefficients with the series $F(z) = \sum_{i=0}^{\infty} F_i z^i$ gives the relation stated.


Example: The determination of the minimum of the function $f(x) = |x^2 - 2|$ by the golden section search procedure gives for the initial interval $a_1 = 0$, $b_1 = 2$ the search points: $x_1 = 0.764$, $x_2 = 1.236$, $x_3 = 1.528$, $x_4 = 1.708$, $x_5 = 1.415$, $x_6 = 1.348$, $x_7 = 1.459$, $x_8 = 1.391$, $x_9 = 1.373$, $x_{10} = 1.399$, $x_{11} = 1.405$ (Fig.).

Compared with the iteration method $x_{i+1} = x_i - (x_i^2 - 2)/3$ the golden section procedure tends to the required solution noticeably more slowly. However, it is a procedure for more general functions, whilst the iteration process is suited for special functions only.
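
A golden section search for a minimum can be sketched as follows. The function name is hypothetical, and the comparisons are written for a minimum: the interval that contains the smaller function value is kept.

```python
import math

TAU = (1 + math.sqrt(5)) / 2  # optimal value of the parameter


def golden_section_min(f, a, b, steps):
    """Reduce the search interval [a, b] of a unimodal f by the factor 1/TAU per step."""
    x = a / TAU + b * (1 - 1 / TAU)             # b - x = (b - a) / TAU
    xp = a / TAU ** 2 + b * (1 - 1 / TAU ** 2)  # b - x' = (b - a) / TAU^2
    fx, fxp = f(x), f(xp)
    points = [x, xp]
    for _ in range(steps):
        if fxp < fx:   # minimum lies in [x, b]: the old x' becomes the new x
            a, x, fx = x, xp, fxp
            xp = b * (1 - 1 / TAU ** 2) + a / TAU ** 2
            fxp = f(xp)
            points.append(xp)
        else:          # minimum lies in [a, x']: the old x becomes the new x'
            b, xp, fxp = xp, x, fx
            x = b * (1 - 1 / TAU) + a / TAU
            fx = f(x)
            points.append(x)
    return a, b, points


a, b, points = golden_section_min(lambda x: abs(x * x - 2), 0.0, 2.0, 20)
print([round(p, 3) for p in points[:4]])
print(a, b)
```

Each step costs one new function evaluation, and after n steps the interval has the length $L_1/\tau^n$.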


Multi-dimensional search processes and systems of non-linear equations 


By a generalization of the one-dimensional procedure one attempts to determine the extreme values of functions of several variables or the solution of systems of non-linear equations. Owing to the amount of computation, modern digital computers are necessary for this purpose, and the effectiveness of the methods is smaller. If in the case of n = 1 variable 90% has been eliminated, so that the indeterminacy interval is only 10% of the original one, then because 0.9 · 0.9 = 0.81 an indeterminacy of 19% remains for n = 2, for n = 3 it amounts to 27%, because $(0.9)^3 = 0.73$, and it increases to 34% for n = 4, to 41% for n = 5, 47% for n = 6 and 52% for n = 7.

In the case n = 1 for the non-linear equation f(x) = 0, one arrives by way of the equivalent relation $x = x - cf(x)$ at the iterative process $x^k = x^{k-1} - cf(x^{k-1})$, where k and k − 1 are indices. The constant c is determined optimally by a goodness criterion. Generalizing to n = 2, one seeks to solve the system of equations

$$f_1(x_1, x_2) = 0, \qquad f_2(x_1, x_2) = 0$$

by putting

$$x_1 = x_1 - c_{11} f_1(x_1, x_2) - c_{12} f_2(x_1, x_2), \qquad x_2 = x_2 - c_{21} f_1(x_1, x_2) - c_{22} f_2(x_1, x_2)$$

with the non-singular matrix $C = (c_{ij}) = \begin{pmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{pmatrix}$, which is chosen according to a goodness criterion.




In an iterative process not only do the $x_1^k$ and $x_2^k$ depend on the values of $x_1^{k-1}$ and $x_2^{k-1}$ of the preceding iterative step, but also the constants $c_{ij}(x_1^k, x_2^k)$ on $c_{ij}(x_1^{k-1}, x_2^{k-1})$. Their values determine the convergence behaviour, which can be improved by multi-step iteration algorithms, in which the approximation in the kth iteration step depends on the finitely many preceding approximations.

For the multi-dimensional search for extreme values, for example, for the maximization of the function $f(x_1, x_2)$, there is the necessary condition $f_{x_1}(x_1, x_2) = \frac{\partial}{\partial x_1} f(x_1, x_2) = 0$ and $f_{x_2}(x_1, x_2) = \frac{\partial}{\partial x_2} f(x_1, x_2) = 0$ in the interior of the domain of definition. By solving this non-linear system of equations one obtains not only maxima, but also minima or saddle points.

In setting up the iteration sequence for the determination of roots, the choice of the matrix C 
should secure not only a rapid convergence, but also the approach to the position of the required 
extremum. 

No generally satisfactory recommendations can yet be given for the choice of the search matrix C. 
In stochastic search procedures the C-matrix is dependent on chance; random improvement steps 
are carried out, but the sequence of these steps should converge with probability 1, that is with 
certainty, to the required point. 


Newton’s method. Generalizing the one-dimensional Newton’s method, in the case n = 2 one replaces the functions $f_1(x_1, x_2)$ and $f_2(x_1, x_2)$ by their tangent planes in the neighbourhood of the point determined, and takes as the next approximation the point at which the line of intersection of the two planes cuts the plane z = 0.

With k used again as an index, the tangent planes are clearly

$$z = f_1(x_1^{k-1}, x_2^{k-1}) + \frac{\partial f_1(x_1^{k-1}, x_2^{k-1})}{\partial x_1}\,(x_1 - x_1^{k-1}) + \frac{\partial f_1(x_1^{k-1}, x_2^{k-1})}{\partial x_2}\,(x_2 - x_2^{k-1}),$$

$$z = f_2(x_1^{k-1}, x_2^{k-1}) + \frac{\partial f_2(x_1^{k-1}, x_2^{k-1})}{\partial x_1}\,(x_1 - x_1^{k-1}) + \frac{\partial f_2(x_1^{k-1}, x_2^{k-1})}{\partial x_2}\,(x_2 - x_2^{k-1}).$$

Putting z = 0 and solving the resulting linear system of equations for the factors $(x_1 - x_1^{k-1})$ and $(x_2 - x_2^{k-1})$ one obtains the C-matrix as the inverse of the Jacobian matrix of the system of equations.


Example: If a solution of the system of equations

$$f_1(x_1, x_2) = x_1 + 3 \lg x_1 - x_2^2 = 0,$$
$$f_2(x_1, x_2) = 2x_1^2 - x_1 x_2 - 5x_1 + 1 = 0$$

is required, then one determines the intersection of the curves $f_1 = \mathrm{const}$ and $f_2 = \mathrm{const}$, and obtains approximately the points (1.4, −1.5) and (3.4, 2.2). As initial approximation one chooses the point (3.4, 2.2). From the Jacobian matrix

$$\begin{pmatrix} \partial f_1/\partial x_1 & \partial f_1/\partial x_2 \\ \partial f_2/\partial x_1 & \partial f_2/\partial x_2 \end{pmatrix} = \begin{pmatrix} 1 + 3 \cdot 0.43429/x_1 & -2x_2 \\ 4x_1 - x_2 - 5 & -x_1 \end{pmatrix}$$

one obtains the C-matrix as its inverse, and hence the recursion scheme. From this one obtains in succession the adjacent approximations:


For these values $f_1(x_1^3, x_2^3) = 0.0002$ and $f_2(x_1^3, x_2^3) = 0.0000$.
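
The recursion scheme of this example can be sketched as follows. The function name is hypothetical, and the factor 0.43429 is replaced by the exact value lg e; the inverse of the 2 × 2 Jacobian matrix is written out explicitly.

```python
import math


def newton_2d(x1, x2, steps):
    """Newton's method for the example system, with C the inverse Jacobian matrix."""
    for _ in range(steps):
        f1 = x1 + 3 * math.log10(x1) - x2 ** 2
        f2 = 2 * x1 ** 2 - x1 * x2 - 5 * x1 + 1
        # Jacobian matrix [[a11, a12], [a21, a22]]
        a11 = 1 + 3 * math.log10(math.e) / x1
        a12 = -2 * x2
        a21 = 4 * x1 - x2 - 5
        a22 = -x1
        det = a11 * a22 - a12 * a21
        # x_new = x - C f with C = (1/det) * [[a22, -a12], [-a21, a11]]
        x1, x2 = (
            x1 - (a22 * f1 - a12 * f2) / det,
            x2 - (-a21 * f1 + a11 * f2) / det,
        )
    return x1, x2


x1, x2 = newton_2d(3.4, 2.2, 6)
print(x1, x2)
```

Starting from (3.4, 2.2), a few steps are enough to make both residuals negligible.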


The gradient method or method of steepest descent. The direction of the greatest increase of a function $f(x_1, x_2)$ is given by the direction of the gradient $(f_{x_1}(x_1, x_2), f_{x_2}(x_1, x_2))$.

If it is required to find the minimum of the function $f(x_1, x_2)$, then an improvement must result if one moves from the point $(x_1^{k-1}, x_2^{k-1})$ already obtained in the direction opposite to that of the gradient. This means making the following attempt for the iteration process:

$$x_1^k = x_1^{k-1} - l f_{x_1}(x_1^{k-1}, x_2^{k-1}) \quad\text{and}\quad x_2^k = x_2^{k-1} - l f_{x_2}(x_1^{k-1}, x_2^{k-1}) \quad\text{with } l > 0.$$

Thus, in this case the C-matrix is taken to be a diagonal matrix with diagonal elements l.

If for a fixed l a trial step is carried out, after which one examines whether an improvement has occurred and if, depending on the success, a new spacing l is chosen, then one speaks of the gradient method.

One can, however, attempt to choose the factor l optimally at every step. A natural requirement for l would be to make the value of the function

$$F(l) = f\big(x_1^{k-1} - l f_{x_1}(x_1^{k-1}, x_2^{k-1}),\; x_2^{k-1} - l f_{x_2}(x_1^{k-1}, x_2^{k-1})\big)$$

as small as possible. The required value for l can, for example, be determined by a one-dimensional search process (Fig.).

If this optimal value for l is chosen at every step, then one speaks of the method of steepest descent.


29.4-4 Gradient method; contour lines $f_1 > f_2 > f_3 > f_4 > f_5$ of $f(x_1, x_2)$


Example: The method will be illustrated by determining the minimum of the function

$$f(x_1, x_2) = 2x_1^2 - 2x_1 + x_2^2 - x_2 = 2(x_1 - 1/2)^2 + (x_2 - 1/2)^2 - 3/4;$$

in this simple case the minimum $x_1 = x_2 = 0.5$ is known. By the gradient method one obtains the recursion formulae

$$x_1^k = x_1^{k-1} - 2l(2x_1^{k-1} - 1) \quad\text{and}\quad x_2^k = x_2^{k-1} - l(2x_2^{k-1} - 1).$$

According to the method of steepest descent the optimal l is obtained from the equation

$$l = [4(2x_1^{k-1} - 1)^2 + (2x_2^{k-1} - 1)^2]/[16(2x_1^{k-1} - 1)^2 + 2(2x_2^{k-1} - 1)^2].$$

If one begins with the initial approximation $x_1^0 = x_2^0 = 0$, then one obtains in succession the adjacent l-values and improved approximations:

$l^0 = 0.\ldots$    $x^1 = 0.3\ldots$
$l^1 = 0.414$    $x_1^2 = 0.457$    $x_2^2 = 0.461$
$l^2 = 0.283$    $x_1^3 = 0.504$    $x_2^3 = 0.482$
$l^3 = 0.429$    $x_1^4 = 0.497$    $x_2^4 = 0.497$
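
The recursion of this example can be sketched as follows; the function name is hypothetical. Because the sketch works with full floating-point precision, its l-values and approximations need not agree with the rounded figures of the table in every digit.

```python
def steepest_descent(x1, x2, steps):
    """Steepest descent for f(x1, x2) = 2*x1^2 - 2*x1 + x2^2 - x2."""
    history = []
    for _ in range(steps):
        g1 = 2 * (2 * x1 - 1)  # partial derivative with respect to x1
        g2 = 2 * x2 - 1        # partial derivative with respect to x2
        denominator = 4 * g1 ** 2 + 2 * g2 ** 2
        if denominator == 0:   # gradient vanishes: the minimum has been reached
            break
        l = (g1 ** 2 + g2 ** 2) / denominator  # optimal spacing along the gradient
        x1, x2 = x1 - l * g1, x2 - l * g2
        history.append((l, x1, x2))
    return history


history = steepest_descent(0.0, 0.0, 20)
for l, x1, x2 in history[:4]:
    print(round(l, 3), round(x1, 3), round(x2, 3))
```

The approximations approach the known minimum (0.5, 0.5) in a zigzag, with the optimal l alternating between two values.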


29.5. Numerical methods for the solution of linear equations and inequalities 


Linear equations 


A solution of a system of m linear equations $y_i = \sum_{j=1}^{n} a_{ij} x_j$ with i = 1, 2, ..., m consists of n numbers $x_j$ with j = 1, 2, ..., n. In n-dimensional space each one of these equations with a fixed $y_i$ can be interpreted as a hyperplane with the normal vector $a_i = (a_{i1}, a_{i2}, \ldots, a_{in})$. If for m = n these normal vectors are linearly independent, that is, if the equation $\sum_{i=1}^{m} l_i a_i = 0$ is satisfied only when all the $l_i$ are zero, then the m = n hyperplanes have a common point of intersection with the uniquely determined coordinates $x_j$, j = 1, 2, ..., n. If m < n and the m normal vectors are linearly independent, then the corresponding result holds for the m-dimensional subspace determined by them. For m > n, the m vectors $a_i$ are certainly linearly dependent. If the vector $a_{i_0}$ is a linear combination of some of the other vectors $a_i$, and if in addition $y_{i_0}$ is the same linear combination of the $y_i$, and the hyperplane belonging to $a_{i_0}$ contains the intersection of the hyperplanes belonging to these $a_i$, then the hyperplane belonging to $a_{i_0}$ does not represent additional conditions, and its equation need not be taken into account. A contradiction occurs if $y_{i_0}$ is not the same linear combination of the $y_i$ as $a_{i_0}$ is of the $a_i$. In this case the given linear system of equations is insoluble.

For n = 2 the hyperplanes are straight lines, which intersect, 
are parallel to one another, or coincide (Fig.). 


Jordan’s elimination. The system of equations is arranged in the form of a table, so that the rth row contains the coefficients $a_{rj}$ of the $x_j$ with j = 1, 2, ..., n and the sth column the coefficients $a_{is}$ of $x_s$ with i = 1, 2, ..., m.


29.5-1 Straight lines as hyperplanes in the Euclidean plane: two lines intersect, coincide or are parallel



If then one of the coefficients is different from zero, for example, $a_{rs} \ne 0$, then $x_s$ can be eliminated using $y_r$. From $y_r = a_{r1} x_1 + a_{r2} x_2 + \cdots + a_{rs} x_s + \cdots + a_{rn} x_n$ it follows that $x_s = (1/a_{rs})\,[-a_{r1} x_1 - a_{r2} x_2 - \cdots + y_r - \cdots - a_{rn} x_n]$. Substitution of this value for $x_s$ changes all coefficients of the table. Those of the sth column, that is, those of $y_r$, then become $a_{1s}/a_{rs}, \ldots, 1/a_{rs}, \ldots, a_{ms}/a_{rs}$. For the remaining $b_{ij}$ with i ≠ r and j ≠ s, $b_{ij} = a_{ij} - (a_{is} \cdot a_{rj})/a_{rs}$. The table then assumes the form:

$$\begin{array}{c|cccccc}
 & x_1 & x_2 & \cdots & y_r & \cdots & x_n \\ \hline
y_1 & b_{11} & b_{12} & \cdots & a_{1s}/a_{rs} & \cdots & b_{1n} \\
y_2 & b_{21} & b_{22} & \cdots & a_{2s}/a_{rs} & \cdots & b_{2n} \\
\vdots & & & & & & \\
x_s & -a_{r1}/a_{rs} & -a_{r2}/a_{rs} & \cdots & 1/a_{rs} & \cdots & -a_{rn}/a_{rs} \\
\vdots & & & & & & \\
y_m & b_{m1} & b_{m2} & \cdots & a_{ms}/a_{rs} & \cdots & b_{mn}
\end{array}$$

In this way one attempts as far as possible to exchange every $x_j$ for a $y_i$. This process comes to an end when every coefficient at the intersection of the row of a not yet exchanged $y_i$ with the column of a remaining $x_j$ is zero. The exchanged $x_j$ are then linear combinations of the exchanged $y_i$ and the not exchanged variables $x_j$. These $x_j$ do not obey any further conditions and can, as free parameters, be chosen arbitrarily. The not exchanged $y_i$ are then linear combinations of the exchanged variables $y_i$ alone. If the prescribed values for $y_i$ do not satisfy these conditions, then the system of equations has no solution.

If the process does not come to an end, so that all $x_j$ can be exchanged for variables $y_i$, then by the final table the $x_j$ are unique functions of the exchanged $y_i$. Any not exchanged $y_i$ that are still present are then unique linear combinations of the exchanged $y_i$. If the prescribed values for $y_i$ do not satisfy these conditions, then contradictions occur and the system of equations has no solution.

If for m = n all the $x_j$ can be exchanged for the $y_i$, then the final table represents the inverse matrix $A^{-1}$ of the matrix A of the original table.
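
One exchange step and its repeated use for inverting a square matrix can be sketched as follows. The function names and the sample matrix are assumptions chosen for illustration; exact fractions are used so that the final table can be compared entry by entry.

```python
from fractions import Fraction


def jordan_exchange(table, r, s):
    """Exchange the row variable of row r with the column variable of column s."""
    p = table[r][s]
    m, n = len(table), len(table[0])
    new = [[None] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            if i == r and j == s:
                new[i][j] = 1 / p
            elif i == r:
                new[i][j] = -table[r][j] / p
            elif j == s:
                new[i][j] = table[i][s] / p
            else:
                new[i][j] = table[i][j] - table[i][s] * table[r][j] / p
    return new


def invert_by_exchange(matrix):
    """Exchange every x_j for a y_i and read off the inverse matrix."""
    n = len(matrix)
    table = [[Fraction(v) for v in row] for row in matrix]
    row_labels = [("y", i) for i in range(n)]
    col_labels = [("x", j) for j in range(n)]
    for s in range(n):
        # Choose a not yet exchanged y-row with a non-zero coefficient in column s
        r = next(
            (i for i in range(n) if row_labels[i][0] == "y" and table[i][s] != 0),
            None,
        )
        if r is None:
            raise ValueError("matrix is singular")
        table = jordan_exchange(table, r, s)
        row_labels[r], col_labels[s] = col_labels[s], row_labels[r]
    # Rows are now labelled by x, columns by y; restore the natural order
    inverse = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            inverse[row_labels[i][1]][col_labels[j][1]] = table[i][j]
    return inverse


A = [[1, 1, 1], [1, -1, 1], [1, -1, -1]]  # illustrative matrix
inverse = invert_by_exchange(A)
y = [-2, 4, 8]
print(inverse)
print([sum(c * v for c, v in zip(row, y)) for row in inverse])
```

For the sample matrix and the right-hand sides −2, 4, 8, multiplying the final table by the prescribed values gives the solution 3, −3, −2.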


Example 1: The Jordan elimination applied to the system of equations

\[
y_1 = x_1 + x_2 + x_3 = -2, \qquad y_2 = x_1 - x_2 + x_3 = 4, \qquad y_3 = x_1 - x_2 - x_3 = 8
\]

results in tables, with the coefficient $a_{rs} \neq 0$ in brackets, from whose rows the next elimination step follows. The elimination equation is placed in red underneath the table.

[Tables of the successive exchange steps]

This system of equations has a unique solution. Substitution gives $x_1 = 3$, $x_2 = -3$, $x_3 = -2$.
Example 2: For the system of equations

[System of equations]

one obtains similarly

[Tables of the exchange steps]

The missing exchange of the variable $x_2$ is not possible, since the coefficient in brackets is zero. Its row gives the condition $y_2 = (y_3 - y_1)/2$ for the existence of a solution. It is not satisfied for the given numerical values and the system of equations has no solution. If the given values were $y_1 = 2$, $y_2 = 3$, $y_3 = 8$, then $y_2 = (y_3 - y_1)/2$ would be satisfied and would lead to the solution $x_1 = 2.5$, $x_3 = 0.5 + x_2$, in which the value of $x_2$ can be chosen arbitrarily.


646 29. Numerical analysis 


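The modified system of linear equations $y_i = 0 = \sum_j a_{ij}x_j - b_i$ leads to another form of the Jordan exchange problem. The initial table has an additional column for the $b_i$.

\[
\begin{array}{c|ccccc|c}
 & x_1 & x_2 & \cdots & x_n & 1 & \\ \hline
y_1 & a_{11} & a_{12} & \cdots & a_{1n} & -b_1 & 1 \\
y_2 & a_{21} & a_{22} & \cdots & a_{2n} & -b_2 & 2 \\
\vdots & & & & & & \vdots \\
y_m & a_{m1} & a_{m2} & \cdots & a_{mn} & -b_m & m \\ \hline
 & 1 & 2 & \cdots & n & n + 1 &
\end{array}
\]

Because after every elimination of $x_s$ by $y_r$ the coefficients of the $s$th column have $y_r = 0$ as a factor, this column can be discarded.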


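Example: For the system

\[
\begin{aligned}
x_1 + x_2 + x_3 + 2 &= y_1 \\
x_1 - x_2 + x_3 - 4 &= y_2 \\
x_1 - x_2 - x_3 - 8 &= y_3
\end{aligned}
\]

[Tables of the exchange steps]

Hence the solution is $x_1 = 3$, $x_2 = -3$, $x_3 = -2$.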

Gauss's method. This method of solution is obtained from the Jordan elimination procedure. The rows of any of the exchanged variables $x_s$ are not again entered in the table, but are separately noted. In this way the size of the table is steadily reduced and in the end one obtains a very easily soluble linear system of equations with a triangular matrix.
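The reduction to a triangular system followed by recursive solution can be sketched in Python as follows. This is an illustrative sketch only: the function name `gauss_solve`, the row interchange for the largest pivot (which the text does not describe) and the sample system are assumptions.

```python
def gauss_solve(a, b):
    """Solve a x = b by elimination to triangular form, then back substitution."""
    n = len(a)
    # Work on an augmented copy so that the inputs are left untouched.
    rows = [[float(v) for v in a[i]] + [float(b[i])] for i in range(n)]
    for k in range(n):
        # Assumption beyond the text: choose the largest available pivot in column k.
        p = max(range(k, n), key=lambda i: abs(rows[i][k]))
        if rows[p][k] == 0:
            raise ValueError("matrix is singular")
        rows[k], rows[p] = rows[p], rows[k]
        for i in range(k + 1, n):
            factor = rows[i][k] / rows[k][k]
            for j in range(k, n + 1):
                rows[i][j] -= factor * rows[k][j]
    # The system is now triangular; solve it recursively from the last row upwards.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(rows[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rows[i][n] - tail) / rows[i][i]
    return x


# Hypothetical system with the solution (2, 3, -1).
solution = gauss_solve([[2, 1, -1], [-3, -1, 2], [-2, 1, 2]], [8, -11, -3])
```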


Example: For the above example one obtains in succession the following tables and equations, from which the solution is obtained recursively:

[Tables and equations of the successive elimination steps]

or $x_2 = -3$, $x_3 = -2$, $x_1 = 3$.


Iteration process for the solution of systems of linear equations


By the substitution $a_{ij} = h_{ij} + c_{ij}$ the given system of equations $\sum_j a_{ij}x_j = b_i$ is decomposed into $\sum_j h_{ij}x_j + \sum_j c_{ij}x_j = b_i$. The $h_{ij}$ are chosen in such a way that the matrix $H = (h_{ij})$ has an inverse matrix $H^{-1} = (h^{-1}_{ij})$ that is easily formed, for example, as a diagonal matrix. It then follows that $x_i = \sum_s h^{-1}_{is} b_s - \sum_j \sum_s h^{-1}_{is} c_{sj} x_j$. This system of equations can be written in iterative form. If one takes $x_i^0 = \sum_s h^{-1}_{is} b_s$ as initial value, then one obtains $x_i^k = x_i^0 - \sum_j \sum_s h^{-1}_{is} c_{sj} x_j^{k-1}$. This process converges to the solution of the linear system of equations if and only if the absolute value of all the eigenvalues of the matrix $(\sum_s h^{-1}_{is} c_{sj}) = (k_{ij})$ is less than 1. An eigenvalue $l$ of the matrix $(k_{ij})$ is defined to be a number $l$ for which the system of equations $\sum_j k_{ij}x_j = l x_i$ has a non-trivial solution, that is, one for which not all the $x_i$ vanish. This is the case, in particular, if either $\sum_j |\sum_s h^{-1}_{is} c_{sj}| < 1$ for every $i$ or $\sum_i |\sum_s h^{-1}_{is} c_{sj}| < 1$ for every $j$. These conditions are satisfied, for example, if the condition [...] holds for the matrix $(k_{ij})$. The increment of successive approximations is then given as

\[
x_i^k - x_i^{k-1} = -\sum_j \sum_s h^{-1}_{is} c_{sj}\,(x_j^{k-1} - x_j^{k-2}).
\]




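Example: For the solution by the iteration method the equations of the given system are expressed in a form suitable for iteration:

\[
\begin{array}{ll}
10x_1 - x_2 + x_3 = 10 & \qquad x_1^k = 0.1x_2^{k-1} - 0.1x_3^{k-1} + 1 \\
x_1 + 5x_2 + x_3 = 5 & \qquad x_2^k = -0.2x_1^{k-1} - 0.2x_3^{k-1} + 1 \\
x_1 - x_2 + 10x_3 = 10 & \qquad x_3^k = -0.1x_1^{k-1} + 0.1x_2^{k-1} + 1
\end{array}
\]

For the initial approximation $x_1^0 = 1$, $x_2^0 = 1$, $x_3^0 = 1$ one then obtains the increments

\[
\begin{pmatrix} 0 \\ -0.4 \\ 0 \end{pmatrix}, \quad
\begin{pmatrix} -0.04 \\ 0 \\ -0.04 \end{pmatrix}, \quad
\begin{pmatrix} 0.004 \\ 0.016 \\ 0.004 \end{pmatrix}, \quad
\begin{pmatrix} 0.0012 \\ -0.0016 \\ 0.0012 \end{pmatrix}
\]

for $x_1$, $x_2$, $x_3$, and adding these to the initial approximation gives the approximate solution after four iteration steps $x_1^4 = 0.9652$, $x_2^4 = 0.6144$, $x_3^4 = 0.9652$.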
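The four iteration steps of this example can be reproduced with a short Python sketch. It assumes that $H$ is taken as the diagonal part of the coefficient matrix, which is the choice that yields the iteration equations shown; the function name `diagonal_iteration` is illustrative.

```python
def diagonal_iteration(a, b, x0, steps):
    """Iterate x_i^k = (b_i - sum_{j != i} a_ij x_j^(k-1)) / a_ii, i.e. H = diag(A)."""
    n = len(a)
    x = list(x0)
    history = [x]
    for _ in range(steps):
        x = [(b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
             for i in range(n)]
        history.append(x)
    return history


A = [[10, -1, 1], [1, 5, 1], [1, -1, 10]]
b = [10, 5, 10]
history = diagonal_iteration(A, b, [1.0, 1.0, 1.0], 4)
x4 = history[-1]                                   # approximation after four steps
# Increments between successive approximations, one list per step.
increments = [[new - old for new, old in zip(history[k], history[k - 1])]
              for k in range(1, len(history))]
```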
Eigenvalue problems. If one regards the linear system of equations $\sum_{j=1}^{n} a_{ij}y_j = x_i$ for $i = 1, 2, \ldots, n$ as a description of a linear system with the $y_j$ as input and the $x_i$ as output variables, whose cause-effect relationship is given by the matrix $A = (a_{ij})$, then the existence of an eigenvalue $l$ according to the equation $\sum_{j=1}^{n} a_{ij}x_j = l x_i$ means that an eigenvector $(x_1, x_2, \ldots, x_n)$ used as an input variable remains unchanged except for the factor of proportionality $l$. The eigenvalue equation $\sum_{j=1}^{n} a_{ij}x_j = l x_i$ is a homogeneous linear system, which has solutions different from zero if the determinant of the matrix $A - lI$, in which $I$ is the unit matrix, vanishes, that is,

\[
\det |A - lI| =
\begin{vmatrix}
a_{11} - l & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} - l & \cdots & a_{2n} \\
\vdots & & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} - l
\end{vmatrix} = 0.
\]


This characteristic equation is a polynomial of degree n for the determination of the eigenvalues. 
For each eigenvalue / one obtains from the eigenvalue equation the corresponding eigenvector. 
With eigenvectors a decoupling of the system is possible, that is, it can be achieved that every output 
variable is dependent only on one input variable. In this way normal oscillations can be introduced 
in oscillatory mechanical systems. In the theory of the gyroscope one uses three axes placed in the 
directions of three independent eigenvectors of the matrix of the moment of inertia, in order to 
obtain a simple form of the dynamic equations. The theory of electric n-ports or 2-ports, using 
wave parameters, rests on a representation of the linear transformation that corresponds to $A = (a_{ij})$
solely by means of the eigenvalues and eigenvectors. 

The eigenvalue of greatest absolute value and an associated eigenvector can be obtained by a rule that has proved practical in electrical transmission theory. According to this an approximation to the eigenvector and an estimation of the eigenvalue of greatest absolute value are obtained from the output vector of a chain that begins with an arbitrary input and uses the output vector of one step in the chain as the input vector of the next step sufficiently often, in accordance with the equations $\sum_{j=1}^{n} a_{ij}x_j = l x_i$ for $i = 1, 2, \ldots, n$ (Fig.). The last output vector represents an eigenvector approximately, and the quotient of corresponding components of successive approximations represents the eigenvalue. With the essentially arbitrary initial vector $x_i^0 = b_i$, the iterative process $x_i^k = \sum_{j=1}^{n} a_{ij}x_j^{k-1}$ is valid. The quotients $x_i^k / x_i^{k-1}$, which approach equality with one another, serve as an estimation of the eigenvalue of greatest absolute value.


29.5-2 Scheme of a chain for the determination of the eigenvalue of greatest absolute value 


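Example: For the system of equations

\[
x_1 - x_2 = l x_1, \qquad 2x_2 + x_3 = l x_2, \qquad x_2 + x_3 = l x_3
\]

by iteration with the initial values $x_1^0 = 1$, $x_2^0 = 1$, $x_3^0 = 1$ one obtains successively

\[
\begin{array}{c|rrrrrrr}
k & 1 & 2 & 3 & 4 & 5 & 6 & 7 \\ \hline
x_1^k & 0 & -3 & -11 & -32 & -87 & -231 & -608 \\
x_2^k & 3 & 8 & 21 & 55 & 144 & 377 & 987 \\
x_3^k & 2 & 5 & 13 & 34 & 89 & 233 & 610
\end{array}
\]

For $x_i^7 / x_i^6$ one obtains 2.632, 2.62, 2.62 as an approximation for the eigenvalue of greatest absolute value. The true eigenvalues are obtained from the matrix $A$ and its characteristic equation:

\[
A = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 2 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \qquad
\det |A - lI| = \begin{vmatrix} 1 - l & -1 & 0 \\ 0 & 2 - l & 1 \\ 0 & 1 & 1 - l \end{vmatrix}
= (1 - l)^2 (2 - l) - (1 - l) = 0
\]

with $(1 - l) = 0$ or $l_1 = 1$ and $l^2 - 3l + 2 = 1$ or $l_2 = (3 + \sqrt{5})/2 \approx 2.6180$ and $l_3 = (3 - \sqrt{5})/2 \approx 0.3820$.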
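The chain of this example is easy to reproduce in a few lines of Python. The function name `chain_iteration` is an illustrative assumption; the matrix and the starting vector are those of the example.

```python
def chain_iteration(a, x0, steps):
    """Return x^0, x^1, ..., x^steps with x_i^k = sum_j a_ij x_j^(k-1)."""
    vectors = [list(x0)]
    for _ in range(steps):
        prev = vectors[-1]
        vectors.append([sum(a_ij * x_j for a_ij, x_j in zip(row, prev)) for row in a])
    return vectors


A = [[1, -1, 0], [0, 2, 1], [0, 1, 1]]
vectors = chain_iteration(A, [1, 1, 1], 7)
# Quotients of corresponding components of the last two vectors.
quotients = [vectors[7][i] / vectors[6][i] for i in range(3)]
```

The quotients of the second and third components agree with $(3 + \sqrt{5})/2$ to about three decimal places after seven steps, while the first component converges more slowly.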


Linear inequalities 
In the application of mathematical methods to economics and planning, n-tuples $(x_1, x_2, \ldots, x_n)$ have to be determined from systems of inequalities $y_i = -\sum_j a_{ij}x_j + b_i \geq 0$ for $i = 1, 2, \ldots, m$. If the n-tuple is regarded as a point of an n-dimensional space, then every one of the $m$ inequalities fixes a half-space bordered by a hyperplane $-\sum_j a_{ij}x_j + b_i = 0$. For $m < n$ the hyperplanes intersect in subspaces of dimension at least $n - m$. For the case $m > n$, which is important in practice, the configuration of the intersection contains points that are corner points of the intersection of the half-spaces. This intersection is the required solution region of the given inequalities. It is an n-dimensional convex polyhedron; together with any two points within it or on its boundary it contains all the points of the line joining these two points. A finite polyhedron, which does not contain infinitely distant points, is completely determined by its corner points, or vertices.

Under the assumptions that $m > n$ and that the matrix of the coefficients $(a_{ij})$ has rank $n$ it can be decided by means of an algorithm whether there exist n-tuples that are solutions of the inequalities, and if so, how the corner points of the solution region are obtained.


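Initial information

\[
\begin{array}{c|cccc|c}
y_1 & a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\
y_2 & a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\
\vdots & & & & & \vdots \\
y_m & a_{m1} & a_{m2} & \cdots & a_{mn} & b_m
\end{array}
\]

\[
\begin{aligned}
x_1 &= -b_{11}y_1 - b_{12}y_2 - \cdots - b_{1n}y_n + b_1, \\
x_2 &= -b_{21}y_1 - b_{22}y_2 - \cdots - b_{2n}y_n + b_2, \\
&\;\;\vdots \\
x_n &= -b_{n1}y_1 - b_{n2}y_2 - \cdots - b_{nn}y_n + b_n
\end{aligned}
\qquad
\begin{array}{c|cccc|c}
y_{n+1} & b_{n+1,1} & b_{n+1,2} & \cdots & b_{n+1,n} & b_{n+1} \\
\vdots & & & & & \vdots \\
y_m & b_{m1} & b_{m2} & \cdots & b_{mn} & b_m
\end{array}
\]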


Of the initial information presented in the table, it can be assumed as a result of the assumptions that the first $n$ rows of the coefficients $a_{ij}$ are linearly independent. By Jordan elimination every variable $x_j$ can then be exchanged against a variable $y_i$. One obtains a standard form of the system of inequalities consisting of $n$ equations and a table, from which the variables $y_i$, $i = 1, 2, \ldots, n$ with $y_i \geq 0$ are to be determined in such a way that for $i = n + 1, \ldots, m$ also $y_i \geq 0$. The n-tuple for the required corner point is then given by the $n$ equations.

If in this table $b_i \geq 0$, then the point $y_i = 0$, $i = 1, \ldots, n$, leads to a solution.

If one of the numbers $b_i$ is negative, for example $b_r < 0$, then the point $y_i = 0$, $i = 1, 2, \ldots, n$, does not satisfy the $r$th inequality, since $y_r = b_r < 0$. If in addition for every coefficient of this row $b_{rj} \geq 0$, $j = 1, 2, \ldots, n$, then there is no point $y_i \geq 0$, $i = 1, 2, \ldots, n$, that satisfies this inequality. The given system then has no solution.

If, however, for $b_r < 0$ there exists a coefficient $b_{rs} < 0$ in the $r$th row, then one forms the quotients $b_i/b_{is}$, $i = n + 1, \ldots, m$, with the coefficients of the $s$th column. If besides $b_r/b_{rs}$ there are other non-negative quotients, then one chooses the smallest. If this occurs in the $i_0$th row, then one chooses $b_{i_0 s}$ as exchange element for a Jordan step, which exchanges the variables $y_{i_0}$ with $y_s$. From the equation $y_{i_0} = -\sum_{j \neq s} b_{i_0 j}y_j - b_{i_0 s}y_s + b_{i_0}$ this gives the value $y_s = (-\sum_{j \neq s} b_{i_0 j}y_j - y_{i_0})/b_{i_0 s} + b_{i_0}/b_{i_0 s}$.

The term $b_{i_0}/b_{i_0 s}$ is non-negative. If $i_0 = r$, then by the Jordan step one has achieved that all the elements of the new column 1 are non-negative and a corner point has been found.

If, however, $i_0 \neq r$, then the terms of the other rows $i \neq i_0$ have to be estimated. After the Jordan step they satisfy $-b_{is} \cdot b_{i_0}/b_{i_0 s} + b_i = b_{is}[(b_i/b_{is}) - (b_{i_0}/b_{i_0 s})]$. For rows $i$ with $b_i/b_{is} \geq b_{i_0}/b_{i_0 s} \geq 0$ an improvement is achieved in the case $b_{is} < 0$, since this negative term has a smaller absolute value after the Jordan step. A positive term $b_i > 0$ remains positive. For rows $i$ with $b_i/b_{is} < 0$, in the case $b_{is} < 0$ the term $b_i$ is positive and remains positive; in the case $b_{is} > 0$, $b_i$ was negative, remains negative, and its absolute value even increases.

It can be shown that for the cases $b_i/b_{is} > 0$ but $b_{is} < 0$, for $b_i/b_{is} < 0$ but $b_{is} > 0$, and for other special cases such as $b_{i_0}/b_{i_0 s} = 0$, a table with only positive elements results after finitely many exchange steps, leading to a corner point $(x_1, x_2, \ldots, x_n)$ of the solution region.


Example: To determine one corner point of the solution region of the inequalities

\[
\begin{aligned}
-x_1 + 2x_2 - 3x_3 - 2 &\geq 0, & 4x_1 - x_2 + 4x_3 - 5 &\geq 0, \\
-3x_1 + x_2 - 4x_3 + 3 &\geq 0, & x_1 \geq 0, \quad x_2 \geq 0, \quad x_3 &\geq 0.
\end{aligned}
\]

This system of inequalities is already in standard form. According to the above algorithm one obtains successively the tables


29.6. Nomographical procedures 


Nomograms represent the functional dependence of several variables graphically in such a way 
that the value of one of them can be obtained from the given values of the others by a simple geo- 
metrical construction. 


Nomograms for two variables 


For the functional relationship $y = f(x)$ its graphical representation in a Cartesian coordinate system already forms a nomogram, which consists of the two coordinate axes and, in general, a curve. Its graph can be changed if along the x- and y-axes one plots not multiples of a unit distance, but lengths $\xi = \varphi(x)$ and $\eta = \psi(y)$, respectively, which are given by invertible monotonic functions $\varphi$ and $\psi$. For a suitable choice of these functions one can obtain graph paper in which the scale carrier $\eta = g(\xi)$ represents the given relation $y = \psi^{-1}(g(\varphi(x))) = f(x)$, so that $f = \psi^{-1} g \varphi$, where $\psi^{-1}$ is the inverse function of $\psi$. The scale carrier is a straight line if $\eta = \alpha + \beta\xi$, or $\psi(y) = \alpha + \beta\varphi(x)$.


Examples: 1. For semi-logarithmic paper, $\xi = x$ and $\eta = \log_a y$. Functions $y = K a^{Lx}$ have as scale carrier a straight line with $\alpha = \log_a K$ and $\beta = L$.

2. For doubly logarithmic paper, $\xi = \log_a x$ and $\eta = m \log_a y$. Functions $y = K x^L$ have as scale carrier a straight line with $\alpha = m \log_a K$ and $\beta = mL$.

3. For probability paper, $\xi = x$ and $\eta = F^{-1}(y)$, where $F^{-1}$ is the inverse function of the Gaussian error function $F(w) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{w} \exp[-x^2/2]\,dx$. The scale carrier is then a straight line for the functions $y = F(K + Lx)$, that is, the distribution functions of all normal distributions.
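The straightening effect of semi-logarithmic paper can be checked numerically. The following Python sketch maps a function of the form $y = K a^{Lx}$ to paper coordinates and measures the slope between neighbouring points; the function name `semilog_points` and the values of $K$, $L$ and $a$ are arbitrary illustrative choices.

```python
import math


def semilog_points(K, L, a, xs):
    """Map y = K * a**(L*x) to paper coordinates (xi, eta) = (x, log_a y)."""
    return [(x, math.log(K * a ** (L * x), a)) for x in xs]


points = semilog_points(K=3.0, L=0.5, a=10.0, xs=[0.0, 1.0, 2.0, 3.0])
# Slope between neighbouring points; a straight line has a constant slope beta = L.
slopes = [(points[i + 1][1] - points[i][1]) / (points[i + 1][0] - points[i][0])
          for i in range(len(points) - 1)]
intercept = points[0][1]          # eta at xi = 0, i.e. alpha = log_a K
```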


Double scales are scale carriers on which, corresponding to the x-values of a sufficient number of points, the associated y-values are arranged immediately opposite. One can imagine these value scales to be transferred by parallel projection from the x-axis and the y-axis, so that these axes need no longer be given (Fig.). One obtains a functional scale or a curved scale of a variable $u$, if every point of a curve marked with a parameter value $u$ is determined in a fixed x, y-system by $x = \varphi(u)$ and $y = \psi(u)$. The curve of this function scale then represents the relation between the functions $\varphi(u)$ and $\psi(u)$.


29.6-1 Double scale for the relation between the area of a circle $A = \pi d^2/4$ and its diameter $d$


29.6-2 Scale holder with the equation $(x - y)/2 = \ln[(x + y)/2]$


Example: For the functions $x = \varphi(u) = e^u + u$ and $y = \psi(u) = e^u - u$, because $x + y = 2e^u$ and $x - y = 2u$, one obtains the equation $(x - y)/2 = \ln[(x + y)/2]$ of the scale carrier (Fig.).


Nomograms for three variables 


For a functional relation F(u, v, w) = 0, to be able to read easily the value of one variable from 
those of the two remaining ones, one usually uses collineation nomograms or alignment charts. 


Collineation nomograms. If one regards each of the three variables as a parameter, then by means of six functions $\varphi_i$, $\psi_i$ with $i = 1, 2, 3$ one can find three functional scales in a unique x, y-coordinate system given by the equations $x_1 = \varphi_1(u)$, $y_1 = \psi_1(u)$; $x_2 = \varphi_2(v)$, $y_2 = \psi_2(v)$; $x_3 = \varphi_3(w)$, $y_3 = \psi_3(w)$. To facilitate the reading it will additionally be required of the functions $\varphi_i$, $\psi_i$ that value triples $(u_0, v_0, w_0)$ belonging together in accordance with the equation $F(u_0, v_0, w_0) = 0$ lie on a straight line (Fig.), that is, that the triangle with vertices $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ has zero area. The Soreau equation

\[
\begin{vmatrix}
1 & \varphi_1(u) & \psi_1(u) \\
1 & \varphi_2(v) & \psi_2(v) \\
1 & \varphi_3(w) & \psi_3(w)
\end{vmatrix} = 0
\]

is a necessary and sufficient condition for this.

If one has functions $\varphi_i$, $\psi_i$ that satisfy this equation, then the function scales are determined by $x_i = \varphi_i$ and $y_i = \psi_i$, of course not uniquely, since every transformation of the common coordinate plane that transforms straight lines again into straight lines gives rise to a new collineation chart. Such a transformation can be used to improve the size of the variation intervals of the variables and thus the accuracy of the readings.


29.6-3 The associated value triple $(u_0, v_0, w_0)$ of the collineation nomogram lies on a straight line


Basic forms and scale equations for collineation nomograms. 


(1) The three points $(x_1, y_1)$, $(x_2, y_2)$, $(x_3, y_3)$ lie on a straight line if $(y_1 - y_2)/(x_1 - x_2) = (y_1 - y_3)/(x_1 - x_3)$. If one puts $x_1 = g_1(u)$, $y_1 = f_1(u)$; $x_2 = -g_2(v)$, $y_2 = -f_2(v)$ and $x_3 = -g_3(w)$, $y_3 = -f_3(w)$, then, if no linear relation exists between the $f_i$, $g_i$, three curvilinear scales result and the basic form is

\[
[f_1(u) + f_2(v)]/[g_1(u) + g_2(v)] = [f_1(u) + f_3(w)]/[g_1(u) + g_3(w)].
\]

(2) For $x_1 = 0$ the conditional equation (1) becomes $(y_1 - y_2)/(-x_2) = (y_1 - y_3)/(-x_3)$ or $y_1 = (y_3x_2 - y_2x_3)/(x_2 - x_3)$.

If then the scales $S_1$, $S_2$, $S_3$ are determined by $x_1 = 0$, $y_1 = f_1(u)$; $x_2 = -1/g_2(v)$, $y_2 = f_2(v)/g_2(v)$ and $x_3 = 1/g_3(w)$, $y_3 = f_3(w)/g_3(w)$, then $S_1$ is rectilinear, whereas no linear relations exist between $f_2$, $g_2$ and $f_3$, $g_3$. The basic form is $f_1(u) = [f_2(v) + f_3(w)]/[g_2(v) + g_3(w)]$.

(3) If one substitutes $y_2 = px_2 + q$ into the conditional equation (2), so that $x_1 = 0$, $y_1 = f_1(u)$; $x_2 = -1/g_2(v)$, $y_2 = px_2 + q$; $x_3 = 1/g_3(w)$ and $y_3 = f_3(w)/g_3(w)$, then $S_1$ and $S_2$ are rectilinear, $p$ can be chosen arbitrarily, and only between $f_3$ and $g_3$ there is no linear relation; the basic form is $f_1(u) = [-p + q g_2(v) + f_3(w)]/[g_2(v) + g_3(w)]$.

(4) If through $y_3 = mx_3 + c$ the scale $S_3$ also becomes linear, where $m$ can be freely chosen, then under otherwise identical conditions as in (3) the basic form becomes

\[
f_1(u) = [-p + q g_2(v) + m + c g_3(w)]/[g_2(v) + g_3(w)].
\]

(5) If $x_2 = 1$ is substituted into the conditional equation (2), it becomes $y_1 = (y_3 - y_2x_3)/(1 - x_3)$ or $y_1(1 - x_3) + y_2x_3 - y_3 = 0$. If one now introduces $x_1 = 0$, $y_1 = f_1(u)$; $x_2 = 1$, $y_2 = f_2(v)$ and $x_3 = g_3(w)/[f_3(w) + g_3(w)]$, $y_3 = -h_3(w)/[f_3(w) + g_3(w)]$, then the scales $S_1$ and $S_2$ are linear and these lines are parallel, whereas the scale $S_3$ is rectilinear only if a linear relation exists between $f_3$ and $g_3$. By substitution one obtains the basic form $f_1(u) f_3(w) + f_2(v) g_3(w) + h_3(w) = 0$.

(6) One substitutes $x_1 = a$, $x_2 = b$, $x_3 = c$ into the conditional equation (2) and obtains $y_1 = (by_3 - cy_2)/(b - c)$ or $y_1(b - c) = y_3(b - a) + y_2(a - c)$, that is, $y_3(a - b) = y_1(c - b) + y_2(a - c)$ using condition (1). With $y_1 = f_1(u)/(c - b)$, $y_2 = f_2(v)/(a - c)$ and $y_3 = f_3(w)/(a - b)$ the basic form $f_3(w) = f_1(u) + f_2(v)$ results, where the scales $S_1$, $S_2$, $S_3$ are parallel straight lines.

(7) General investigations have shown that in addition to the ones derived, the following three basic forms are also possible:

\[
f_3(w) = f_1(u)\,f_2(v), \qquad
f_1(u)\,f_2(v)\,f_3(w) = f_1(u) + f_2(v) + f_3(w)
\]

and

\[
f_1(u)\,f_2(v)\,f_3(w) + [f_1(u) + f_2(v)]\,g_3(w) + h_3(w) = 0.
\]

Example: The real solutions of the reduced cubic equation $w^3 - 3pw - 2q = 0$ can be obtained from a collineation nomogram with two rectilinear scales: $x_1 = 0$, $y_1 = -3p$; $x_2 = 1$, $y_2 = -2q$ and $x_3 = 1/(1 + w)$, $y_3 = -w^3/(1 + w)$, as can be seen from the basic form $f_1(p) f_3(w) + f_2(q) g_3(w) + h_3(w) = 0$. From this one can read $f_1(p) = -3p$, $f_2(q) = -2q$, $f_3(w) = w$, $g_3(w) = 1$, $h_3(w) = w^3$ (Fig.).


29.6-4 Nomogram for the real solutions of the equation $w^3 - 3pw - 2q = 0$

Nomograms for more than three variables 


Instead of associating three points, one of each function scale, with one another by means of a 
straight line, one can look for other constructions for the association of a greater number of points, 
for example, one can associate with three points of the unique x, y-plane the centre of the circle 
determined by them, or of a triangle. 

For the functional dependence $F(u_1, u_2, v_1, v_2, w_1, w_2) = 0$ of at most six variables, intercept charts have been constructed, which connect three function lattices by straight lines. By a function lattice one understands two families of coordinate curves, for example, the curves $r = \text{const} = \sqrt{x^2 + y^2}$ and the curves $\varphi = \text{const} = \arctan(y/x)$ of a polar coordinate system. Two values of the variables that are associated by this lattice determine a point with the coordinates $x_1 = \varphi_1(u_1, u_2)$, $y_1 = \psi_1(u_1, u_2)$. For the other two function lattices one then obtains in addition $x_2 = \varphi_2(v_1, v_2)$, $y_2 = \psi_2(v_1, v_2)$ and $x_3 = \varphi_3(w_1, w_2)$, $y_3 = \psi_3(w_1, w_2)$.

It is essential that for a function lattice there is a unique invertible correspondence between the points $(u_1, u_2)$ and $(x_1, y_1)$. As in the case of collineation nomograms one requires that three points $(u_1^0, u_2^0)$, $(v_1^0, v_2^0)$ and $(w_1^0, w_2^0)$ of the three function lattices $(u_1, u_2)$, $(v_1, v_2)$ and $(w_1, w_2)$ that satisfy the given condition $F(u_1^0, u_2^0, v_1^0, v_2^0, w_1^0, w_2^0) = 0$ shall lie on a straight line (Fig.). In the Cartesian coordinate system common to all the function lattices the following condition must then hold:

\[
\begin{vmatrix}
1 & \varphi_1(u_1, u_2) & \psi_1(u_1, u_2) \\
1 & \varphi_2(v_1, v_2) & \psi_2(v_1, v_2) \\
1 & \varphi_3(w_1, w_2) & \psi_3(w_1, w_2)
\end{vmatrix} = 0
\]


29.6-5 Corresponding points of three function lattices lie on a straight line


If the equation to be nomographed contains 5 variables, then 4 variables can be represented by 2 function lattices and one still needs one scale carrier for the fifth variable; for 4 variables, one function lattice and 2 scale carriers are sufficient.

Of course, not every relation in 6 variables can be nomographed by an intercept chart. On the other hand, if one has obtained such a chart for a relation, then by application of an arbitrary transformation of the plane mapping straight lines again into straight lines one can obtain further solutions in the form of intercept charts.


Example: The relation $(a - b)\,\psi_3(w_2) = [a - \varphi_3(w_1)]\,\psi_2(v) + [\varphi_3(w_1) - b]\,\psi_1(u)$ is equivalent to the determinant equation

\[
\begin{vmatrix}
1 & a & \psi_1(u) \\
1 & b & \psi_2(v) \\
1 & \varphi_3(w_1) & \psi_3(w_2)
\end{vmatrix} = 0.
\]

From it one can at once find the scale equations, and the equations of the required function lattices are $x_1 = a$, $y_1 = \psi_1(u)$; $x_2 = b$, $y_2 = \psi_2(v)$; $x_3 = \varphi_3(w_1)$, $y_3 = \psi_3(w_2)$. For the realization ordinary millimetre paper is sufficient in which two straight lines at a distance $b - a$ apart must be numbered according to the functional scales $\psi_1(u)$ and $\psi_2(v)$. The same millimetre paper grid serves for the function lattice. Its axis perpendicular to the scale must be numbered according to the function $\varphi_3(w_1)$ and that parallel to the scale according to $\psi_3(w_2)$.


One often tries to nomograph relations with more than three variables by working with a chain 
of collineation nomograms. In such nomograms, for the given values of two variables one first 
determines from a collineation nomogram a value of an auxiliary variable, for example $\xi$ (Fig.).
For the value so determined and a given value of a third variable one determines from a second 
collineation nomogram a value of a further auxiliary variable, or of the solution variables, as the 
case may be. One continues in this manner, always by calculating values of auxiliary variables, 
until the value of the required variable is obtained from a final collineation nomogram. 


29.6-6 Auxiliary variable $\xi$ in the collineation nomogram for $u_4 = u_1 + u_2 + u_3$


29.7-1 Calculation of the integral $\int_0^1 f(x)\,dx$ by the Monte Carlo method


29.7. Monte-Carlo methods 


The name Monte-Carlo methods is given to all procedures that make use of the concept of randomness for the solution of deterministic problems, for example, the evaluation of integrals, the determination of extreme values, the solution of systems of equations and the solution of ordinary and partial differential equations. The example of the integral $\int_0^1 f(x)\,dx$ used to illustrate the method can be generalized directly to n-fold integrals; indeed, the value of Monte-Carlo methods only becomes evident for multi-dimensional problems. A finite interval of integration $[a, b]$ can be reduced by the linear transformation $u = (x - a)/(b - a)$ to the form $[0, 1]$.


Probability procedure. The function $f(x)$ is assumed to be bounded above and below so that it can be transformed to satisfy the condition $0 \leq f(x) \leq 1$.

The required integral is the value of the area of the region bounded by the curve $f(x)$, the abscissa and possibly also by straight line segments parallel to the ordinate axis (Fig.). If the area and that of the unit square were uniformly covered with mass, for example, cut from cardboard of uniform thickness, then the ratio of the masses of the two could be regarded as an estimate of the integral. If one imagines the two areas uniformly covered by $n$ points of equal mass, and if $n_1$ of them lie within the required area, then by counting one obtains $n_1/n$ as an estimate of the required area.
Here the number of points 7 is at least 10*. Their uniform distribution must be truly random in both 
the x- and the y-directions, so that one deals with $n$ mutually independent trials. Uniformly distributed random numbers make it possible to break off the procedure at a value $n$ for which the successive estimates differ by less than a prescribed limit of accuracy. If $(k - 1)$ is the number of counting steps executed and $I_{k-1}$ their result, then the recursive counting scheme

\[
I_k = I_{k-1} + (\xi_k - I_{k-1})/k = [(k - 1)\,I_{k-1} + \xi_k]/k
\]

has proved useful, where $\xi_k = 1$ if the $k$th point falls in the region of $f(x)$, and $\xi_k = 0$ otherwise.


Mean value procedure. If only uniformly distributed random numbers $x_k$ in the interval $[0, 1]$ are chosen for the argument, and $f(x_k)$ is calculated for each, then the statistical mean $M[f(x)]$, multiplied by the width of the interval 1, is an estimate for the required integral. Because the arithmetic mean is an effective estimate for $M[f(x)]$, one obtains $(1/n) \sum_{k=1}^{n} f(x_k) \approx \int_0^1 f(x)\,dx$. Here the recursive formula $I_k = I_{k-1} + [f(x_k) - I_{k-1}]/k = [I_{k-1}(k - 1) + f(x_k)]/k$ has proved suitable.
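Both procedures, each with its recursive update, can be sketched in Python. The sketch relies on the pseudo-random generator of the standard `random` module; the function names, the integrand $f(x) = x^2$, the seeds and the number of points are illustrative assumptions.

```python
import random


def hit_or_miss(f, n, rng):
    """Probability procedure: fraction of random points (x, y) lying under the curve f."""
    estimate = 0.0
    for k in range(1, n + 1):
        x, y = rng.random(), rng.random()
        hit = 1.0 if y <= f(x) else 0.0        # indicator of a point under the curve
        estimate += (hit - estimate) / k       # I_k = I_(k-1) + (hit - I_(k-1)) / k
    return estimate


def mean_value(f, n, rng):
    """Mean value procedure: running arithmetic mean of f at random arguments."""
    estimate = 0.0
    for k in range(1, n + 1):
        estimate += (f(rng.random()) - estimate) / k
    return estimate


def f(x):
    return x * x                               # integral over [0, 1] is 1/3


area = hit_or_miss(f, 20000, random.Random(1))
mean = mean_value(f, 20000, random.Random(2))
```

With the same number of points the mean value estimate usually scatters less than the counting estimate, because it uses the function values themselves rather than only a yes-or-no decision.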


Generation of uniformly distributed random numbers. When digital computers are used, it is not 
customary to build a random generator into the computer for the generation of random numbers. 
Such a truly random generator has relatively little flexibility if great complications are to be avoided. 
Moreover, the stationariness of the generated sequence of random numbers is not guaranteed over 
long periods of time, that is, their statistical properties change in the course of the test. 

For this reason the random numbers are generated from deterministic recursive formulae. So that these pseudo-random numbers differ as little as possible from a sequence of truly random numbers one requires a quasi-independence of successive pseudo-random numbers and the non-occurrence of periodic number sequences. Recursive formulae based on elementary number theory have proved to be particularly favourable.

1. Reduced Fibonacci numbers $F_n = [F_{n-1} + F_{n-2}] \bmod m$;

2. $y_{n+1} = [(2^r + 1)\,y_n + c] \bmod 2^m$ with $r > 2$ and $c$ even;

3. $y_n = s \cdot y_{n-1} \bmod m$, for example, $s = 23$ and $m = 10^8 + 1$;

4. $y_n = \sum_{j=1}^{r} c_j y_{n-j} \bmod m$, where the $y_0, y_1, \ldots, y_{r-1}$ are suitable numbers between 0 and $m - 1$ and the $c_j$ for $j = 1, 2, \ldots, r$ are suitable constants.


Since all these pseudo-random numbers are determined modulo $m$ or modulo $2^m$, they are reduced by $x_n = y_n/m$ or $x_n = y_n/2^m$ to numbers in the interval $[0, 1]$.
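The multiplicative formula of item 3, together with the reduction to the unit interval, can be sketched in Python. The sketch reads the example values as $s = 23$ and $m = 10^8 + 1$; that reading, the function name `multiplicative_sequence` and the starting value 12345 are assumptions made for illustration.

```python
def multiplicative_sequence(y0, count, s=23, m=10**8 + 1):
    """Generate y_n = s * y_(n-1) mod m for n = 1, ..., count."""
    ys, y = [], y0
    for _ in range(count):
        y = (s * y) % m
        ys.append(y)
    return ys


ys = multiplicative_sequence(12345, 3)      # hypothetical starting value y_0 = 12345
xs = [y / (10**8 + 1) for y in ys]          # reduction x_n = y_n / m
```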


30. Mathematical optimization 


30.1. Linear optimization
30.2. Non-linear optimization
30.3. Dynamic optimization


Optimization problems were already formulated by EUCLID, but only with the development of 
the differential calculus and the calculus of variations in the 17th and 18th centuries was a mathemat- 
ical tool forged for the solution of such problems. Optimization problems in economics are extreme 
value problems with auxiliary conditions, which are often characterized by the fact that the number 
of variables is very large and that non-negative solutions are sought. 




In general, an economic occurrence to be studied is regarded as a process composed of different activities, and it is the aim to obtain by abstraction a corresponding mathematical model. For every activity several variants exist and their realization is dependent on constraints $g_i(x_j) = 0$ imposed by the available capacity, so that not for every activity can the most favourable variant be chosen. Instead a combination of possible variants is sought for which a given objective function $f(x_j)$ for the overall process assumes a maximum or minimum value; $g_i(x_j) = 0$ and $x_j \geq 0$ are called constraints.


For general non-linear optimization there are no restrictions on the given functions $f$ and $g_i$; for quadratic optimization, $f(x_j)$ is quadratic in the $x_j$ and $g_i(x_j)$ is linear; for linear optimization, $f$ and $g_i$ are linear functions.


30.1. Linear optimization 


In optimization it is usual to define the relation < 'smaller than' also for matrices $A = (a_{ij})$ and $B = (b_{ij})$ of the same order (see Chapter 17.) by setting $A \leq B$ or $A < B$ if and only if $a_{ij} \leq b_{ij}$ or $a_{ij} < b_{ij}$, respectively, for each $i$ and $j$. Correspondingly $A > B$ or $A \geq B$ are defined by $a_{ij} > b_{ij}$ or $a_{ij} \geq b_{ij}$, respectively. It is worth noticing that there may be two matrices of the same order for which none of the three relations <, >, = holds, whereas for two rational or real numbers in any case exactly one of these relations holds.

If $c$ and $x$ are matrices of order $n \times 1$ with $n$ rows and 1 column, $A = (a_{ij})$ a matrix of order $m \times n$, $b$ one of order $m \times 1$, $O$ the zero matrix of order $m \times n$, $o$ the zero matrix of order $n \times 1$, and $c^T$ the transpose of the matrix $c$, obtained from $c$ by interchanging its rows and columns, then for linear objective functions and constraints, $f(x_j)$ can be represented by $c^T x$ and $g_i(x_j) = 0$ by $Ax = b$. When one puts $c = -d$, $A = -B$ and $b = -h$, the problem of maximization changes into one of minimization. For a geometrical interpretation, $x$ may be regarded as a vector in n-dimensional Euclidean space $R_n$.

According to the elements of the matrices $c$, $A$, $b$ or $d$, $B$, $h$, various problems can be distinguished: deterministic problems if these coefficients are known constants, parametric problems if the coefficients (or some of them) can vary over known intervals, and stochastic problems if the coefficients (or some of them) are random variables.


Examples: 1. Maximum gain. If the components $x_i$ of $x$ are the piece numbers of a product or commodity in a manufacturing process and $c_i$ the yield corresponding to one piece of the product $i$, then $x$ represents a manufacturing programme and $c^T x$ its total yield, such as, for example, gain or proceeds in foreign exchange. Further, if $k$ is one of the $m$ activities, for example, a group of machines, $b_k$ the available capacity (store capital), and if the coefficients $a_{ki}$ of the matrix $A$ represent the level of activity per piece of commodity, then the problem of maximizing the manufacturing programme is to calculate the maximum proceeds, taking into account the availability of the given capacities. The assumptions on which the model is based imply that both the proceeds and activities are proportional to the amount of production and that the vector $x$ of the piece numbers can have only integral components $x_i \geq 0$. It is further assumed that the demand for the commodity is unlimited. If this is not the case, limits $d_i$ in the sales can be introduced by the additional constraints $x_i \leq d_i$.

2. Diet problem. Let $i$ be a nutriment of which the amount $x_i$ is contained in a food combination to be determined, and let $d_i$ be the cost per unit amount. Let $k$ be a vitamin or nutrient substance of which the minimum amount $h_k$ must occur and of which the nutriment $i$ contains an amount $b_{ki}$. The solution leads to a minimization problem, which in this simple form, however, is applicable only to the cheapest combinations of animal feeding stuff. A different model for minimizing the cost of a food sequence of a hotel is obtained by additional refined and detailed assumptions, such as the daily distribution of meals, that is, breakfast, luncheon and dinner, the structure of a meal, for example, hors d'oeuvre, main dish and dessert, the offer of a choice, for example, of three dishes, and a smallest period of sequence of dishes, for example, of two weeks.


The maximization problem in linear optimization was formulated in 1939 by KANTOROVICH, who 
solved it by the method of solution factors. The diet problem was solved approximately in 1941 
by CORNFIELD and in 1945 by STIGLER. The problem of linear optimization, which was formulated 
quite generally by WOOD and DANTZIG, was solved by DANTZIG by the simplex method, which has
been further developed in many respects. 




Simplex method. For the linear optimization the constraints consist of $Ax \le b$ and $x \ge o$, or
$a_{i1}x_1 + \dots + a_{ij}x_j + \dots + a_{in}x_n \le b_i$ and $x_j \ge 0$ for $j = 1, 2, \dots, n$ and $i = 1, 2, \dots, m$ (see
Chapter 29.). Just as the condition $2x_1 + 3x_2 \le 4$ defines a closed half-plane, so exactly
$(n + m)$ closed half-spaces are determined by the above constraints if $x$ is interpreted as a point
or vector in an $n$-dimensional space $R_n$. If the $(n + m)$ constraints are consistent, then the
intersection $R$ of the $(n + m)$ half-spaces contains at least one point. Every point of $R$ is a feasible solution
or a feasible vector.

The feasible region $R$ of the problem, as an intersection of $(n + m)$ half-spaces, is a convex
polyhedron (Fig.). It is assumed that $R$ is not empty and is bounded.

The objective function $f(x) = c^T x$ can be interpreted geometrically by considering the surface
$f(x) = \text{const}$, which represents a family of parallel hyperplanes $c^T x = k$ in $R_n$. It is required
to find the hyperplane with the greatest $k$ having a non-empty intersection with the convex
polyhedron $R$. Clearly this forms a plane of support of $R$ for this family, that is, a hyperplane that
has a point in common with $R$ while all of $R$ lies on one side of it. Accordingly the maximum of $f$
in $R$ can occur only at boundary points.

With the above assumptions, $R$ is the convex hull of its vertices. Let $x^l$ ($l = 1, \dots, s$) be the
vertices of $R$, of which there are at most $\binom{n+m}{n}$. Then every $x \in R$ can be represented by
$x = \sum_{l=1}^{s} \lambda_l x^l$ with $\lambda_l \ge 0$ and $\sum_{l=1}^{s} \lambda_l = 1$. But it then follows that

$f(x) = c^T x = c^T \left( \sum_{l=1}^{s} \lambda_l x^l \right) = \sum_{l=1}^{s} \lambda_l c^T x^l = \sum_{l=1}^{s} \lambda_l f(x^l)$.

Among the $s$ values $f(x^l)$ there is a greatest, say $f(x^{l_0})$. It is then certainly true that

$f(x) = \sum_{l=1}^{s} \lambda_l f(x^l) \le \sum_{l=1}^{s} \lambda_l f(x^{l_0}) = f(x^{l_0})$.

When $R$ is bounded and non-empty, the optimization problem thus reduces
to the determination of the vertices $x^l$ of $R$. In any case the solution is to be found among them.

30.1-1 Geometrical representation of a maximum problem in a two-dimensional space $R_2$; $R$ is the
feasible region, $c^T x = k_{\max}$ a plane of support, $x^0$ a feasible basic solution


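The $m$ inequalities $\sum_{j=1}^{n} a_{ij}x_j \le b_i$ can be written in the form of equations
$\sum_{j=1}^{n} a_{ij}x_j + x_{n+i} = b_i$ by introducing the $m$ (so-called) slack variables
$\bar{x} = (x_{n+1}, \dots, x_{n+m})^T$. If $I$ is the unit matrix, one obtains a further form of the LO-problem:

$\max \{ c^T x + o^T \bar{x} \mid Ax + I\bar{x} = b,\ x \ge o,\ \bar{x} \ge o \}$.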


For the sake of simplicity one writes again $\max \{ c^T x \mid Ax = b,\ x \ge o \}$ (with suitably enlarged
matrices), where $A$ is of order $m \times (m + n)$ and $x$ is of order $(n + m) \times 1$. It can be assumed that
$A$ is of rank $m$, since otherwise either the equations $Ax = b$ would be inconsistent and no feasible
vector would exist, or some of the equations would be superfluous, being linear combinations of
the remaining ones.

A vector x having exactly m positive components that belong to m linearly independent columns 
of the matrix A is called a feasible basic solution. 


The feasible basic solutions are precisely the vertices of the feasible region R. 


For the analytical proof of this theorem one uses the convex linear combination
$x = \lambda x^1 + (1 - \lambda) x^2 = \lambda (x^1 - x^2) + x^2$ with $0 < \lambda < 1$, which determines intermediate points $x$
on the straight line segment connecting the points $x^1$ and $x^2$. The vertices of $R$ alone cannot be
represented by a convex linear combination of two different points of $R$. If $A$ has $m$ linearly
independent columns $a_1, \dots, a_m$ and if a feasible basic solution is $x^1$ with $x^1_1 > 0, \dots, x^1_m > 0$,
$x^1_{m+1} = x^1_{m+2} = \dots = x^1_{m+n} = 0$, then a convex linear combination $x^1 = \lambda x^2 + (1 - \lambda) x^3$ with two
different feasible points $x^2$ and $x^3$ is impossible. Since $x^2 \ge o$ and $x^3 \ge o$, the vanishing components of
$x^1$ force $x^t_{m+r} = 0$ for $t = 1, 2, 3$ and $r = 1, \dots, n$. It would therefore follow from $Ax^2 = b$ and
$Ax^3 = b$ that $A(x^2 - x^3) = o$, which, because $a_1, \dots, a_m$ are linearly independent, has only the
trivial solution $x^2 - x^3 = o$; but this means that $x^1$ must be a vertex.

On the other hand, if it is assumed that $x^1$ is a vertex with positive components $x^1_1, \dots, x^1_k$, then
the corresponding columns $a_1, \dots, a_k$ of $A$ must be linearly independent. Since $A$ has $m$ rows,
$k \le m$ must hold, and it follows that $x^1$ is a feasible basic solution. For if the columns $a_1, \dots, a_k$
were linearly dependent, then numbers $y_1, \dots, y_k$, not all zero, could be found such that
$\sum_{j=1}^{k} y_j a_j = o$ and for $\gamma > 0$ also $\gamma \sum_{j=1}^{k} y_j a_j = o$. Consequently, since
$\sum_{j=1}^{k} x^1_j a_j \pm \gamma \sum_{j=1}^{k} y_j a_j = b$, by choosing a sufficiently small number $\gamma$, two vectors
$x^2 = (x^1_1 + \gamma y_1, \dots, x^1_k + \gamma y_k, 0, \dots, 0)$ and $x^3 = (x^1_1 - \gamma y_1, \dots, x^1_k - \gamma y_k, 0, \dots, 0)$
could be constructed whose first $k$ components are positive. But because of the representation
$x^1 = x^2/2 + x^3/2$ with $\lambda = 1/2$, $x^1$ could then not be a vertex, contrary to the first assumption.

The degenerate case k < m is possible, but will be excluded here in these considerations. The 
degenerate case can be dealt with by means of the simplex method without special difficulties. 
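
Because the feasible basic solutions are exactly the vertices of $R$, a very small problem can be solved by simply trying every choice of $m$ columns of the enlarged matrix $(A, I)$. The following Python sketch illustrates this under stated assumptions: the names `solve_square` and `best_vertex` are hypothetical helpers introduced here, the problem is given as $\max \{ c^T x \mid Ax \le b,\ x \ge o \}$ with a bounded, non-empty feasible region, and exact fractions are used to avoid rounding questions. It is an illustration of the theorem, not the simplex method.

```python
from fractions import Fraction
from itertools import combinations

def solve_square(M, rhs):
    """Solve M z = rhs by Gauss-Jordan elimination; return None if M is singular."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # the chosen columns are linearly dependent
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * t for a, t in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def best_vertex(A, b, c):
    """Enumerate the feasible basic solutions of Ax + I*slack = b; return (value, x)."""
    m, n = len(A), len(c)
    # Columns of the enlarged matrix (A, I) and the enlarged cost vector (c, 0).
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    cols += [[1 if i == k else 0 for i in range(m)] for k in range(m)]
    cost = list(c) + [0] * m
    best = None
    for basis in combinations(range(n + m), m):
        M = [[cols[j][i] for j in basis] for i in range(m)]
        z = solve_square(M, b)
        if z is None or any(v < 0 for v in z):
            continue  # not a basis, or the basic solution is infeasible
        x = [Fraction(0)] * (n + m)
        for j, v in zip(basis, z):
            x[j] = v
        value = sum(cj * xj for cj, xj in zip(cost, x))
        if best is None or value > best[0]:
            best = (value, x[:n])
    return best

# Data of Example 3 below: maximize x1 + 4*x2 with 2*x1 + 3*x2 <= 4, 3*x1 + x2 <= 3.
print(best_vertex([[2, 3], [3, 1]], [4, 3], [1, 4]))
```

The number of candidate bases grows like $\binom{n+m}{m}$, which is why this enumeration is only of use for illustration.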


Thus, $R$ has at most $\binom{n+m}{n} = \binom{n+m}{m}$ vertices. Among these finitely many vertices or feasible
basic solutions, the one with the greatest function value $k_{\max}$ of the objective function must be
determined. This point is not necessarily uniquely determined. This is the case, for example, when
the boundary of $R$ has an intersection of dimension $d \ge 1$ with the required hyperplane $c^T x = k_{\max}$.

If one assumes that the first $m$ columns of $A$ are linearly independent and denotes the matrix formed
by these columns by $A_1$ and the remainder by $A_2$, then $A = (A_1, A_2)$, where $A_1$ is non-singular
and of order $m \times m$ and $A_2$ is of order $m \times n$. Similarly one partitions
$c = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}$, $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$,
where $c_1$ and $x_1$ each consists of the first $m$ components. The equation $Ax = A_1 x_1 + A_2 x_2 = b$
can then be solved for $x_1$, giving $x_1 = A_1^{-1} b + A_1^{-1} A_2 (-x_2)$. Assuming that $A_1^{-1} b \ge o$, together
with $x_2 = o$, this gives a feasible basic solution $x^1$. Substitution into the objective function gives

$f(x) = c_1^T A_1^{-1} b + [c_1^T A_1^{-1} A_2 - c_2^T] (-x_2)$.

For $x_2 = o$ the value of the objective function becomes $f(x^1) = c_1^T A_1^{-1} b$. These relations are presented
in the so-called simplex tableau.


The feasible basic solutions stand in 
the first column, and the last row 
gives the values of the corresponding 
objective function. 


Three mutually exclusive cases may be distinguished in the simplex tableau: 


1. The $n$ elements of $c_1^T A_1^{-1} A_2 - c_2^T$ are non-negative. In this case the present feasible basic
solution is optimal, because if any element of $x_2$ is made positive, then the value of the objective
function can only decrease or remain unchanged.


2. $c_1^T A_1^{-1} A_2 - c_2^T$ contains a negative element, say the $k$th; suppose that all the elements of the
$k$th column of $A_1^{-1} A_2$ are non-positive. The $k$th component of $x_2$ can then be arbitrarily increased.
If because of $x_1 = A_1^{-1} b + A_1^{-1} A_2 (-x_2)$ the components of $x_1$ are changed at the same time, one
always obtains feasible solutions for which the objective function increases beyond all bounds;
$f(x)$ is not bounded in the feasible region and the problem has no solution.


3. The $k$th element of $c_1^T A_1^{-1} A_2 - c_2^T$ is again negative, but for each such $k$ the $k$th column of
$A_1^{-1} A_2$ contains at least one positive element. In this case, too, one can increase the objective
function by enlarging the $k$th component of $x_2$. One may do this, however, only until the first of the
decreasing components of $x_1 = A_1^{-1} b + A_1^{-1} A_2 (-x_2)$ assumes the value zero. The remaining (changed)
components of $x_1$ and the $k$th component of $x_2$ determined in this way form a new feasible basic
solution with a greater value for the objective function. The linear independence of the corresponding
columns of $A$ can be deduced. Since only finitely many feasible basic solutions exist and since, because
of the increase of the objective function at every simplex step, one always obtains new basic solutions, one
arrives after finitely many steps at case 1 (optimal solution) or case 2.
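
The three cases can be traced in a compact tableau implementation. The following Python sketch is one possible arrangement, not the tableau layout of the text: it assumes the problem $\max \{ c^T x \mid Ax \le b,\ x \ge o \}$ with $b \ge o$, so that the slack variables supply the first basis; it keeps the slack columns in the tableau; it picks the most negative element of the objective row as the entering variable; and it has no safeguard for the degenerate case. The function name `simplex_max` is a hypothetical choice.

```python
from fractions import Fraction

def simplex_max(A, b, c):
    """Maximize c^T x subject to Ax <= b, x >= 0, assuming b >= 0.

    Returns (value, x), or None if the objective is unbounded (case 2).
    """
    m, n = len(A), len(c)
    # Constraint rows with slack columns appended; the last entry is the right-hand side.
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(i == k)) for k in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    # Objective row: the elements c1^T A1^{-1} A2 - c2^T, with f in the last entry.
    z = [Fraction(-v) for v in c] + [Fraction(0)] * (m + 1)
    basis = list(range(n, n + m))  # the slack variables form the first basis
    while True:
        k = min(range(n + m), key=lambda j: z[j])
        if z[k] >= 0:
            # Case 1: no negative element, the present basic solution is optimal.
            x = [Fraction(0)] * (n + m)
            for i, j in enumerate(basis):
                x[j] = T[i][-1]
            return z[-1], x[:n]
        rows = [i for i in range(m) if T[i][k] > 0]
        if not rows:
            # Case 2: the k-th column has no positive element, f is unbounded.
            return None
        # Case 3: the basic variable that first reaches zero leaves the basis.
        r = min(rows, key=lambda i: T[i][-1] / T[i][k])
        p = T[r][k]
        T[r] = [v / p for v in T[r]]
        for i in range(m):
            if i != r and T[i][k] != 0:
                factor = T[i][k]
                T[i] = [a - factor * t for a, t in zip(T[i], T[r])]
        factor = z[k]
        z = [a - factor * t for a, t in zip(z, T[r])]
        basis[r] = k

# Data of Example 3 below.
print(simplex_max([[2, 3], [3, 1]], [4, 3], [1, 4]))
```

For the data of Example 3 a single simplex step suffices, and the objective row then holds the elements 5/3 and 4/3 that appear in the worked example.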


Derivation of a first feasible basic solution. If $b \ge o$, then introducing slack variables and starting
from the problem $\max \{ c^T x + o^T \bar{x} \mid Ax + I\bar{x} = b,\ x \ge o,\ \bar{x} \ge o \}$ with $x = o$, $\bar{x} = b$ one obtains a
feasible basic solution. If no slack variables can be introduced, then one can definitely achieve $b \ge o$,
by multiplying equations by $-1$ where necessary (and in every practical problem even $b > o$). By means
of so-called artificial variables $y = (y_1, \dots, y_m)$ one then first of all solves the problem

$\min \{ \sum_{j=1}^{m} y_j \mid Ax + Iy = b,\ x \ge o,\ y \ge o \}$

for which $y = b$ provides a first feasible basic solution. If the minimum of $\sum_{j=1}^{m} y_j$ is positive, then the
initial problem has no feasible solution. If the minimum is zero, then the optimal solution $(x^0, y^0)$
$= (x^0, o)$ of the last problem is a feasible basic solution of the initial problem. For a computation
programme one chooses the last described procedure, which is problem independent.

Duality. The problem $\min \{ b^T y \mid A^T y \ge c,\ y \ge o \}$ is called the dual problem to the problem
$\max \{ c^T x \mid Ax \le b,\ x \ge o \}$. The latter is also known as the primal problem. Here $y$ is a matrix of
order $m \times 1$ or a vector in $R_m$. A vector $y$ with $A^T y \ge c$ and $y \ge o$ is called feasible.




If $x$ and $y$ (primal and dual, respectively) are feasible, then $c^T x \le b^T y$.


Proof: $x$ feasible means that $Ax \le b$ and $x \ge o$; $y$ feasible means that $A^T y \ge c$ and $y \ge o$. Thus,
$c^T x \le (A^T y)^T x = (y^T A) x = y^T (Ax) \le y^T b = b^T y$.
From this it follows without difficulty that:


If $x^0$ and $y^0$ are feasible and if $c^T x^0 = b^T y^0$, then $x^0$ and $y^0$ are optimal for the primal and dual
problem, respectively.


Proof: By the above theorem, for every feasible $x$, $c^T x \le b^T y^0 = c^T x^0$, and on the other hand,
for every feasible $y$, $b^T y \ge c^T x^0 = b^T y^0$. The first inequality shows that $x^0$ is the solution of the
primal problem and the second that $y^0$ is the solution of the dual problem.

GALE, KUHN and TUCKER have proved the following duality theorem.


Duality theorem: $x^0$ is a solution of the primal problem if and only if there exists a feasible $y^0$
such that $c^T x^0 = b^T y^0$; $y^0$ is a solution of the dual problem if and only if there exists a feasible $x^0$
such that $b^T y^0 = c^T x^0$. The primal and the dual problem are soluble if and only if both have
feasible vectors.

These statements are of use especially if a problem can be solved only approximately and one 
would like to have an estimate of how far one is away from the optimum. This can also be important 
when a computer is used and to keep computation costs within economic limits one has to break 
off the computations. 
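
The estimate mentioned here is the difference $b^T y - c^T x$ for any pair of feasible vectors: by the inequality $c^T x \le b^T y$ it bounds from above how far $c^T x$ is from the optimum. A minimal Python sketch, assuming the primal problem is given in the form $\max \{ c^T x \mid Ax \le b,\ x \ge o \}$; the function name `duality_gap` is a hypothetical choice.

```python
from fractions import Fraction

def duality_gap(A, b, c, x, y):
    """Return b^T y - c^T x for a primal-feasible x and a dual-feasible y.

    Raises ValueError if x or y violates its constraints.
    """
    m, n = len(A), len(c)
    primal_ok = all(v >= 0 for v in x) and all(
        sum(A[i][j] * x[j] for j in range(n)) <= b[i] for i in range(m))
    dual_ok = all(v >= 0 for v in y) and all(
        sum(A[i][j] * y[i] for i in range(m)) >= c[j] for j in range(n))
    if not (primal_ok and dual_ok):
        raise ValueError("x or y is not feasible")
    return (sum(bi * yi for bi, yi in zip(b, y))
            - sum(cj * xj for cj, xj in zip(c, x)))

# Constraints 2*x1 + 3*x2 <= 4, 3*x1 + x2 <= 3 and objective x1 + 4*x2.
A, b, c = [[2, 3], [3, 1]], [4, 3], [1, 4]
print(duality_gap(A, b, c, [0, 1], [2, 0]))
print(duality_gap(A, b, c, [0, Fraction(4, 3)], [Fraction(4, 3), 0]))
```

A gap of zero certifies that both vectors are optimal; a positive gap tells one at most how much could still be gained by continuing the computation.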

Example 3: To maximize $x_1 + 4x_2$ under the constraints $2x_1 + 3x_2 \le 4$, $3x_1 + x_2 \le 3$,
$x_1 \ge 0$, $x_2 \ge 0$. Using the slack variables $x_3$ and $x_4$ these and the objective function
$f(x)$ become the equations

$1 \cdot x_3 + 0 \cdot x_4 + 2x_1 + 3x_2 = 4$,
$0 \cdot x_3 + 1 \cdot x_4 + 3x_1 + 1x_2 = 3$,
$0 \cdot x_3 + 0 \cdot x_4 + x_1 + 4x_2 = f(x)$.

From these one obtains:

$A = \begin{pmatrix} 1 & 0 & 2 & 3 \\ 0 & 1 & 3 & 1 \end{pmatrix}$, $A_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$, $A_2 = \begin{pmatrix} 2 & 3 \\ 3 & 1 \end{pmatrix}$,
$c_1^T = (0, 0)$, $c_2^T = (1, 4)$, $x_1^T = (x_3, x_4)$, $x_2^T = (x_1, x_2)$

and hence the new equations

$x_3 = 4 - 2x_1 - 3x_2$,
$x_4 = 3 - 3x_1 - x_2$,
$f(x^1) = 0 + [0 - 1] (-x_1) + [0 - 4] (-x_2)$

and the simplex tableau $S_1$:

                  -x_1    -x_2
    x_3 =    4      2       3
    x_4 =    3      3       1
    f(x) =   0     -1      -4

Since the inclusion of $x_2$ into the basis results in the greatest increase in the value of the objective
function, $x_2$ will be used. Through the equation $x_3 = 4 - 2x_1 - 3x_2$, $x_2 = 4/3$ is
determined, since this implies $x_3 = 0$, which thus leaves the basis. The modified equations for
$x_2$, $x_4$ and the value of $A_1^{-1} A_2$ are

$x_2 = 4/3 - 2x_1/3 - x_3/3$,
$x_4 = 5/3 - 7x_1/3 + x_3/3$,
$A_1^{-1} A_2 = \begin{pmatrix} 2/3 & 1/3 \\ 7/3 & -1/3 \end{pmatrix}$.

From these, as well as $c_1^T = (4, 0)$ and $c_2^T = (1, 0)$, a new form of the objective
function $f(x) = 4x_2 + 0 \cdot x_4 + x_1 + 0 \cdot x_3$ can be determined:

$f(x^2) = 16/3 + [8/3 - 1] (-x_1) + [4/3 - 0] (-x_3)$.

One obtains the simplex tableau $S_2$:

                  -x_1    -x_3
    x_2 =   4/3    2/3     1/3
    x_4 =   5/3    7/3    -1/3
    f(x) = 16/3    5/3     4/3

The optimal solution of the problem occurs for $x_2 = 4/3$,
$x_4 = 5/3$, $x_1 = x_3 = 0$. In Fig. 30.1-2 the feasible region $R$ of the original problem without
slack variables is shown, as well as the straight line $x_1 + 4x_2 = 16/3$.

30.1-2 Solution of a primal problem with the objective function $f(x) = x_1 + 4x_2$


Shadow prices. In the form of the objective function that corresponds to the optimal solution,
the coefficients $c_1^T A_1^{-1} A_2 - c_2^T$ belonging to the slack variables form the solution vector $y^0$ of the
dual problem. In the example $y^{0T} = (4/3, 0)$, and hence $b^T y^0 = 16/3 = f(x^0)$. The components of
this dual solution vector are also called shadow prices. They indicate the extent to which the objective
function is increased when the corresponding component of $b$ is increased by unity. In the example
an increase of $b_2 = 3$ would not achieve anything, since in the optimal solution $x_4 > 0$, and
accordingly $3x_1 + x_2 \le 3$ is not affected. On the other hand, an increase of 4/3 would result if
$b_1 = 4$ were replaced by $b_1 = 5$, as can easily be verified.
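
The verification can be written out as follows. This is a worked check under the assumption that the basis $(x_2, x_4)$ of the optimal solution stays feasible after the change, which holds here because $x_4$ remains positive; the elements $5/3$ and $4/3$ of the objective row do not depend on $b$, so the basis also remains optimal.

```latex
\begin{aligned}
b_1 = 5:\quad & x_2 = \tfrac{5}{3}, \quad x_4 = 3 - \tfrac{5}{3} = \tfrac{4}{3} > 0, \quad x_1 = x_3 = 0, \\
              & f = 4 \cdot \tfrac{5}{3} = \tfrac{20}{3} = \tfrac{16}{3} + \tfrac{4}{3} = f(x^0) + y^0_1 ; \\
b_2 = 4:\quad & x_2 = \tfrac{4}{3}, \quad x_4 = 4 - \tfrac{4}{3} = \tfrac{8}{3} > 0, \quad x_1 = x_3 = 0, \\
              & f = 4 \cdot \tfrac{4}{3} = \tfrac{16}{3} = f(x^0) + y^0_2 \quad \text{with } y^0_2 = 0 .
\end{aligned}
```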


Application of the simplex method. The simplex method has been improved in many respects 
with a view to reducing the rounding error, the required computer store and computer time. 

LEMKE in 1954 developed the dual simplex method, in which he solved the primal problem via 
the solution of the dual problem. To save computer time one has also combined the primal and dual 
simplex. The revised simplex method is frequently used in conjunction with a representation in product 
form of the inverse matrix. For extensive problems one reverts by re-inversion to the data of the
original matrix after a certain number of simplex steps, to reduce the rounding error. 

For primal problems with an upper bound for the variables ($x \le d$) DANTZIG developed a special
algorithm whereby the amount of computation is comparable to that of the problem without such 
bounds. Finally, for problems with a special matrix structure, where only the cross-hatched area 
is occupied with non-zero elements (Fig.), DANTZIG and WOLFE developed in 1960 a decomposition 
method, in which the total problem is decomposed into a main and several auxiliary problems. In 
this way problems with 32000 constraints and 2 million variables have already been solved in 
justifiable computer times before 1963. 

The method of the fictitious game suggested by BROWN and ROBINSON, which requires less store 
space than the simplex method, converges too slowly to become practicable. 

The transport problem, which is an important special case of the primal problem, was formulated 
independently by HITCHCOCK in 1941 and KANTOROVICH in 1942. 


30.1-3 Decomposition method; scheme of a matrix in which only the elements indicated are
different from zero

30.1-4 Transport optimization


The following meaning can be ascribed to it: For a certain commodity there exist suppliers $L_i$
with supply capacities $a_i > 0$ and consumers $V_j$ with requirements $b_j > 0$. To transport one
unit of the commodity from the supplier $L_i$ to the consumer $V_j$ costs $c_{ij}$ units of money. The quantities
$x_{ij}$, which are transported from $L_i$ to $V_j$ (Fig.), are to be determined in such a way that the total
transport is as cheap as possible. The matrix $A$ now has a special structure such that integral values
of $b_j$ and $a_i$ result in integral solutions for $x_{ij}$. Moreover, this special problem always has a solution.
It has been used in practice very effectively and in many ways. Apart from the usual form of the simplex
algorithm, one should mention the solution procedure by the Hungarian method due to KUHN, the
stepping-stone method of CHARNES and COOPER and a method of FORD and FULKERSON, which
makes use of the determination of the maximum flow in a directed graph for the solution of the
problem. In this way one can solve the transport problem, also taking into account the limitations
of the capacity of the transport routes.

Further generalizations distinguish between several transport stages or examine the transport 
of several commodities. 


Example 4: Let $b_1 = 4$, $b_2 = 8$, $b_3 = 2$, $b_4 = 8$ be the requirements of 4 consumers and
$a_1 = 10$, $a_2 = 7$, $a_3 = 5$ the capacity of 3 suppliers, so that $\sum a_i = \sum b_j = 22$. Should one of the
cases $\sum a_i = a < b = \sum b_j$ or $a > b$ occur, one assumes a fictitious supplier for the amount $(b - a)$ or
a fictitious consumer for $(a - b)$, as the case may be. The coefficients $c_{ij}$ of the matrix are the
costs per piece for the transport from $i$ to $j$.




By the North-West corner rule one attempts, starting from the North-West corner on the top
left-hand side, to satisfy the given requirements according to the sums of $b_j$ and $a_i$, respectively, from
field to field in a maximal manner. The sequence of the fields with decision is indicated by red
numbers, and those fields where at the same time the piece number has also been decided are
marked by red arrows.


[Tables: North-West corner rule; Minimum matrix procedure]

Using the minimum matrix procedure one also satisfies maximally according to the sums $a_i$ and $b_j$,
but the fields with decision are fixed in each case according to the smallest value of $c_{ij}$, in order
to save transportation costs. As is to be expected, the value of the objective function is now
smaller; the North-West corner rule gives $f_1 = 162$, and the matrix minimum $f_2 = 120$. For the
optimal solution $x_{13} = 2$, $x_{14} = 8$, $x_{22} = 7$, $x_{31} = 4$ and $x_{32} = 1$ one obtains $f_{\min} = 78$. The
solution is degenerate, since only $5 = n + m - 2$ positive components occur, whilst because of
the additional condition $a = b$, a total of $n + m - 1 = 6$ positive components is to be expected.
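
The North-West corner rule itself uses only the supplies and requirements, not the costs. The following Python sketch applies it to the values $a_i$ and $b_j$ of Example 4; the function name `north_west_corner` and the zero-based indexing of suppliers and consumers are assumptions of this illustration, and balanced totals are required.

```python
def north_west_corner(supply, demand):
    """Build a first feasible transport plan by the North-West corner rule.

    Requires sum(supply) == sum(demand); returns a dict {(i, j): amount}
    with zero-based supplier index i and consumer index j.
    """
    a, b = list(supply), list(demand)
    if sum(a) != sum(b):
        raise ValueError("total supply must equal total demand")
    plan, i, j = {}, 0, 0
    while i < len(a) and j < len(b):
        amount = min(a[i], b[j])  # satisfy the current field maximally
        if amount > 0:
            plan[(i, j)] = amount
        a[i] -= amount
        b[j] -= amount
        # Move down when the supplier is exhausted, otherwise move right.
        if a[i] == 0:
            i += 1
        else:
            j += 1
    return plan

# Supplies and requirements of Example 4.
print(north_west_corner([10, 7, 5], [4, 8, 2, 8]))
```

For these data the rule fills six fields, which is the number $n + m - 1$ expected for a non-degenerate basic solution.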


Integral optimization. In connection with the determination of a production programme which,
in the presence of capacity restrictions, secures maximum proceeds, the problem of an integral
solution of $\max \{ c^T x \mid Ax \le b,\ x \ge o \}$ occurs. One searches for the maximum of the objective
function no longer among all the points of the feasible region $R$, but only among the lattice points
contained in $R$.
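
For a very small problem the search among lattice points can be carried out directly. The Python sketch below is a brute-force enumeration, not the cutting plane method described in what follows; the function name `best_lattice_point` and the upper bounds `upper` on each variable (needed to make the search finite) are assumptions of this illustration.

```python
from itertools import product

def best_lattice_point(A, b, c, upper):
    """Search all integer points 0 <= x_j <= upper[j] with Ax <= b for the largest c^T x.

    Returns (value, x) or None if no lattice point is feasible.
    """
    best = None
    for x in product(*(range(u + 1) for u in upper)):
        feasible = all(
            sum(aij * xj for aij, xj in zip(row, x)) <= bi
            for row, bi in zip(A, b))
        if feasible:
            value = sum(cj * xj for cj, xj in zip(c, x))
            if best is None or value > best[0]:
                best = (value, x)
    return best

# Constraints 2*x1 + 3*x2 <= 4, 3*x1 + x2 <= 3 and objective x1 + 4*x2.
print(best_lattice_point([[2, 3], [3, 1]], [4, 3], [1, 4], upper=(2, 2)))
```

With these data the integral optimum is the lattice point $(0, 1)$ with value 4, compared with the value 16/3 at the non-integral vertex $(0, 4/3)$.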

For each concrete problem one must, of course, examine whether a treatment as an integral 
problem is really necessary. For example, if the coefficients of the problem are only estimates, then 
the additional labour does not pay. The inaccuracies introduced into the problem by the initial 
data are not appreciably increased by the normal procedure and the rounding off of the non-integral 
solution to a neighbouring lattice point. 

For integral optimization the method of the cutting plane was developed by GOMORY in 1958.
The problem is first of all treated in a normal manner until an optimal solution is obtained. If by
chance this solution is not integral, then an additional constraint is introduced in such a way that the
resulting solution is no longer feasible in the primal sense, but the remainder of the feasible region
$R_1 \subset R$ contains all lattice points of $R$. The hyperplane, which cuts something off from $R$, has
given the method its name. By its introduction the solution is not feasible as a primal solution, but
remains feasible as a dual solution. In this way one can, by the dual simplex method, arrive relatively
quickly at an optimal solution relative to $R_1$. If this is not integral, a second cutting plane is
introduced, giving $R_2 \subset R_1$. This procedure leads to the desired integral solution after a finite number of
steps.

In practical applications some problems arise because of rounding errors resulting from the 
finite calculations. Since the cutting planes that pass through the lattice points can be determined 
only approximately, certain limits of variation must be admitted when testing for the integral 
character of the solution. In spite of this it can happen that lattice points are cut off from R. This 
has led to many further studies of the problem, and the application of the method remains problematic 
as far as the technique of computation is concerned. 


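Parametric optimization. All coefficients in linear optimization can be dependent upon parameters.
The simplest cases are those in which a single parameter $\lambda$ enters linearly, either into the objective
function, as in $\max \{ (c + \lambda d)^T x \mid Ax \le b,\ x \ge o \}$, or into the right-hand side of the constraints.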

This problem arises when a production programme has to be determined which maximizes the 
gain for a given capacity, and when only bounds can be given for the gain per item of production, 
or when the question has to be answered as to how the optimal solution changes as a result of changes 
in production gains, for example, in sales prices. This is the first formulation. The second formulation 
answers the question as to changes in the production programme as a result of change in capacity, 
either through extensions or limitations. Here one can obtain, as was pointed out above, certain 
information from the components of the dual problem, the shadow prices. 




The two formulations are mutually dual: the problem dual to the first formulation has the
structure of the second. Hence the first formulation only will be considered in some detail. It is assumed
that the feasible region $R$ is bounded and not empty and that $c$ and $d$ are linearly independent.
(For linearly dependent $c$ and $d$ a general primal problem results.) The hyperplanes $(c + \lambda d)^T x = k$
form for every fixed $\lambda$ a family of parallel planes with parameter $k$. For a fixed $k$ and variable $\lambda$
one obtains a pencil of hyperplanes. Fig. 30.1-5 illustrates for two dimensions the situation,
which can be established quite generally: If one draws the vector normal to the hyperplane
$(c + \lambda d)^T x = k$ on the side of increasing $k$, then for $-\infty < \lambda < +\infty$ this vector sweeps out an
angle of magnitude $\pi$. For the example of the figure, the $\lambda$-interval $(-\infty, +\infty)$ can be split up into
four parts $-\infty < \lambda \le \lambda_1$ ($\lambda_1 < 0$), $\lambda_1 \le \lambda \le \lambda_2$ ($\lambda_2 < 0$), $\lambda_2 \le \lambda \le \lambda_3$ ($\lambda_3 > 0$), $\lambda_3 \le \lambda < +\infty$,
in each of which one of the four vertices $E_i$ of $R$ shown in the figure is the optimal solution. One can
find quite generally a decomposition of $(-\infty, +\infty)$ into finitely many partial intervals such that for
each $\lambda$-interval a feasible basic solution is optimal. For unbounded $R$ no solutions need exist in the
two unbounded outer intervals.


30.1-5 Parametric optimization in the plane

30.1-6 Geometric interpretation of the dual parametric problem


In practice one solves the problem for a certain $\lambda$ by the simplex method and then determines
the $\lambda$-interval for which the optimal solution obtained remains optimal. At the boundaries of this
interval one can determine those components of $x$ that for increasing or decreasing $\lambda$ move into the
basis, and likewise those that leave the basis. For this new basis one again obtains a $\lambda$-interval,
and so on. Several procedures exist that are particularly suitable for this, for example, one proposed
by SPURKLAND (1964).

For the dual problem a variation of $\lambda$ means a parallel displacement of the hyperplanes enclosing
the feasible region $R$ (Fig.). It can be shown here that for certain $\lambda$-intervals the same components
of $x$ are in the basis, that is, are positive. The solution itself, however, changes with $\lambda$. The problem
of the maximization of the gain of a production programme for given capacities suggests a further
parametric problem. In Example 1 the element $a_{ki}$ of the matrix $A$ has represented the activity per
unit of commodity $i$ in the activity group $k$. These coefficients could also change, for example, by
an increase in productivity or through the introduction of new technologies. Thus, the study of
problems with more than one parameter is of interest, and a number of recent investigations show
certain generalizations of the results that are known from one-parameter optimization.

Of course, with every practical problem the question arises of how a change in the coefficients 
influences the solution. In the case of bounded capacities the shadow prices give information about 
this, but the question can be answered in full generality only by parametric optimization. 


Further applications of linear optimization. Linear optimization can be applied in many areas 
of natural sciences, technology, and economics. It is one of the effective mathematical methods of 
operational research. A few special cases of application will now be indicated. 

Coordination problem. $n$ agents are to be associated with exactly $n$ tasks in such a way that each
agent is associated with exactly one task and the total effort or the total cost is minimized.
For example, a mechanical production process consists of $n$ tasks and there are $n$ workmen each
of whom can, in principle, do each of the tasks, but in varying times. The times required by the
workmen form a square matrix with elements $c_{ij}$. Each workman is to complete exactly one task.
The mathematical formulation then assumes the form

$\min \{ \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{ij} \mid \sum_{i=1}^{n} x_{ij} = \sum_{j=1}^{n} x_{ij} = 1,\ x_{ij} \ge 0 \}$.

This is a special case of the transport problem. Exactly $n$ of the $x_{ij}$ are equal to 1 and the remainder 0.
The basic solution contains, instead of $2n - 1$, only $n$ positive components, that is, the problem is
strongly degenerate. Kuhn's method is therefore more suitable for the solution than the simplex
method.
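
Since every solution with $x_{ij} \in \{0, 1\}$ corresponds to a permutation of the tasks, a small instance can be checked by trying all $n!$ permutations. The Python sketch below does exactly that; it is a brute-force illustration and not Kuhn's method, and both the function name `best_assignment` and the 3 by 3 matrix of times are hypothetical.

```python
from itertools import permutations

def best_assignment(cost):
    """Return (total, p) minimizing the sum of cost[i][p[i]] over all permutations p.

    p[i] is the zero-based index of the task given to workman i.
    """
    n = len(cost)
    return min(
        ((sum(cost[i][p[i]] for i in range(n)), p)
         for p in permutations(range(n))),
        key=lambda pair: pair[0],
    )

# Hypothetical times c_ij needed by workman i for task j.
times = [[9, 2, 7],
         [6, 4, 3],
         [5, 8, 1]]
print(best_assignment(times))
```

For this matrix the cheapest association gives task 2 to the first workman, task 1 to the second and task 3 to the third, with a total time of 9.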

Mixing problem. A typical mixing problem has already been mentioned in connection with the
diet problem. The loading of a blast furnace for the production of cast iron is a second example.
One looks for the cheapest mixture of ore for the production of cast iron with definite properties.
The production of a gas with prescribed calorific value by the mixing of gases of different
manufacturing costs and known calorific values also leads to a problem of linear optimization.

Cutting problem. If sheets of given size are to be cut into different kinds of smaller sheets, then the
question of reducing the waste to a minimum leads to a minimization problem. Such problems
arise in the manufacture of metal, wood, textile and leather goods.


Stochastic linear optimization. A linear optimization problem whose coefficients are random
quantities, $\max \{ c^T x \mid Ax \le b,\ x \ge o \}$, is meaningful only in the sense of the maximum expectation
of the objective function. If the random character of a problem is fully taken into account, it becomes
in most cases either non-linear, or it must be represented or approximated by problems with
piecewise-linear objective functions and linear constraints. There is only one very special case in which the
resulting problem remains linear, namely when the components of $c$, that is, the gains, or for a
minimum problem the costs, are random quantities whose distribution functions are independent
of the values of $x_i$. It is then permissible to replace the random components $c_i$ of $c$ by their expectation
values $\bar{c}_i = E(c_i)$ and to treat the primal problem in a deterministic manner with the vector $\bar{c}$ formed
from the $\bar{c}_i$. This is possible, for example, in the treatment of the problem of minimal cost of fodder
(diet problem) if one has to assume for the year under consideration such a randomness with known
distribution for the prices of the individual types of food. The problem becomes more complicated
if the components of $b$ are random quantities, for example, if in the case of the optimum costs of
a store of spare parts the demand is random, or if the elements of the matrix $A$ are random.


30.2. Non-linear optimization 


Among the non-linear problems, convex optimization can claim a certain completeness, at least 
as far as the theory is concerned. 


Convex optimization. A region $B$ of the $n$-dimensional Euclidean space $R_n$ is called convex if
for every two points of $B$ all points that are convex linear combinations of them also belong to $B$
(Fig.). A function $f(x)$ defined on a convex region $B$ is called convex if for $x^1$ and $x^2$ in $B$ and
$0 < \lambda < 1$

$f(\lambda x^1 + (1 - \lambda) x^2) \le \lambda f(x^1) + (1 - \lambda) f(x^2)$

always holds (Fig.). If for $x^1 \ne x^2$ the sign of equality is excluded, then $f(x)$ is called strictly or
properly convex. A cube, parallelepiped, tetrahedron, sphere and ellipsoid are examples of convex
regions in $R_3$.
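
The defining inequality can be tested numerically for a function of one variable by sampling values of $\lambda$ between 0 and 1. The Python sketch below does this for a single pair of points; the function name `violates_convexity`, the number of samples and the small tolerance for rounding are assumptions of this illustration. Finding no violation on a sample does not prove convexity, whereas a single violation disproves it.

```python
def violates_convexity(f, x1, x2, steps=100, tol=1e-12):
    """Return True if f(l*x1 + (1-l)*x2) > l*f(x1) + (1-l)*f(x2) for a sampled l in (0, 1)."""
    for s in range(1, steps):
        lam = s / steps
        x = lam * x1 + (1 - lam) * x2
        chord = lam * f(x1) + (1 - lam) * f(x2)
        if f(x) > chord + tol:
            return True
    return False

print(violates_convexity(lambda x: x * x, -1.0, 3.0))   # the parabola is convex
print(violates_convexity(lambda x: -x * x, -1.0, 3.0))  # its negative is not
```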


30.2-1 Convex and non-convex sets

30.2-2 Graph of a convex function of one variable


The intersection of convex regions is a convex region, a property that has been used already in
considering the feasible region $R$ of the maximization problem. If the functions $f$ and $g_j$ in
$\min \{ f(x) \mid g_j(x) \le 0,\ x_i \ge 0 \}$ ($i = 1, \dots, n$; $j = 1, \dots, m < n$) are convex functions, then one speaks
of a problem of convex optimization. The feasible region $R$ of the problem is a convex region in
$R_n$. For the existence of a solution KUHN, TUCKER and SLATER have proved a fundamental
theorem, the saddle point theorem. It refers to the saddle point $(x^0, u^0)$ of a function $F(x, u)$ of two
variables which for $u$ assumes a maximum at this point, so that for values in the neighbourhood
of $u^0$ it decreases, but for $x$ has a minimum, and thus increases for values in the neighbourhood
of $x^0$. This function $F(x, u)$ is obtained by the method of Lagrange multipliers for extremal
problems with side conditions. Using the Lagrange multipliers $u_j$ with $j = 1, \dots, m$ the generalized
Lagrange function $F(x, u) = f(x) + \sum_{j=1}^{m} u_j g_j(x)$ is defined, in which $u$ is the $m \times 1$ matrix formed
from the $u_j$.




Saddle point theorem. If an $x^1 \ge o$ exists with $g_j(x^1) < 0$ for $j = 1, \dots, m$, then $x^0 \ge o$ is a
solution of $\min \{ f(x) \mid g_j(x) \le 0,\ x \ge o \}$ if and only if a $u^0 \ge o$ exists such that for all $x \ge o$, $u \ge o$
$F(x^0, u) \le F(x^0, u^0) \le F(x, u^0)$.

Accordingly the function $F(x, u)$ has at $(x^0, u^0)$ a non-negative saddle point. It will now be
shown that this is sufficient for $x^0 \ge o$ to be a solution of the convex problem. One obtains

$f(x^0) + \sum_{j=1}^{m} u_j g_j(x^0) \le f(x^0) + \sum_{j=1}^{m} u^0_j g_j(x^0) \le f(x) + \sum_{j=1}^{m} u^0_j g_j(x)$ for all $x \ge o$ and all $u \ge o$.

From the left-hand inequality it follows that $g_j(x^0) \le 0$ for $j = 1, \dots, m$; for with a positive
$g_{j_1}(x^0)$, taking $u_{j_1} > 0$, $u_j = 0$ for $j \ne j_1$, the left-hand side could be made to increase without
limit. Hence $x^0 \ge o$ is feasible. Furthermore, $\sum_{j=1}^{m} u^0_j g_j(x^0) = 0$, since otherwise the left-hand
inequality would not hold for $u = o$, because all $g_j(x^0) \le 0$ and $u^0 \ge o$. It thus follows that
$f(x^0) \le f(x) + \sum_{j=1}^{m} u^0_j g_j(x)$ for all $x \ge o$, and consequently $f(x^0) \le f(x)$ for all $x \ge o$ with
$g_j(x) \le 0$ ($j = 1, \dots, m$), since $u^0 \ge o$. But this means that $x^0 \ge o$ is a solution.

A complete proof of the saddle point theorem as stated here is due to SLATER. KUHN and 
TUCKER have proved it for differentiable functions, and for such functions the local Kuhn-Tucker 
conditions, which are equivalent to the conditions of the theorem, are stated below. These necessary 
and sufficient conditions for a solution of the convex optimization problem are utilized in many 
applications — especially in the method of quadratic optimization. 


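Local Kuhn-Tucker conditions. For convex differentiable functions $f(x)$, $g_j(x)$ the existence of an
$x^0 \ge o$ and a $u^0 \ge o$ with

$\frac{\partial F(x^0, u^0)}{\partial x} \ge o$, $\quad x^{0T} \frac{\partial F(x^0, u^0)}{\partial x} = 0$, $\quad \frac{\partial F(x^0, u^0)}{\partial u} \le o$, $\quad u^{0T} \frac{\partial F(x^0, u^0)}{\partial u} = 0$

is necessary and sufficient for $x^0 \ge o$ to be a solution of the convex problem.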
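
A one-dimensional illustration may help to read the four conditions. The problem below is a hypothetical example chosen for this purpose: $f(x) = (x - 2)^2$ is convex, the single constraint $g(x) = x - 1 \le 0$ is linear, and $x^1 = 0$ satisfies $g(x^1) < 0$, so the assumptions of the saddle point theorem hold.

```latex
\begin{aligned}
&\min \{ (x - 2)^2 \mid x - 1 \le 0,\ x \ge 0 \}, \qquad F(x, u) = (x - 2)^2 + u (x - 1), \\
&\frac{\partial F}{\partial x} = 2 (x - 2) + u, \qquad \frac{\partial F}{\partial u} = x - 1 . \\
&\text{At } x^0 = 1,\ u^0 = 2: \quad
\frac{\partial F}{\partial x} = 0 \ge 0, \quad
x^0 \frac{\partial F}{\partial x} = 0, \quad
\frac{\partial F}{\partial u} = 0 \le 0, \quad
u^0 \frac{\partial F}{\partial u} = 0 .
\end{aligned}
```

All four conditions hold, so $x^0 = 1$ is the solution; the unconstrained minimum $x = 2$ is excluded by the constraint, and the positive multiplier $u^0 = 2$ reflects that the constraint is active.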


Quadratic optimization. The example given above of the determination of a manufacturing
programme with maximum gain for a given capacity was a problem of linear optimization. The
components $c_i$ of $c$ were the gain per unit produce $i$. The gain is the difference between the achieved
price $p_i$ and the costs $k_i$. The composition of $k_i$, as well as that of influencing factors that could
alter $p_i$, will not be considered here in detail. The assumption that both $p_i$ and $k_i$ are independent
of the piece number $x_i$ of the product $i$ is a great simplification. If it is assumed that a discount
is offered for selling large numbers and that the cost per item decreases with increasing number,
then the situation can be expressed approximately by $p_i = \bar{p}_i - r_i x_i$, $k_i = \bar{k}_i - s_i x_i$. One thus
obtains the quadratic objective function $c^T x = \sum_i (\bar{p}_i - \bar{k}_i) x_i + \sum_i (s_i - r_i) x_i^2$.


In its complete form the problem of quadratic optimization can be written as
$\min \{c^T x + x^T C x \mid Ax \le b,\ x \ge o\}$, where $C$ is a symmetric square matrix of order $n \times n$. If $C$ is
positive definite or semidefinite, then the problem is convex and the Kuhn-Tucker conditions are
applicable. Here, too, a maximum problem can be changed into a minimum problem by changing
the sign of the coefficients of the objective function. In this case, to obtain a convex problem the
square matrix in the objective function for the maximum problem would have to be negative
semi-definite. For the manufacturing programme the special case arises in which $C$ is a diagonal
matrix, which for $s_i - r_i \le 0$ would be negative semi-definite. This, too, would be an approximate
representation of the real process, and it may be useful to repeat the observation that in a specific 
case one should always examine whether the advantage of a ‘better’ model justifies the additional 
work compared with that of a linear model. 

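The Lagrange function for quadratic optimization assumes the form

$$F(x, u) = c^T x + x^T C x + u^T (Ax - b), \quad \text{so that} \quad \frac{\partial F}{\partial x} = c + 2Cx + A^T u, \quad \frac{\partial F}{\partial u} = Ax - b.$$

With $\partial F/\partial x = v$ and $-\partial F/\partial u = y$ one obtains the conditions:

(1) $Ax + y = b$; (2) $2Cx - v + A^T u = -c$;

(3) $x \ge o$, $v \ge o$, $y \ge o$, $u \ge o$; (4) $x^T v = 0$, $y^T u = 0$.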


A vector $x \in R_n$ is a solution if and only if, together with a $v \in R_n$, a $u \in R_m$, and a $y \in R_m$, it
satisfies the conditions (1) to (4). (1) to (3) form a linear system. The condition (4) can also be written
as (4a) $x^T v + y^T u = 0$, since by (3) the vanishing of each individual term in the scalar product is
required either by (4) or by (4a). Hence this condition states that a feasible solution is required
for the linear system (1) to (3) for which at most one of the corresponding components of $x$ and $v$,
and similarly at most one of the corresponding components of $y$ and $u$, may be positive. Altogether,
at most n + m components of the four vectors may be positive, that is, exactly the same number as 
there are equations in (1) and (2). The solutions of the system (1) to (4) are thus contained among 
the feasible basic solutions of the first three conditions. These feasible basic solutions can be deter- 
mined by means of the simplex method. 
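
The conditions (1) to (4) can be checked mechanically for a proposed pair $x$, $u$. The following Python sketch is an illustration only and is not one of the solution methods described here: the function names and the small test problem ($\min \{x_1^2 + x_2^2 - 2x_1 - 4x_2 \mid x_1 + x_2 \le 2,\ x \ge o\}$) are assumptions chosen for the example. It forms $v = c + 2Cx + A^T u$ and $y = b - Ax$, so that (1) and (2) hold by construction, and then tests (3) and (4a).

```python
def kkt_vectors(C, c, A, b, x, u):
    """Return v = c + 2Cx + A^T u and y = b - Ax (plain lists, no libraries)."""
    n, m = len(x), len(b)
    v = [c[i] + 2 * sum(C[i][j] * x[j] for j in range(n))
         + sum(A[k][i] * u[k] for k in range(m)) for i in range(n)]
    y = [b[k] - sum(A[k][j] * x[j] for j in range(n)) for k in range(m)]
    return v, y


def satisfies_kuhn_tucker(C, c, A, b, x, u, tol=1e-9):
    """Check (3) non-negativity and (4a) x^T v + y^T u = 0."""
    v, y = kkt_vectors(C, c, A, b, x, u)
    non_negative = all(t >= -tol for t in list(x) + v + y + list(u))
    products = sum(xi * vi for xi, vi in zip(x, v)) + sum(yk * uk for yk, uk in zip(y, u))
    return non_negative and abs(products) <= tol


# Hypothetical test problem: minimize x1^2 + x2^2 - 2*x1 - 4*x2
# subject to x1 + x2 <= 2 and x >= o.
C = [[1, 0], [0, 1]]
c = [-2, -4]
A = [[1, 1]]
b = [2]
print(satisfies_kuhn_tucker(C, c, A, b, x=[0.5, 1.5], u=[1]))  # True
print(satisfies_kuhn_tucker(C, c, A, b, x=[1, 1], u=[0]))      # False
```

The second candidate fails because one component of $v$ becomes negative, which violates (3).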

For taking the last condition into account there are two possibilities: 

The method of Wolfe. One introduces additional variables into (1) to (4) in such a way that a 
feasible basic solution for (1) to (3) satisfying (4) can be found without difficulty. One then proceeds 
with the simplex method in such a way that this condition remains satisfied and the additional 
variables are removed from the basis. 

In the methods of Barankin-Dorfman and of Frank-Wolfe one begins with a feasible basic 
solution that does not satisfy the last condition and proceeds with the simplex method with the aim 
of minimizing the expression xTv + yTu. 


Method of Frank-Wolfe. With $z^T = (x^T, y^T, v^T, u^T)$ and $\bar z^T = (v^T, u^T, x^T, y^T)$ one can write
the Kuhn-Tucker conditions in the form $\min \{\bar z^T z \mid \bar A z = \bar b,\ z \ge o\}$. Here

$$\bar A = \begin{pmatrix} A & I_m & O & O \\ 2C & O & -I_n & A^T \end{pmatrix} \quad \text{and} \quad \bar b = \begin{pmatrix} b \\ -c \end{pmatrix},$$

where $I_m$ and $I_n$ are unit matrices of order $m$ and $n$, respectively. Because
$\bar z^T z = 2(x^T v + y^T u)$, a solution of the original problem can be obtained as the $x$-portion of a
solution $z_0$ of the transformed problem with $\bar z_0^T z_0 = 0$.
A feasible basic solution $z_1$ can be determined by methods known from linear optimization.
FRANK and WOLFE then consider $\bar z_1^T z$ with this fixed $z_1$ as objective function of the transformed
problem. In this way the problem is linearized and the simplex method can be applied. If one obtains
an optimal solution $z_2$ of this problem with $\bar z_2^T z_2 = 0$, then one has finished. Otherwise the following
procedure is suggested. One continues with the method until one obtains with the basis $z_k$ either
the relation $\bar z_k^T z_k = 0$, in which case one has finished, or else $\bar z_1^T z_k \le \tfrac{1}{2} \bar z_1^T z_1$. In the second case
FRANK and WOLFE give a method for the construction of a new $z_1$. They show that one of the cases
always occurs and that by the use of their method with a positive semi-definite $C$, the first case
always occurs after a finite number of steps so that one obtains a solution.

Method of gradients. From the definitions $v = \partial F/\partial x$, $y = -\partial F/\partial u$ it follows that $G(z) = \bar z^T z$ is a
convex function. The linearization at the point $z_1$ is performed in such a way that $G(z)$ and the
linear replacement function $H(z) = \bar z_1^T z$ have the same gradient direction at the point $z_1$. In this way one arrives
at a new group of procedures, the method of gradients, which can be applied in quadratic as well
as non-linear optimization. For a differentiable function $f(x)$ with $x \in R_n$, the gradient vector
$\operatorname{grad} f = \partial f/\partial x$ is perpendicular to the surface $f(x) = \text{const}$ and points in the direction of maximum
increase of the function $f(x)$. In order to minimize a given function without side conditions one
starts from a given point $x_0$ and proceeds in the direction of $-\operatorname{grad} f(x_0)$, the direction of steepest
decrease. If the function has a unique
minimum, as for example for strictly convex functions with a finite minimum, then an iterative
application of this method will certainly lead to the desired result. For a problem of convex optimization
care must be taken, of course, that one remains in the feasible region.
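
The unconstrained case can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions and not a method taken from the text: the fixed step length, the stopping tolerance, the function names and the strictly convex test function $f(x) = (x_1 - 1)^2 + 2(x_2 + 3)^2$ are all chosen for the example. Each step moves from the current point against the gradient.

```python
def gradient_descent(grad, x0, step=0.1, tol=1e-10, max_iter=10_000):
    """Repeat x <- x - step * grad f(x) until the gradient is (almost) zero."""
    x = list(x0)
    for _ in range(max_iter):
        g = grad(x)
        if sum(gi * gi for gi in g) ** 0.5 < tol:
            break
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def grad_f(x):
    """Gradient of the assumed test function f(x) = (x1 - 1)^2 + 2*(x2 + 3)^2."""
    return [2 * (x[0] - 1), 4 * (x[1] + 3)]


minimum = gradient_descent(grad_f, [0.0, 0.0])
print([round(t, 6) for t in minimum])  # [1.0, -3.0]
```

A fixed step is the simplest choice; the methods described in the text instead proceed along the chosen direction until the function ceases to decrease or the boundary of the feasible region is reached.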

The method of gradients can be described quite generally as follows: Starting from a feasible
point $x_0$ one determines a direction such that, at least initially, one remains in the feasible region
and that the value of the objective function decreases as rapidly as possible. One proceeds in this
direction until either the objective function ceases to decrease or the boundary of the feasible region
is reached. The point $x_1$ reached is used as the starting point for the next step. The various methods
differ only in the way the direction of the steps is chosen.


Method of feasible directions of Zoutendijk. Let the problem be $\min \{f(x) \mid Ax \le b\}$. Suppose
that the given constraints also contain sign limitations for $x$. If $a_j^T$ are the rows of the matrix $A$
and $b_j$ the components of $b$, then the constraints can be expressed by $a_j^T x \le b_j$ for $j = 1, \ldots, m$.
Let the objective function $f(x)$ be convex with continuous partial derivatives in the feasible region $R$.
The method described above of proceeding from a point $x^k$ in the feasible region to the next point
$x^{k+1}$ is carried out in the following form: One travels in the direction determined by a vector $r^k$
along the ray $x^k + \lambda r^k$, where the direction of $r^k$ is determined in such a way that for $\lambda > 0$ the ray
remains at first in the feasible region. If $x^k$ is an interior point of $R$, then this is no limitation. If
$x^k$ lies on the boundary of $R$, and if $J$ is that index set for which $a_j^T x^k = b_j$ holds, then it is necessary
and sufficient for the choice of $r^k$ in the above sense that $a_j^T r^k \le 0$ for $j \in J$. Such a direction
is called feasible. In addition one aims to reduce the function $f(x)$ along the ray as much as possible;
$f(x)$ will be reduced for all $r^k$ with $\operatorname{grad}^T f(x^k) \cdot r^k < 0$. One writes $\operatorname{grad} f(x^k) = c$ and determines
$r^k$ as the solution of the linear optimization problem $\min \{c^T r \mid a_j^T r \le 0 \text{ for } j \in J\}$. Since, in general,
$c^T r$ is not thereby bounded below, a normalizing condition is still added. If one chooses $-1 \le r_i \le 1$
$(i = 1, \ldots, n)$ for the components $r_i$ of $r$, then by the simplex method an $r^k$ favourable for $x^k$ can
be determined. One goes along this ray $x^k + \lambda r^k$ until either $f(x)$ becomes minimal, that is, as far
as $\lambda_1$ with $\operatorname{grad}^T f(x^k + \lambda_1 r^k) \cdot r^k = 0$, or until a point is reached at which the ray leaves the feasible
region. The corresponding value $\lambda_2$ is determined from $\lambda_2 = \max \{\lambda \mid a_j^T (x^k + \lambda r^k) \le b_j \text{ for } j = 1, \ldots, m\}$.
In this way the next point of approximation $x^{k+1} = x^k + \lambda_k r^k$
with $\lambda_k = \min (\lambda_1, \lambda_2)$ is obtained. If $\lambda_k$ is not finite, then $f(x)$ has no finite minimum. ZOUTENDIJK
has established the convergence of the method. In the special case of a quadratic function $f(x)$
it can be achieved by an additional rule that the procedure comes to a conclusion after finitely
many steps.
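
The step length up to the boundary of the feasible region can be computed row by row. The following Python sketch is an illustration only: the function name `boundary_step`, the tolerance and the small feasible region ($x_1 + x_2 \le 4$, $x_1 \ge 0$, $x_2 \ge 0$) are assumptions made for the example. It returns the largest $\lambda \ge 0$ for which all constraints $a_j^T (x + \lambda r) \le b_j$ still hold, starting from a feasible point $x$.

```python
import math


def boundary_step(A, b, x, r, tol=1e-12):
    """Largest lam >= 0 with A(x + lam*r) <= b; math.inf if the ray never leaves."""
    lam = math.inf
    for a_j, b_j in zip(A, b):
        slope = sum(ai * ri for ai, ri in zip(a_j, r))       # a_j^T r
        if slope > tol:                                       # only these rows can become violated
            slack = b_j - sum(ai * xi for ai, xi in zip(a_j, x))
            lam = min(lam, slack / slope)
    return lam


# Assumed region: x1 + x2 <= 4, -x1 <= 0, -x2 <= 0 (sign limitations written as rows of A).
A = [[1, 1], [-1, 0], [0, -1]]
b = [4, 0, 0]
print(boundary_step(A, b, x=[1, 1], r=[1, 0]))    # 2.0
print(boundary_step(A, b, x=[1, 1], r=[-1, -1]))  # 1.0
```

The step actually taken is then the smaller of this value and the step at which $f$ becomes minimal along the ray.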

Besides quadratic optimization there is a special form of non-linear optimization for which in 
recent years satisfactory methods of solution have been given and for which also a duality principle 
holds. These are problems in which the objective function is a quotient of two linear functions and 
the constraints are linear inequalities. 


30.3. Dynamic optimization 


The basic idea of dynamic optimization will first be illustrated by a simple example. A particular
vehicle (lorry, goods wagon) is to be loaded with objects of different kinds; $n$ is the number
of kinds $s_i$ $(i = 1, \ldots, n)$ of objects, $v_i > 0$ the price and $w_i > 0$ the weight of an object of kind $i$,
$u_i \ge 0$ the number of loaded objects of kind $i$, and $z$ the total capacity of the vehicle in question,
where $z \ge \min_{i = 1, \ldots, n} \{w_i\}$ must hold. The problem consists of determining the numbers $u_i$ in such
a way that a load with maximum price is achieved.
The problem thus leads to the following exercise in optimization: to determine the maximum
of the objective function

$$f(u_1, \ldots, u_n) = \sum_{i=1}^{n} v_i u_i$$

with the constraints that the $u_i \ge 0$ $(i = 1, \ldots, n)$ are all integers and

$$\sum_{i=1}^{n} w_i u_i \le z.$$

This problem may be regarded as an $n$-step process, in which at every step a $u_i$ is determined, such
that the required maximum is reached in the last step. The whole optimization problem is thus
transformed into an event in time, that is, into a process.


Discrete deterministic processes. Let $S$ be a system, for example, of an economic, mechanical
or chemical nature, the state of which in the time interval $[t', t'']$ can be described by the $n$ functions
$x_i = x_i(t)$ $(i = 1, \ldots, n)$, where the set of the possible states $x^T = (x_1, \ldots, x_n)$ (in the time interval
in question) lies in a given point set $X$ of the $n$-dimensional Euclidean space. The components of
$x^T$ are called state variables.

For the definition of a discrete deterministic process the following assumptions are needed:

Let $t' = t_1 < t_2 < \cdots < t_n < t_{n+1} = t''$ be a given partition of the interval $[t', t'']$; then
the state $x^{i+1} = (x_1(t_i), \ldots, x_n(t_i))$ of the system remains unchanged in the right-open time interval
$[t_i, t_{i+1})$ $(i = 1, \ldots, n)$. The state $x^{i+1}$ depends only on $x^i$ and a certain decision $e^i$, that is,
$x^{i+1} = T^i(x^i, e^i)$ $(i = 1, \ldots, n)$, where $T^i$ (the transformation of the state in the $i$th step) is independent
of earlier states.

The decision $e^i$ is uniquely characterized by a certain $n$-dimensional vector $u^i = (u_1, \ldots, u_n)$,
where the points $u^i$ must lie in a prescribed region $U^i$. Each point $u^i \in U^i$ is a permissible decision
vector of the $i$th step of the decision process. Every sequence $P = (u^1, \ldots, u^n)$ with $u^i \in U^i$ $(i = 1, \ldots, n)$
and $x^{i+1} = T^i(x^i, u^i) \in X$ for $i = 1, \ldots, n$ is called a permissible strategy (or steering) of the $n$-step
decision process in question. Because the changes of state of the system $S$ occur only at discrete
points of time, and because the quantities involved are not random, all processes of this kind are
called discrete deterministic processes.


Optimal strategy. For the discrete deterministic process of $n$ steps a certain function
$f(x^1, \ldots, x^n, x^{n+1}, u^1, \ldots, u^n)$ is given, which is defined over the domain $x^i \in X$ $(i = 1, \ldots, n + 1)$, $u^i \in U^i$
$(i = 1, \ldots, n)$ and is called the objective function. If the initial state $x^1$ is given, then the objective
function $f$ can be expressed as a function of $x^1, u^1, \ldots, u^n$. This follows from the assumptions on
the process, so that

$$f(x^1, T^1(x^1, u^1), T^2(T^1(x^1, u^1), u^2), \ldots, u^1, \ldots, u^n) = f(x^1, u^1, \ldots, u^n).$$

The optimization problem then consists of finding a permissible strategy $P_0 = (u_0^1, \ldots, u_0^n)$ for
a given initial state $x^1$ with the property

$$f(x^1, u_0^1, \ldots, u_0^n) = \max_{\{u^1, \ldots, u^n\}} f(x^1, u^1, \ldots, u^n),$$

where $\{u^1, \ldots, u^n\}$ runs through the set of all permissible strategies. If such a strategy $P_0$ exists, it is called an optimal
strategy. The method of dynamic optimization presupposes a certain property of the objective
function, the so-called Markov property.


It happens that for most decision processes in practical applications the class of separable functions,
also called the class of functions of additive character, is adequate. They are functions that
can be written in the form

$$f = \sum_{i=1}^{n} g_i(x^i, u^i).$$

These functions then have the Markov property.
Under the assumption that for the (deterministic, discrete) $n$-step process described above with
the separable objective function $f$ the optimal strategy exists and hence also the optimal solution
of the corresponding optimization problem, one introduces the notation

$$f_n(x^1) = \max_{\{u^1, \ldots, u^n\}} \sum_{i=1}^{n} g_i(x^i, u^i).$$

Bellman’s principle of optimality. The method of dynamic optimization is based on the fact that
instead of the original problem with fixed initial state $x^1$ and fixed number of steps $n$, a number
of problems is considered. The value $f_n(x^1)$ is thus regarded as a function of $x^1$ and $n$. If one imagines
that the value $f_n(x^1)$ has been calculated by any method whatsoever, then it is easy on the basis of
the definition of $f_n(x^1)$ and the separable character of the objective function to derive recursively
the formula

$$f_n(x^1) = \max_{u^1 \in U^1} \max_{u^2 \in U^2} \cdots \max_{u^n \in U^n} \Big\{ \sum_{i=1}^{n} g_i(x^i, u^i) \Big\} = \max_{u^1 \in U^1} \{ g_1(x^1, u^1) + f_{n-1}(x^2) \}.$$

But since $x^2 = T^1(x^1, u^1)$, the recursive formula goes over into the relation

$$f_n(x^1) = \max_{u^1 \in U^1} \{ g_1(x^1, u^1) + f_{n-1}(T^1(x^1, u^1)) \}.$$

This relation can also be derived on the basis of Bellman’s principle of optimality, which contains the
basic idea of dynamic optimization.


In the literature on dynamic optimization a reverse numbering has emerged, with $x^{n+1}$ denoting
the initial state and $u^n$ the first decision in an $n$-step process. With this notation the transformation
of states is described by $x^i = T^i(x^{i+1}, u^i)$, $i = 1, \ldots, n$, with $x^n = T^n(x, u^n)$, $x = x^{n+1}$, and the
recursive formula, which corresponds to Bellman’s optimality principle, goes over into the relation

$$f_n(x) = \max_{u^n \in U^n} \{ g_n(x, u^n) + f_{n-1}(x^n) \} = \max_{u^n \in U^n} \{ g_n(x, u^n) + f_{n-1}(T^n(x, u^n)) \}.$$


Example 5: The problem mentioned at the beginning of this section can be regarded as a discrete,
deterministic process of $n$ steps, where with the natural numbering $u^i = (u_i)$ denotes the decision
in the $i$th step, $(u^1, \ldots, u^n) = (u_1, \ldots, u_n)$ a permissible strategy with the given properties, and
$x^{i+1} = x^i - u_i w_i = z - \sum_{j=1}^{i} u_j w_j$ $(i = 1, \ldots, n)$, where $x^1 = z$, the state in the $i$th step (free loading
space). Then by the principle of optimality

$$f_n(z) = f_n(x^1) = \max_{u_1 \in \{0, 1, \ldots\},\ u_1 w_1 \le z} \{ u_1 v_1 + f_{n-1}(z - u_1 w_1) \}, \ \ldots, \ f_1(z) = \max_{u_n \in \{0, 1, \ldots\},\ u_n w_n \le z} u_n v_n$$

or, changing to the reverse numbering, as is customary in dynamic optimization,

$$f_n(z) = \max_{u_n \in \{0, 1, \ldots\},\ u_n w_n \le z} \{ u_n v_n + f_{n-1}(z - u_n w_n) \}, \ \ldots, \ f_1(z) = \max_{u_1 \in \{0, 1, \ldots\},\ u_1 w_1 \le z} u_1 v_1.$$

Clearly $g_i(u_i) = u_i v_i$.
Let the following numerical values be given: $n = 3$, $z = 100$,

$w_1 = 40$, $w_2 = 45$, $w_3 = 60$,
$v_1 = 20$, $v_2 = 75$, $v_3 = 102$.

In this case
$g_1(u_1) = 20u_1$, $g_2(u_2) = 75u_2$, $g_3(u_3) = 102u_3$, $f = 20u_1 + 75u_2 + 102u_3$,
and the constraints are represented by $40u_1 + 45u_2 + 60u_3 \le 100$, where $u_1, u_2, u_3$ are non-negative
integers.
One first tabulates the functions $g_i(u_i)$ $(i = 1, 2, 3)$, where the condition $0 \le u_i \le z/w_i$ $(i = 1, 2, 3)$
must be observed with each $u_i$ an integer.

[Tables (1) to (3): the values of $g_1(u_1)$, $g_2(u_2)$ and $g_3(u_3)$]

Using the formula $f_1(z) = \max_{u_1} 20u_1$, with the aid of Table (1), one calculates Table (4), in which $\hat u_1(z)$
denotes that value of $u_1$ for which $f_1(z)$ is obtained.
For $n = 2$ one obtains $f_2(z) = \max_{u_2} \{ g_2(u_2) + f_1(z - u_2 w_2) \}$
and by means of Tables (2) and (4) one arrives at the
Table (5) where $\hat u_2(z)$ is that value of $u_2$ for which
$f_2(z)$ is obtained. For $n = 3$ one has $f_3(z) = \max_{u_3} \{ g_3(u_3) + f_2(z - w_3 u_3) \}$ and on the basis
of this relation one arrives at the final Table (6).

For $z = 100$ Table (6) shows that $\hat u_3 = 0$. Then $\hat u_2 = \hat u_2(z - \hat u_3 w_3) = \hat u_2(100) = 2$, by Table (5).
From Table (4) one obtains $\hat u_1 = \hat u_1(z - \hat u_2 w_2 - \hat u_3 w_3) = \hat u_1(100 - 2 \cdot 45) = \hat u_1(10) = 0$. Thus
the optimal solution $(\hat u_1, \hat u_2, \hat u_3) = (0, 2, 0)$ is here unique. The optimal value of the load is then
$f_3(100) = 150$.
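
The recursion of Example 5 can be carried out by a short program instead of by tables. The following Python sketch is an illustration under stated assumptions: the function name `load_vehicle` and the use of memoization in place of the tabulated values are choices made for the example, and the price of the first kind of object is taken as 20, in agreement with $g_1(u_1) = 20u_1$. The inner function `f(k, z)` plays the role of $f_k(z)$ in the reverse numbering and also returns the maximizing decisions.

```python
from functools import lru_cache


def load_vehicle(weights, values, capacity):
    """Maximize sum(v_i*u_i) subject to sum(w_i*u_i) <= capacity, u_i non-negative integers."""
    n = len(weights)

    @lru_cache(maxsize=None)
    def f(k, z):
        # f(k, z): best price obtainable from the first k kinds with free space z,
        # together with the decisions (u_1, ..., u_k) that achieve it.
        if k == 0:
            return 0, ()
        best_value, best_plan = -1, ()
        for u_k in range(z // weights[k - 1] + 1):
            value, plan = f(k - 1, z - u_k * weights[k - 1])
            value += u_k * values[k - 1]
            if value > best_value:
                best_value, best_plan = value, plan + (u_k,)
        return best_value, best_plan

    return f(n, capacity)


print(load_vehicle([40, 45, 60], [20, 75, 102], 100))  # (150, (0, 2, 0))
```

The cache stores each pair $(k, z)$ only once, which corresponds to filling in one entry of the tables for $f_k$.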

Method of functional equations. For this method, too, a description of a suitable example will
first be presented.

An amount $x$ of money is available and there are two possibilities for investing this money.
The amount $u_1$ $(0 \le u_1 \le x)$ is invested in the first way and the amount $x - u_1$ in the second. In
a given time interval, for example, in one year, the gain $g_1(u_1)$ is expected from the investment $u_1$,
and the gain $g_2(x - u_1)$ from the investment $x - u_1$. At the end of the time interval the means that
had to be employed to achieve the gains $g_1$ and $g_2$ will have lost some of their effectiveness, by
amortization and the need for maintenance, so that after one year the state will be

$$x_1 = a u_1 + b(x - u_1), \quad 0 < a < 1, \quad 0 < b < 1.$$

Starting with the amount $x_1$ at the beginning of the second year, the sum $u_2$ with $0 \le u_2 \le x_1$
will again be invested in the first way and the sum $x_1 - u_2$ in the second, so that the gain in two
years is equal to

$$g_1(u_1) + g_2(x - u_1) + g_1(u_2) + g_2(x_1 - u_2).$$

In the same way, money will also be invested at the beginning of the third year, when the money
disposed of will be $x_2 = a u_2 + b(x_1 - u_2)$. After $n$ years the gain achieved is

$$\sum_{i=1}^{n} \{ g_1(u_i) + g_2(x_{i-1} - u_i) \} = \sum_{i=1}^{n} g(x_{i-1}, u_i)$$

with $x_0 = x$, where at the end of the $n$th year the sum available for disposal is $x_n = a u_n + b(x_{n-1} - u_n)$.
In this way a definite $n$-step process is described with the objective function

$$\sum_{i=1}^{n} g_i(x_{i-1}, u_i) = \sum_{i=1}^{n} g(x_{i-1}, u_i) = \sum_{i=1}^{n} \{ g_1(u_i) + g_2(x_{i-1} - u_i) \}$$

with $x_0 = x$ and with the constraints $0 \le u_i \le x_{i-1}$ $(i = 1, \ldots, n)$, where the state transformation

$$x_i = T^i(x_{i-1}, u_i) = T(x_{i-1}, u_i) = a u_i + b(x_{i-1} - u_i), \quad i = 1, \ldots, n,$$

does not depend on the step. If one chooses the inverse numbering of the state and decision quantities,
that is, if $x = x_{n+1}$ is regarded as the initial state, then one arrives at the following optimization
problem: to determine the maximum of the objective function

$$\sum_{i=1}^{n} g(x_{i+1}, u_i) = \sum_{i=1}^{n} \{ g_1(u_i) + g_2(x_{i+1} - u_i) \} \quad \text{with the constraints } 0 \le u_i \le x_{i+1} \ (i = 1, \ldots, n),$$

$$x_i = T^i(x_{i+1}, u_i) = T(x_{i+1}, u_i) = a u_i + b(x_{i+1} - u_i), \quad i = 1, \ldots, n.$$

According to the optimality principle one obtains

$$f_n(x) = f_n(x_{n+1}) = \max_{0 \le u_n \le x_{n+1}} \{ g_1(u_n) + g_2(x_{n+1} - u_n) + f_{n-1}(x_n) \}$$
$$= \max_{0 \le u_n \le x} \{ g_1(u_n) + g_2(x - u_n) + f_{n-1}(T(x_{n+1}, u_n)) \}$$
$$= \max_{0 \le u \le x} \{ g_1(u) + g_2(x - u) + f_{n-1}(a u + b(x - u)) \}, \quad n \ge 1, \text{ where } f_0 \text{ is defined by } f_0 = 0.$$

For $n = 1, 2, \ldots$ this system represents a system of functional equations for the unknown functions
$f_1(x), \ldots, f_n(x)$. To solve the problem in a specific case, that is, when $g_1$, $g_2$ and the numbers
$a$, $b$ are prescribed, one determines recursively the solution of the last system of equations, that is,
for each $n = 1, 2, \ldots$, the value $\hat u_n = \hat u_n(x)$ at which the maximum with respect to $u \in [0, x]$ is attained, so that

$$g_1(\hat u_n) + g_2(x - \hat u_n) + f_{n-1}(a \hat u_n + b(x - \hat u_n)) = f_n(x),$$

and $\hat x_i = \hat x_i(x) = a \hat u_i + b(\hat x_{i+1} - \hat u_i)$, $i = 1, \ldots, n$. This is known as the
method of functional equations.


Example 6: Let the gain functions of the introductory example be $g_1(u) = \alpha \sqrt{u}$, $g_2(x - u)
= \beta \sqrt{x - u}$, where $\alpha > 0$, $\beta > 0$ are arbitrary numbers and $0 < a = b < 1$. Then in this case
$x_i = a u_i + b(x_{i+1} - u_i) = a x_{i+1}$, $i = 1, \ldots, n$, and since $x_{n+1} = x$, it follows that $x_i = a^{n+1-i} x$
$(i = 1, \ldots, n)$. The equations for $f_n$ lead in this case to the relations

$$f_n(x) = \max_{0 \le u \le x} \{ \alpha \sqrt{u} + \beta \sqrt{x - u} \} + f_{n-1}(ax), \quad n \ge 1, \quad f_0 = 0.$$

If for a fixed choice of $x > 0$ one defines the function $\varphi_{n,x}(u) = \alpha \sqrt{u} + \beta \sqrt{x - u} + f_{n-1}(ax)$,
then $\varphi'_{n,x}(u) = \alpha/(2\sqrt{u}) - \beta/[2\sqrt{x - u}]$ for $u \in (0, x)$ and $\varphi'_{n,x} = 0$ for the single point
$0 < \hat u_n(x) = [\alpha^2/(\alpha^2 + \beta^2)] \cdot x < x$, at which the maximum of $\varphi_{n,x}(u)$ with respect to $u \in [0, x]$
is reached. Therefore $f_n(x) = \alpha \sqrt{\hat u_n} + \beta \sqrt{x - \hat u_n} + f_{n-1}(ax) = \sqrt{\alpha^2 + \beta^2}\, \sqrt{x} + f_{n-1}(ax)$ for
$n \ge 1$ with $f_0(x) = 0$.
From this it follows easily that:

$$f_1(x) = \sqrt{(\alpha^2 + \beta^2)\, x},$$
$$f_2(x) = \sqrt{(\alpha^2 + \beta^2)\, x} + \sqrt{(\alpha^2 + \beta^2)\, ax} = \sqrt{(\alpha^2 + \beta^2)\, x}\, (1 + a^{1/2}),$$
$$f_n(x) = \sqrt{(\alpha^2 + \beta^2)\, x}\, (1 + a^{1/2} + \cdots + a^{(n-1)/2}),$$

where $f_n(x)$ denotes the gain of the $n$-step process.
Because $0 < \hat u_i \le x_{i+1}$ $(i = 1, \ldots, n)$,
$\hat u_i(x_{i+1}) = [\alpha^2/(\alpha^2 + \beta^2)]\, x_{i+1} = [\alpha^2/(\alpha^2 + \beta^2)]\, a^{n-i} x$ $(i = 1, \ldots, n)$.
Since the reverse numbering has been used, the sequence $[\alpha^2/(\alpha^2 + \beta^2)]\, x$, $[\alpha^2/(\alpha^2 + \beta^2)]\, ax$, ...,
$[\alpha^2/(\alpha^2 + \beta^2)]\, a^{n-1} x$ is the optimal strategy of the $n$-step process considered with the gain

$$f_n(x) = \sqrt{(\alpha^2 + \beta^2)\, x}\ \sum_{k=0}^{n-1} a^{k/2},$$

where the amount $x_n = a^n x$ is available after the $n$th step.
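
The closed form of Example 6 can be compared with a direct numerical evaluation of the functional equation. The following Python sketch is an illustration only: the function names, the grid search over $u$ and the numerical values $\alpha = 1$, $\beta = 2$, $a = b = 0.5$, $x = 100$ are assumptions made for the example. It relies on the special feature $a = b$ of Example 6, by which the next state $ax$ does not depend on the decision $u$.

```python
import math


def closed_form_gain(n, x, alpha, beta, a):
    """f_n(x) = sqrt((alpha^2 + beta^2) x) * (1 + a^(1/2) + ... + a^((n-1)/2))."""
    return math.sqrt((alpha**2 + beta**2) * x) * sum(a ** (k / 2) for k in range(n))


def numeric_gain(n, x, alpha, beta, a, grid=2000):
    """Evaluate f_n(x) = max_u {alpha*sqrt(u) + beta*sqrt(x - u)} + f_{n-1}(a*x) on a grid."""
    if n == 0:
        return 0.0
    # u = t*x with t = j/grid in [0, 1]; writing x - u as x*(1 - t) keeps the root argument >= 0.
    one_step = max(alpha * math.sqrt(x * (j / grid)) + beta * math.sqrt(x * (1 - j / grid))
                   for j in range(grid + 1))
    return one_step + numeric_gain(n - 1, a * x, alpha, beta, a, grid)


exact = closed_form_gain(4, 100.0, 1.0, 2.0, 0.5)
approx = numeric_gain(4, 100.0, 1.0, 2.0, 0.5)
print(abs(exact - approx) < 1e-3)  # True
```

The grid value can never exceed the exact maximum, so the comparison also serves as a check on the derivation of $\hat u_n(x)$.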


Future developments. In contrast to the loading problem, the maximization of gain represents 
an n-step discrete process for which the permissible region of the decision quantities forms a con- 
nected compact set — in this case a closed interval. This is therefore a special case of those discrete 
n-step processes for which the region of the decision quantities is represented generally by a closed 
and bounded region of space of corresponding dimension. In such a simple case the method of 
functional equations can often be applied with success. In more complicated cases, especially when 
the decision at every step is characterized by a decision vector $u = (u_1, \ldots, u_m)$, $m \ge 2$, further
methods, such as the method of multipliers or the method of successive approximations, are used with 
the aim of reducing the original problem to finitely many simpler problems for which the store 
of a computer is adequate. The discrete deterministic problems form only a part of the decision 
problems that occur in dynamic optimization. It is sometimes advantageous to consider processes
with infinitely many steps, although such a process never occurs in practice. For if a discrete process
with very many steps is given and the passage to the limit $n \to \infty$ in the $f_n$-equations is possible,
then instead of these relations one obtains a single functional equation

$$f(x) = \lim_{n \to \infty} f_n(x) = \max_{0 \le u \le x} \{ g(x, u) + f(T(x, u)) \}.$$


In general, this functional equation is easier to solve than the original problem and for large n it 
gives a good approximation to the required solution. The theory of stationary processes examines 
under what conditions this method is applicable. 

Another class of optimization problems in dynamic optimization deals with decision processes 
in which, at every instant of time in the given time interval, a decision is possible and even required. 
One then speaks of continuous decision processes. The corresponding theory is closely linked with 
variational calculus and with the theory of optimal processes according to PONTRYAGIN. The mathe- 
matical requirements are quite demanding. 

In contrast to the discrete deterministic processes are the discrete stochastic processes, for which 
the state at the end of a step is known only in the form of a probability distribution. These processes 
are often closer to the multifarious problems in economics than are deterministic models, and the 
methods of dynamic optimization have also been extended in this direction. 


III. Brief reports on selected topics


31. Number theory


The integers
Algebraic numbers
Transcendental numbers


The original task of number theory was the investigation of the properties of the integers. Its 
systematic development as a branch of mathematics came rather late. Individual results were known 
in antiquity, for example to EUCLID (about 300 B.C.) and DIOPHANTOS (about 250 A.D.). In the
17th century remarkable discoveries of scientific significance occurred, above all, in the investigations
of Pierre FERMAT (1601-1665). Great steps forward were taken in the many works of Leonhard EULER
(1707-1783), which are full of fruitful far-reaching ideas. At last Carl Friedrich GAUSS (1777-1855)
set up a uniform theory. In 1801 he published his Disquisitiones arithmeticae, a monumental work, 
which was the foundation of higher arithmetic in the strict sense. 

Nowadays number theory is a widely ramified theory, making extensive use of both abstract 
algebra (mainly in algebraic number theory) and of profound methods of analysis (in analytic number 
theory). This leads to problems and to new branches that only have indirect connections with the 
integers. 

In contrast to other parts of mathematics, many questions and results of number theory are 
comprehensible to the mathematical layman without specialized knowledge. But it turns out that 
proofs of theorems frequently require an extensive mathematical apparatus. 

Gauss called mathematics the queen of the sciences, and in 1808 (in a letter to his friend BOLYAI) 
he said: ‘It is remarkable that all those who study this science seriously develop a kind of passion 
for it’. 


Rings and fields. The basic facts of elementary number theory have been treated in Chapter 1. 
From the theory of divisibility it is known that the quotient of two integers may, but need not 
be, an integer; for example, 15/3 = 5, but 15/7 is not an integer. One says: division, the inverse 
operation to multiplication, cannot always be performed within the domain of integers. Number 
systems in which the operations of addition, subtraction and multiplication can be performed 
without restriction are called rings. If division (except by 0) can also always be carried out in a
number system, one speaks of a field; for example, the rational numbers form a field. In what
follows, the ring of integers 0, ±1, ±2, ... is denoted by Z and the field of rational numbers by Q.
If a and b are two numbers in Z and b ≠ 0, the quotient a/b lies in Q, but not necessarily in Z. If
the latter happens, then a is said to be divisible by b or a multiple of b.


Ideals. Apart from Z one needs rings R whose elements may be real or complex numbers. Of
great importance are subsets J of a ring R that have the following properties:
(i) if a and b are numbers in J, then so is a − b;
(ii) for every number r in R and every number a in J the product ra also lies in J.
These subsets J of R are called ideals in R.

31-1 The ideal (3) on the number line

For example, if m is a natural number, the totality of numbers 0, ±m, ±2m, ±3m, ... is an ideal
in Z (Fig.). Clearly, the difference of two integral multiples of m is again a multiple of m (Property (i))
and every multiple of a multiple of m is itself a multiple of m (Property (ii)). In such a case one denotes
the ideal by M = (m) to indicate that it consists of all the multiples of the number m. Ideals consisting
of all the multiples of a single element of a ring R, in the present case m, are called principal ideals.
It is easy to verify that in Z every ideal is principal, so that all ideals (m) in Z are obtained by setting,
in turn, m = 0, 1, 2, ...

A concept of divisibility can also be defined for ideals. One says that an ideal A is divisible by
an ideal B if every element of A is also an element of B, in other words, if A ⊆ B in the set-theoretical
sense. Apparently the naive sense of the word ‘divisible’ is turned into its opposite, but the connection
with the theory of divisibility immediately clarifies the reason for the nomenclature. An application
of the definition to two ideals A = (a) and B = (b) in Z shows that A is divisible by B
if and only if a is divisible by b. For example, the ideal (2) consists of the numbers 0, ±2, ±4, ±6,
±8, ... and the ideal (4) of the numbers 0, ±4, ±8, ±12, ... Hence (4) ⊆ (2) set-theoretically; but
this means that the ideal (4) is divisible by the ideal (2), because 4 is divisible by 2. Next, one defines
a product AB of two ideals A and B, namely as the ideal consisting of all sums of finitely many
products ab, where now a and b are arbitrary elements of A and B, respectively. In Z it follows that
for A = (a) and B = (b) the product AB = (ab), for example (2) · (4) = (8).

The concept of an ideal was created in the development of algebraic number theory. Ideal theory
studies the structure of rings and their ideals. In order to simplify the statement of results it is
convenient to use the following mode of speaking: let a and b be numbers of a ring R and J an
ideal in R; one says that a ≡ b (mod J) (read: a congruent to b modulo J) if a − b is a number in J.
This relation of congruence is an equivalence relation, being transitive, symmetric and reflexive.
On the basis of these properties all the numbers of R can be divided into disjoint residue classes
mod J, in such a way that all the numbers congruent mod J belong to one and the same class. The
significance of the formal way of writing lies in the fact that most calculating rules for equations
also hold for congruences with respect to a fixed modulus.

Historically, the concept of congruence was first created by GAUSS for the ring Z. Here a ≡ b
(mod m) means that the difference of the two integers a and b lies in the ideal M = (m), so that
a − b is divisible by m, or that a and b have the same remainder on division by m.


Examples: 88 ≡ −10 (mod 14) because 88 − (−10) = 98 is divisible by 14; $3^7$ ≡ 1 (mod 1093);
$2^{32}$ ≡ −1 (mod 641).
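
Congruences of this kind are easily checked by machine. The following Python sketch is an illustration only; the function name `congruent` is an assumption, and the exponents of the last two examples are read as $3^7$ and $2^{32}$. It tests directly whether the difference a − b lies in the ideal (m).

```python
def congruent(a, b, m):
    """True if a is congruent to b modulo m, that is, if a - b is a multiple of m."""
    return (a - b) % m == 0


print(congruent(88, -10, 14))      # True, since 98 = 7 * 14
print(congruent(3**7, 1, 1093))    # True, since 3**7 = 2187 = 2 * 1093 + 1
print(congruent(2**32, -1, 641))   # True, since 641 divides 2**32 + 1
print(congruent(15, 1, 4))         # False
```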


The integers 


Certain properties of the residue classes in the ring of integers Z can be studied in relation to 
divisibility of numbers. Let (a, b) denote the greatest common divisor (gcd) of two numbers a and b 
and let p denote a prime number. 


The residue class ring mod m. The residue classes mod m can be made into a ring by defining
an addition and a multiplication of residue classes. This will be illustrated by an example. Let $r_1$
be the residue class 2 mod 6 consisting of the numbers ... −10, −4, 2, 8, 14, ..., let $r_2$ be the residue
class 5 mod 6 consisting of the numbers ... −7, −1, 5, 11, ... Then $r_1 + r_2$ is defined as the residue
class which contains 2 + 5 = 7, hence also, in fact, all the numbers that on division by 6 leave
the remainder 1. One writes 2 + 5 = 1; other examples are 5 + 0 = 5, 3 + 3 = 0. The product
is defined by 2 · 5 = 10 = 4; other examples mod 6 are 3 · 0 = 0, 3 · 3 = 3, 2 · 3 = 0.

Under the addition and multiplication just defined the residue classes mod m form the so-called 
residue class ring mod m. The structure of this ring can be described in precise statements. It is a 
field if and only if m is a prime number. 
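
The statement about fields can be tested for small moduli by simply searching for inverses. The following Python sketch is an illustration only; the function name `is_field` and the range of moduli examined are assumptions made for the example. A residue class ring is treated as a field when every class other than 0 has a multiplicative inverse.

```python
def is_field(m):
    """True if every non-zero residue class mod m has a multiplicative inverse."""
    return m > 1 and all(any(a * x % m == 1 for x in range(1, m)) for a in range(1, m))


# Arithmetic of residue classes mod 6 as in the examples of the text.
print((2 + 5) % 6, (2 * 5) % 6, (3 * 3) % 6, (2 * 3) % 6)  # 1 4 3 0

# Moduli below 20 for which the residue class ring is a field.
print([m for m in range(2, 20) if is_field(m)])  # [2, 3, 5, 7, 11, 13, 17, 19]
```

For m = 6 the search fails already for the class 2, in agreement with 2 · 3 = 0.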


The group of prime residue classes. By selecting from the m distinct residue classes 0, 1, 2, ...,
m − 1 mod m all those whose numbers are prime to m one obtains the prime residue classes mod m.
For m = 6 there are two prime residue classes 1 and 5; for m = p there are always p − 1, namely
1, 2, ..., p − 1.

The number of prime residue classes mod m is denoted by $\varphi(m)$ (Euler’s function). For example,
$\varphi(6) = 2$, $\varphi(p) = p - 1$; $\varphi(m)$ is a number-theoretical function, that is, a function defined for
integer arguments. It is a multiplicative function, that is, $\varphi(ab) = \varphi(a)\, \varphi(b)$, provided that (a, b) = 1.
It is easy to count out that $\varphi(p^k) = p^k - p^{k-1} = p^{k-1}(p - 1)$. The rule just stated makes it possible
to calculate $\varphi(m)$ for every m; for example, $\varphi(3240) = \varphi(2^3)\, \varphi(3^4)\, \varphi(5) = 4 \cdot 54 \cdot 4 = 864$.
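
The calculation of Euler’s function from the prime factorization can be sketched as follows. The Python code is an illustration only; the function names and the use of trial division are assumptions made for the example. It applies the value for prime powers to each prime factor in turn and multiplies the results, and a second function counts the prime residue classes directly as a check.

```python
from math import gcd


def euler_phi(m):
    """Euler's function via trial division, using phi(p^k) = p^k - p^(k-1)."""
    result, n, p = 1, m, 2
    while p * p <= n:
        if n % p == 0:
            power = 1
            while n % p == 0:        # split off the full power of p
                n //= p
                power *= p
            result *= power - power // p
        p += 1
    if n > 1:                        # one prime factor may remain
        result *= n - 1
    return result


def count_prime_residue_classes(m):
    """Direct count of the classes a mod m with (a, m) = 1."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


print(euler_phi(6), euler_phi(11), euler_phi(3240))  # 2 10 864
print(count_prime_residue_classes(3240))             # 864
```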

Under the operation of multiplication the prime residue classes mod m form a group $G_m$ of order
$\varphi(m)$. The structure of $G_m$ is important for all m, but here only the case m = p will be treated. $G_p$
is cyclic, which means that every prime residue class mod p can be written as a power of a fixed
residue class g; such a g is called a primitive root mod p. For example, for p = 11 the residue class
group $G_{11}$ can be generated by g = 2 (or by 6, 7, 8), for the powers $2^0 = 1$, $2^1 = 2$, $2^2 = 4$, $2^3 = 8$,
$2^4 = 5$, $2^5 = 10$, $2^6 = 9$, $2^7 = 7$, $2^8 = 3$, $2^9 = 6$ (mod 11) give all the p − 1 = 10 prime residue
classes 1, 2, ..., 10. Since every element a of a finite group G of order n satisfies the equality $a^n = e$
(e the unit element), one has for $G = G_m$ Euler's theorem:


Euler's theorem. $a^{\varphi(m)} \equiv 1 \pmod{m}$ when (a, m) = 1. In the special case m = p this becomes
Fermat's theorem: $a^{p-1} \equiv 1 \pmod{p}$ when a is not divisible by p.
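The primitive roots mod 11 and Euler's theorem can be verified numerically. The sketch below is an illustration only; `order`, `primitive_roots` and `euler_theorem_holds` are names chosen here, and the search is a plain brute force that is practical only for small moduli.

```python
from math import gcd


def order(a: int, m: int) -> int:
    """Multiplicative order of a mod m; a must be prime to m."""
    k, x = 1, a % m
    while x != 1:
        x = (x * a) % m
        k += 1
    return k


def primitive_roots(p: int) -> list:
    """All g in 1..p-1 whose powers run through every prime residue class mod the prime p."""
    return [g for g in range(1, p) if order(g, p) == p - 1]


def euler_theorem_holds(m: int) -> bool:
    """Check a^phi(m) = 1 (mod m) for every a prime to m."""
    units = [a for a in range(1, m + 1) if gcd(a, m) == 1]
    phi = len(units)
    return all(pow(a, phi, m) == 1 % m for a in units)


assert primitive_roots(11) == [2, 6, 7, 8]
assert [pow(2, k, 11) for k in range(10)] == [1, 2, 4, 8, 5, 10, 9, 7, 3, 6]
assert all(euler_theorem_holds(m) for m in range(1, 100))
```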


Congruences with unknowns. In the residue class ring and residue class group certain algebraic
problems can be solved. For example, one can ask what residue classes x mod m satisfy given
equations, say ax = b. Such questions lead to congruences mod m with unknowns. The linear
congruence $ax \equiv b \pmod{m}$ cannot always be solved; for example, $3x \equiv 2 \pmod{12}$ is insoluble
because no integer multiple of 3 leaves the remainder 2 on division by 12. In fact, $ax \equiv b \pmod{m}$
is soluble if and only if b is divisible by the greatest common divisor (a, m). A congruence of degree n,
$x^n + a_1 x^{n-1} + \cdots + a_n \equiv 0 \pmod{m}$, can have more incongruent solutions than its degree n
indicates; for example, $x^2 \equiv 1 \pmod{8}$ has the solutions x = 1, 3, 5, 7. If the modulus m is a prime
number p, it need not have solutions at all and cannot have more than n residue classes as solutions.
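The solubility criterion for linear congruences translates directly into a short procedure. The following Python sketch is one possible way to do it, not a method given in the text; `solve_linear_congruence` is a hypothetical name, and the modular inverse via `pow(a, -1, m)` assumes Python 3.8 or later.

```python
from math import gcd


def solve_linear_congruence(a: int, b: int, m: int) -> list:
    """All residues x in 0..m-1 with a*x = b (mod m); empty list if insoluble."""
    d = gcd(a, m)
    if b % d != 0:
        return []                      # b is not divisible by (a, m)
    a1, b1, m1 = a // d, b // d, m // d
    # a1 is prime to m1, so it has an inverse mod m1 (trivial case m1 = 1 handled separately).
    x0 = 0 if m1 == 1 else (b1 * pow(a1, -1, m1)) % m1
    return [x0 + k * m1 for k in range(d)]


assert solve_linear_congruence(3, 2, 12) == []          # the insoluble example
assert solve_linear_congruence(3, 6, 12) == [2, 6, 10]  # d = 3 incongruent solutions

# A congruence of degree 2 with four solutions mod 8:
assert [x for x in range(8) if (x * x) % 8 == 1] == [1, 3, 5, 7]
```

When the congruence is soluble there are exactly d = (a, m) incongruent solutions mod m, spaced m/d apart.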

Power residues. In the binomial congruence $x^n \equiv a \pmod{p}$ let a be not divisible by p. Those residue
classes a (mod p) for which this congruence is soluble are called nth power residues mod p. Two
fundamental questions arise:


1. What numbers are nth power residues for a given prime? 
2. For what primes p is a given number a an nth power residue?


The first question is answered by Euler's criterion $a^{(p-1)/d} \equiv 1 \pmod{p}$, where d = (p − 1, n).
Those and only those residue classes a that satisfy this condition are nth power residues. The answer
to the second question leads to reciprocity laws, which are among the most beautiful and profound
results of number theory. For n = 2 the residue classes a (mod p) with (a, p) = 1 for which the
congruence $x^2 \equiv a \pmod{p}$ is soluble are called (quadratic) residues and those for which the congruence
is insoluble (quadratic) non-residues. For odd p there are equally many residues and non-residues
mod p, namely (p − 1)/2 each. For example, mod 17 one has $1^2 = 1$, $2^2 = 4$, $3^2 = 9$, $4^2 = 16$,
$5^2 = 8$, $6^2 = 2$, $7^2 = 15$, $8^2 = 13$. Hence 1, 2, 4, 8, 9, 13, 15, 16 are the eight residues, and 3, 5, 6,
7, 10, 11, 12, 14 the eight non-residues mod 17.
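Euler's criterion can be compared with a direct search for nth powers. The sketch below is illustrative; the names `is_power_residue` and `power_residues_by_search` are assumptions made for this example.

```python
from math import gcd


def is_power_residue(a: int, n: int, p: int) -> bool:
    """Euler's criterion for a prime p and a not divisible by p:
    a is an nth power residue iff a^((p-1)/d) = 1 (mod p), with d = gcd(p-1, n)."""
    d = gcd(p - 1, n)
    return pow(a, (p - 1) // d, p) == 1


def power_residues_by_search(n: int, p: int) -> list:
    """The nth power residues mod p, found by raising every class to the nth power."""
    return sorted({pow(x, n, p) for x in range(1, p)})


quadratic_residues = [a for a in range(1, 17) if is_power_residue(a, 2, 17)]
assert quadratic_residues == [1, 2, 4, 8, 9, 13, 15, 16]

# The criterion agrees with the search for several primes and exponents.
for p in (7, 11, 13, 17):
    for n in range(2, 7):
        by_criterion = [a for a in range(1, p) if is_power_residue(a, n, p)]
        assert by_criterion == power_residues_by_search(n, p)
```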


The law of reciprocity. The investigation of the question for what moduli p a given number a
is a quadratic residue has led to the discovery of the celebrated quadratic reciprocity law, which was
set up by EULER on the basis of extensive numerical material. Let p and q be odd prime
numbers. If at least one of them is of the form 4k + 1, then the congruences $x^2 \equiv p \pmod{q}$ and
$y^2 \equiv q \pmod{p}$ are both soluble or both insoluble (in other words, p is a residue mod q if and only
if q is a residue mod p). But if both p and q are of the form 4k + 3, then only one of the two congruences
is soluble, the other insoluble. The first complete proof of the law was found by Gauss when
he was 18 years old. Later he gave six further proofs of what he called Theorema fundamentale,
and altogether more than fifty different proofs have been found since. The diverse principles employed
and the endeavour to find reciprocity laws for higher power residues have given a powerful impetus
to number theory.


About ten years before Gauss's proof, a proof of the quadratic reciprocity law was published
by LEGENDRE (1752-1833), but it contained a gap. Legendre introduced a useful symbol $\left(\frac{a}{p}\right)$
(the Legendre symbol), which has the value +1, −1, or 0 according as a is a residue (mod p), a
non-residue (mod p), or divisible by p. This makes it possible to express the law in a formula:

$$\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}.$$


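Supplements to the law of reciprocity are the propositions

$$\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}} \quad \text{and} \quad \left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}.$$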


The first of these states that the congruence $x^2 \equiv -1 \pmod{p}$ is soluble when p is of the form
4k + 1, hence (p − 1)/2 is even, and it is insoluble when p has the form 4k + 3, hence (p − 1)/2
is odd.
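The reciprocity law and its two supplements lend themselves to a numerical check. In the following sketch the Legendre symbol is evaluated by Euler's criterion; this is an assumption of the example (other evaluation methods exist), and the name `legendre` is chosen here for illustration.

```python
def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, computed as a^((p-1)/2) mod p."""
    r = pow(a % p, (p - 1) // 2, p)
    return r - p if r == p - 1 else r   # map the residue p - 1 to -1


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]

# Quadratic reciprocity for all pairs of distinct odd primes in the list.
for p in odd_primes:
    for q in odd_primes:
        if p != q:
            sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
            assert legendre(p, q) * legendre(q, p) == sign

# The two supplements.
for p in odd_primes:
    assert legendre(-1, p) == (-1) ** ((p - 1) // 2)
    assert legendre(2, p) == (-1) ** ((p * p - 1) // 8)
```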


Diophantine equations. Let $f(x_1, x_2, \ldots, x_n)$ be a polynomial in $x_1, x_2, \ldots, x_n$ with integer coefficients.
An equation $f(x_1, x_2, \ldots, x_n) = b$ is called Diophantine if a solution is required to consist
of integers. This kind of equation is named after the Greek mathematician DIOPHANTOS of Alexandria.
A system of Diophantine equations $f_1 = b_1$, $f_2 = b_2$, ..., $f_k = b_k$ is equivalent to the
single equation $(f_1 - b_1)^2 + (f_2 - b_2)^2 + \cdots + (f_k - b_k)^2 = 0$, that is, the solution sets in
Z of the system and of the single equation are the same. A linear Diophantine equation
$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$ ($a_1, a_2, \ldots, a_n, b$ integers)
is soluble if and only if b is divisible by the gcd $(a_1, a_2, \ldots, a_n)$.

If it is soluble at all, then there are infinitely many solving n-tuples. The case n = 2 is easy to
survey. If $a_1$ and $a_2$ are coprime and $x_1^0, x_2^0$ is any solution of $a_1 x_1 + a_2 x_2 = b$, then the totality
of solutions can be represented in the form $x_1 = x_1^0 + a_2 t$, $x_2 = x_2^0 - a_1 t$, where t is an arbitrary
integer. A special solving pair can be obtained from the last but one approximating fraction for the
continued fraction expression of $a_1/a_2$.


Example: $43x_1 + 19x_2 = b$. The approximating fractions (see Chapter 3.6. Continued
fractions) of 43/19 are 2/1, 7/3, 9/4, 43/19. The fraction 9/4 yields $x_1 = 4b$, $x_2 = -9b$, so that the
general solution can be written in the form $x_1 = 4b + 19t$, $x_2 = -9b - 43t$.
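The same special solution can be found with the extended Euclidean algorithm, which is closely related to the continued fraction expansion but is a substitute technique here, not the method named in the text. In the sketch below `extended_gcd` and `solve_linear_diophantine` are hypothetical helper names.

```python
def extended_gcd(a: int, b: int):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def solve_linear_diophantine(a1: int, a2: int, b: int):
    """Return (x1, x2, step1, step2) such that every solution of a1*x1 + a2*x2 = b
    is (x1 + step1*t, x2 + step2*t) for an integer t; None if insoluble."""
    g, x, y = extended_gcd(a1, a2)
    if b % g != 0:
        return None
    return x * (b // g), y * (b // g), a2 // g, -(a1 // g)


# 43*4 + 19*(-9) = 1, matching the special solution x1 = 4b, x2 = -9b.
assert extended_gcd(43, 19) == (1, 4, -9)

x1, x2, s1, s2 = solve_linear_diophantine(43, 19, 6)
for t in range(-3, 4):
    assert 43 * (x1 + s1 * t) + 19 * (x2 + s2 * t) == 6
```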




For n ≥ 3 more general continued fraction methods have been developed. If a linear system of
m independent equations is given, $\sum_{j=1}^{n} a_{ij} x_j = b_i$, i = 1, 2, ..., m, and if m > n, the system is
overdetermined and does not have solutions, in general. But if m < n, the system is underdetermined
and has, in general, infinitely many solutions. More accurately, in the last case the system is soluble
in integers $x_1, x_2, \ldots, x_n$ if and only if the greatest common divisor of the m-rowed determinants
of the coefficient matrix $(a_{ij})$ is equal to the gcd of the m-rowed determinants of the augmented
coefficient matrix, which arises from $(a_{ij})$ by adding the column of the $b_i$.

The problem of solving the most general Diophantine equation of the second degree in two
unknowns $x_1, x_2$,


$c_{11} x_1^2 + c_{12} x_1 x_2 + c_{22} x_2^2 + c_{13} x_1 + c_{23} x_2 + c_{33} = 0$,


with integers $c_{ij}$ can be reduced by a transformation of variables to the same problem for an equation
of the special form $y_1^2 - D y_2^2 = b$ with integers D and b. Two cases are to be distinguished. If
D < 0, then there are no solutions or finitely many solutions $y_1, y_2$. If D > 0 (D not a square)
and b = 1, then the so-called Pell's equation $y_1^2 - D y_2^2 = 1$ has, apart from the trivial solutions
$y_1 = \pm 1$, $y_2 = 0$, infinitely many solutions, which can be obtained from a minimal solution. It
is easy to see that the more general equation $y_1^2 - D y_2^2 = b$ cannot have solutions at all if b is a
quadratic non-residue (mod D).


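Examples: 1. $x_1^2 + x_1 x_2 + x_2^2 = 19$ has exactly 12 solution pairs: 2, 3; −2, −3; 3, 2; −3, −2;
2, −5; −2, 5; −5, 2; 5, −2; 3, −5; −3, 5; −5, 3; 5, −3.
2. $y_1^2 - 5y_2^2 = 1$. From the minimal solution $y_1 = 9$, $y_2 = 4$ one obtains for n = 0, 1, 2, ...
the totality of solutions $y_1 = \pm[(9 + 4\sqrt{5})^n + (9 - 4\sqrt{5})^n]/2$, $y_2 = \pm[(9 + 4\sqrt{5})^n - (9 - 4\sqrt{5})^n]/(2\sqrt{5})$.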
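Both examples can be reproduced with integer arithmetic. The sketch below enumerates the solution pairs of the first equation by brute force and generates solutions of the Pell equation by repeated multiplication with the minimal solution, which is equivalent to taking the powers $(9 + 4\sqrt{5})^n$. The function name `pell_solutions` and the search bound are assumptions of this example.

```python
def pell_solutions(D: int, y1: int, y2: int, count: int) -> list:
    """First `count` positive solutions of y1^2 - D*y2^2 = 1, generated from the
    minimal solution (y1, y2) via (a + b*sqrt(D)) * (y1 + y2*sqrt(D))."""
    solutions = []
    a, b = y1, y2
    for _ in range(count):
        solutions.append((a, b))
        a, b = a * y1 + D * b * y2, a * y2 + b * y1
    return solutions


# Example 1: all integer pairs with x1^2 + x1*x2 + x2^2 = 19.
# The form is at least (3/4)*x^2, so |x1|, |x2| <= 5 and the bound 6 is safe.
pairs = [(x, y) for x in range(-6, 7) for y in range(-6, 7)
         if x * x + x * y + y * y == 19]
assert len(pairs) == 12 and (2, 3) in pairs and (-5, 2) in pairs

# Example 2: solutions of y1^2 - 5*y2^2 = 1 starting from (9, 4).
for a, b in pell_solutions(5, 9, 4, 6):
    assert a * a - 5 * b * b == 1
assert pell_solutions(5, 9, 4, 2) == [(9, 4), (161, 72)]
```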


The problem of finding all right-angled triangles for which the adjacent sides x, y of the right
angle and the hypotenuse z are integral multiples of the unit of length leads to the Diophantine
equation $x^2 + y^2 = z^2$. Its solutions, the Pythagorean numbers, can be represented by the formulae
$x = u^2 - v^2$, $y = 2uv$, $z = u^2 + v^2$, where u and v are arbitrary integers. The smallest solutions
are $3^2 + 4^2 = 5^2$ and $5^2 + 12^2 = 13^2$.
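The formulae for the Pythagorean numbers are easy to try out. A minimal sketch, with the function name `pythagorean_triple` chosen here for illustration:

```python
def pythagorean_triple(u: int, v: int):
    """Return (x, y, z) = (u^2 - v^2, 2uv, u^2 + v^2)."""
    return u * u - v * v, 2 * u * v, u * u + v * v


assert pythagorean_triple(2, 1) == (3, 4, 5)
assert pythagorean_triple(3, 2) == (5, 12, 13)

# The identity x^2 + y^2 = z^2 holds for every choice of u and v.
for u in range(1, 20):
    for v in range(0, u):
        x, y, z = pythagorean_triple(u, v)
        assert x * x + y * y == z * z
```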

There is a surprising difference between the cases n = 2 and n ≥ 3. A theorem of Thue states
that an equation $a_0 x^n + a_1 x^{n-1} y + \cdots + a_n y^n = b$ (n ≥ 3, $a_0 \neq 0$) with integers $a_0, a_1, \ldots, a_n$
has only finitely many solutions unless the left-hand side can be split into homogeneous factors of
lower degree with integer coefficients. Diophantine equations of higher degree lead to deep problems
whose solution requires a knowledge of the theory of algebraic number fields.

Among them Fermat’s conjecture (also called Fermat’s last theorem) has met with particular 
interest: for any integer exponent n > 2 there are no non-zero integers x, y, z satisfying the equation 
x" + y" = z". FERMAT had written (about 1637) in the margin of his own copy of the works of 
DIOPHANTOS: ‘I have discovered a truly wonderful proof, but this margin is too small to contain 
it’. His proof has never been found, and in spite of the efforts of famous mathematicians no one 
yet has succeeded in proving or disproving the conjecture. Interesting partial results have been obtained;
for example, it is known that the conjecture is correct for all exponents n up to 125,000.


Analytic number theory. Apart from Euler's function $\varphi(n)$, many other number-theoretical
functions f(n), with the set of natural numbers as domain of definition, have been investigated.

Examples are: $\pi(x)$, the number of primes ≤ x; d(n), the number of divisors (Fig.); $\sigma(n)$, the sum
of the positive divisors of n; r(n), the number of integer solution pairs x, y of the equation
$x^2 + y^2 = n$. Some of these functions show a very erratic behaviour, for example, d(1) = 1,
d(2) = 2, d(3) = 2, d(4) = 3, d(5) = 2, d(6) = 4, d(7) = 2, d(8) = 4 etc. Nevertheless, frequently
a regularity can be found for the average of the first n function values of f. The function

[Figure: d(n), total number of positive divisors of n]

$[f(1) + f(2) + \cdots + f(n)]/n$

in many cases behaves asymptotically for increasing n like an analytic function of n. For example,
$(1/n) \sum_{k=1}^{n} d(k)$ grows like the natural logarithm of n. Setting $[d(1) + d(2) + \cdots + d(n)]/n = \ln n + R(n)$
one finds that R(n) is of smaller order of magnitude in n than ln n is. The investigation of averages
of number-theoretical functions and, above all, of their remainders R(n), requires the finest tools
of analysis. This branch of mathematics, analytic number theory, has occupied the minds of famous
mathematicians, beginning with EULER in the 18th century and going right to the present
day. It is still in a state of flux. Particular attention has been paid to the function $\pi(x)$, the number
of primes up to and including x.
GAUSS conjectured the asymptotic law $\pi(x) \sim x/\ln x$, whose validity was not proved until
about 1900 by HADAMARD and de la VALLÉE-POUSSIN. Their investigations showed that the best
approximation to $\pi(x)$ is the integral logarithm $\operatorname{li}(x) = \int_0^x \frac{dt}{\ln t}$ and that the estimate of the
remainder $R(x) = \pi(x) - \operatorname{li}(x)$ is closely connected with the position of the complex zeros of the
Riemann zeta-function $\zeta(s) = \sum_{n=1}^{\infty} 1/n^s$ with $s = \sigma + it$ and $\sigma > 1$.
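The two asymptotic statements of this section, on the average of d(n) and on the prime counting function, can be observed numerically. The following Python sketch prints a small comparison table; it is an illustration only, the function names `prime_counts` and `divisor_counts` are assumptions, and no claim is made about the size of the remainders.

```python
from math import log


def prime_counts(limit: int) -> list:
    """List pi with pi[x] = number of primes <= x for 0 <= x <= limit (limit >= 2)."""
    sieve = [False, False] + [True] * (limit - 1)
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, limit + 1, i):
                sieve[j] = False
    counts, running = [], 0
    for flag in sieve:
        running += flag
        counts.append(running)
    return counts


def divisor_counts(limit: int) -> list:
    """List d with d[n] = number of positive divisors of n for 1 <= n <= limit."""
    d = [0] * (limit + 1)
    for k in range(1, limit + 1):
        for multiple in range(k, limit + 1, k):
            d[multiple] += 1
    return d


LIMIT = 10 ** 5
pi = prime_counts(LIMIT)
d = divisor_counts(LIMIT)

for x in (10 ** 3, 10 ** 4, 10 ** 5):
    average_d = sum(d[1:x + 1]) / x
    print(x, pi[x], round(x / log(x)), round(average_d, 3), round(log(x), 3))
```

Each row shows x, the value of π(x), the approximation x/ln x, the average of d(1), ..., d(x), and ln x, so that the slow convergence of both comparisons can be seen side by side.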

Among the numerous other distribution problems there is the celebrated theorem, first proved 
by Lejeune DIRICHLET (1805-1859): Every arithmetic progression in which the initial term and the 
common difference are coprime contains infinitely many primes. 


Additive number theory. The questions of additive number theory will be illustrated by a few 
special theorems and problems. 


Fermat's theorem: Every prime number p with $p \equiv 1 \pmod{4}$ is the sum of the squares of two
natural numbers. The representation is unique apart from the order of the summands.


Example: $233 = 8^2 + 13^2$.


Lagrange’s theorem: Every natural number can be represented as the sum of the squares of at 
most four natural numbers. 


Example: 11 can be written as the sum of three squares: $11 = 3^2 + 1^2 + 1^2$; 7 requires four
squares: $7 = 2^2 + 1^2 + 1^2 + 1^2$.
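The theorems of Fermat and Lagrange can be tested on small numbers. The sketch below uses a direct search and a simple dynamic programme; both are illustrative choices, and the names `two_squares` and `min_squares` are assumptions of this example.

```python
from math import isqrt


def two_squares(p: int):
    """Return (a, b) with a <= b, a^2 + b^2 == p and a, b >= 1, or None."""
    for a in range(1, isqrt(p) + 1):
        rest = p - a * a
        b = isqrt(rest)
        if b >= a and b * b == rest:
            return a, b
    return None


def min_squares(n: int) -> int:
    """Smallest number of squares of natural numbers whose sum is n."""
    best = [0] * (n + 1)
    for i in range(1, n + 1):
        best[i] = 1 + min(best[i - s * s] for s in range(1, isqrt(i) + 1))
    return best[n]


assert two_squares(233) == (8, 13)
assert min_squares(11) == 3 and min_squares(7) == 4
assert all(min_squares(n) <= 4 for n in range(1, 500))   # Lagrange's theorem
```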


Waring's problem (raised in 1770 by WARING, first solved in 1909 by HILBERT). There is a
number-theoretical function g(k) such that every natural number can be represented as a sum
of at most g(k) kth powers of natural numbers. For example, g(2) = 4, by the theorem of
Lagrange, and g(3) = 9. It is interesting that for k ≥ 3 every sufficiently large n requires fewer
than g(k) summands. It is known that $239 = 4^3 + 4^3 + 3^3 + 3^3 + 3^3 + 3^3 + 1^3 + 1^3 + 1^3$ is
the largest number that requires 9 cubes; all larger numbers require at most 8 cubes, and it has been
proved that from a certain natural number onwards all subsequent numbers admit a decomposition
into at most 7 cubes.
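The statement about sums of cubes can be checked for a finite range by dynamic programming. This sketch only inspects the numbers up to an arbitrarily chosen bound of 2000 and therefore illustrates, but does not prove, the claim; `min_cubes_table` is a hypothetical helper name.

```python
def min_cubes_table(limit: int) -> list:
    """best[n] = smallest number of positive cubes with sum n, for 0 <= n <= limit."""
    cubes = [c ** 3 for c in range(1, limit + 1) if c ** 3 <= limit]
    best = [0] * (limit + 1)
    for i in range(1, limit + 1):
        best[i] = 1 + min(best[i - c] for c in cubes if c <= i)
    return best


table = min_cubes_table(2000)
needs_nine = [n for n, k in enumerate(table) if k == 9]

assert max(table) == 9                 # consistent with g(3) = 9 on this range
assert needs_nine == [23, 239]         # 239 is the larger of the two
assert 4**3 + 4**3 + 4 * 3**3 + 3 * 1**3 == 239
```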

Goldbach's conjecture. In a letter to EULER in 1742, GOLDBACH conjectured that every even
number n ≥ 6 is the sum of two odd primes. This conjecture has so far neither been proved nor
refuted. The best result in this direction, due to VINOGRADOV (born 1891) is the three primes 
theorem: Every sufficiently large odd integer is a sum of three odd primes. The proof uses very 
subtle analytic methods. 

Partitions. By a partition of a natural number n one understands a representation of n as a sum
of natural numbers. The total number of partitions of n is denoted by p(n), where the number of
terms is not restricted, equal summands are admitted, and the order of the summands is ignored.
Thus 5 = 4 + 1 = 3 + 2 = 3 + 1 + 1 = 2 + 2 + 1 = 2 + 1 + 1 + 1 = 1 + 1 + 1 + 1 + 1
are the seven partitions of 5, so that p(5) = 7. The function p(n) has many interesting properties,
for example, $p(5n + 4) \equiv 0 \pmod{5}$ and $p(n) \sim [1/(4n\sqrt{3})] \exp[\pi\sqrt{2n/3}]$ for large n. More generally,
one of the fundamental questions of additive number theory can be stated as follows. Let A
be a set of natural numbers and s ≥ 2 a given natural number. Can every natural number be represented
as a sum of at most s elements of A? By new methods and concepts (density and order of A)
such problems have been tackled successfully from 1930 onwards.
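The partition numbers can be computed with the standard counting recurrence over the allowed parts. The sketch below checks p(5) = 7 and the congruence for p(5n + 4), and prints the ratio of p(n) to the asymptotic formula for a few n; the function names are assumptions of this example.

```python
from math import exp, pi, sqrt


def partition_numbers(limit: int) -> list:
    """List p with p[n] = number of partitions of n for 0 <= n <= limit."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):          # allow parts of size `part`
        for n in range(part, limit + 1):
            p[n] += p[n - part]
    return p


def asymptotic_estimate(n: int) -> float:
    """The estimate exp(pi * sqrt(2n/3)) / (4 n sqrt(3)) quoted in the text."""
    return exp(pi * sqrt(2 * n / 3)) / (4 * n * sqrt(3))


p = partition_numbers(500)
assert p[5] == 7
assert all(p[5 * n + 4] % 5 == 0 for n in range(0, 99))

for n in (50, 200, 500):
    print(n, p[n], p[n] / asymptotic_estimate(n))
```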


Algebraic numbers 


Algebraic number fields. An algebraic number $\alpha$ is a complex number that is a root of an algebraic
equation f(x) = 0. Here $f(x) = a_0 x^m + \cdots + a_m$ is a polynomial over the field Q of rational numbers,
with $a_0 \neq 0$, m ≥ 1; in other words, the coefficients $a_0, a_1, \ldots, a_m$ are rational. The number $\alpha$ is
a root of infinitely many equations of various degrees; for example, $\alpha = \sqrt{3}$ satisfies the equations
$x^2 - 3 = 0$, $x^3 - x^2 - 3x + 3 = 0$, $x^4 - 9 = 0$ etc. But the polynomials for the last two equations
can be factored into polynomials of lower degree:


$x^3 - x^2 - 3x + 3 = (x^2 - 3)(x - 1)$ and $x^4 - 9 = (x^2 - 3)(x^2 + 3)$.


Such polynomials are called reducible. If it is impossible to factor a polynomial f over Q into
non-constant factors of lower degree, again with coefficients in Q, then f is called irreducible over Q;
for example, $f(x) = x^2 - 3$ is irreducible.

Now for an algebraic number $\alpha$ there is exactly one rationally irreducible polynomial g(x) with
leading coefficient 1 such that $g(\alpha) = 0$. The degree of the algebraic number $\alpha$ is defined as the degree
of g(x). For example, every rational number r is algebraic of the first degree, being the root of
x − r = 0; $(1 + i\sqrt{3})/2$ is of degree 2, being a root of $x^2 - x + 1 = 0$; and $\sqrt[n]{2}$ is of degree n,
as a root of $x^n - 2 = 0$.

The roots $\alpha^{(1)}, \alpha^{(2)}, \ldots, \alpha^{(n)}$ of g(x) = 0 (one of which is $\alpha$) are called the conjugates of $\alpha$ and are
all distinct. If in g(x) = 0 all the coefficients are integers, then $\alpha$ is called an algebraic integer.

Let $\vartheta$ be an algebraic number. The smallest field $k = Q(\vartheta)$ that contains Q and also $\vartheta$ is called
an algebraic number field. For example, $Q(\sqrt{3})$ consists of all numbers of the form $a + b\sqrt{3}$, where
a and b are rational numbers. It is easy to check that these numbers do, in fact, form a field; for
example, $1/(a + b\sqrt{3}) = (a - b\sqrt{3})/(a^2 - 3b^2) = A + B\sqrt{3}$. The degree of k is defined as the
(uniquely determined) degree n of any number $\vartheta$ for which $k = Q(\vartheta)$.
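The field operations in $Q(\sqrt{3})$ can be modelled with exact rational arithmetic. The class below is a minimal sketch, not a full implementation of a number field; the class name `QSqrt3` and its methods are assumptions introduced for this example.

```python
from fractions import Fraction


class QSqrt3:
    """An element a + b*sqrt(3) of Q(sqrt(3)) with rational a and b."""

    def __init__(self, a, b=0):
        self.a, self.b = Fraction(a), Fraction(b)

    def __eq__(self, other):
        return self.a == other.a and self.b == other.b

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(3)"

    def __add__(self, other):
        return QSqrt3(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b√3)(c + d√3) = (ac + 3bd) + (ad + bc)√3
        return QSqrt3(self.a * other.a + 3 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def inverse(self):
        # 1/(a + b√3) = (a - b√3)/(a^2 - 3b^2); the denominator is non-zero
        # unless a = b = 0, because sqrt(3) is irrational.
        norm = self.a ** 2 - 3 * self.b ** 2
        return QSqrt3(self.a / norm, -self.b / norm)


x = QSqrt3(2, 1)
assert x.inverse() == QSqrt3(2, -1)
assert x * x.inverse() == QSqrt3(1, 0)
y = QSqrt3(Fraction(1, 2), 5)
assert y * y.inverse() == QSqrt3(1, 0)
```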


Quadratic number fields. Among the fields of special type the quadratic number fields have been
studied most thoroughly. Taking $\vartheta = \sqrt{d}$ one may assume that d is an integer not divisible by a
square. Under this condition one distinguishes between quadratic number fields $Q(\sqrt{d})$ with
$d \equiv 1 \pmod{4}$ and with $d \equiv 2$ or $3 \pmod{4}$. In the former case an integral basis of the field is given
by $\omega_1 = 1$, $\omega_2 = (-1 + \sqrt{d})/2$, in the latter by $\omega_1 = 1$, $\omega_2 = \sqrt{d}$. By this one means that every
algebraic integer of $Q(\sqrt{d})$ can be represented uniquely in the form $c_1\omega_1 + c_2\omega_2$ with rational
integers $c_1$ and $c_2$.


Cyclotomic fields. For every natural number n the n roots of the equation $z^n - 1 = 0$, a so-called
cyclotomic equation, are called the nth roots of unity, because when represented in the complex
plane they divide the unit circle around the origin into n equal parts. The n − 1 roots other than
z = 1 satisfy for n ≥ 2 the equation f(z) = 0, where

$f(z) = (z^n - 1)/(z - 1) = z^{n-1} + z^{n-2} + \cdots + z + 1$.

When n is a prime number p, then the polynomial f(z) is irreducible over Q. It is easy to see that
in this case the numbers $\omega, \omega^2, \ldots, \omega^{p-1}$ are all the roots of f(z) = 0, where $\omega$ is any one root. The
field $Q(\omega)$ is called a cyclotomic field. Its conjugates $Q(\omega), Q(\omega^2), \ldots, Q(\omega^{p-1})$, all of degree p − 1,
coincide with one another. For a general n the cyclotomic fields were studied intensively by
KUMMER in connection with his investigation on Fermat’s last theorem. His results were 
pioneering in the development of algebraic number theory. Even before him Gauss had 
thought out a method of solving cyclotomic equations. His theory also made it possible to 
indicate all the regular n-gons that can be constructed by ruler and compass. Thus, in the theory 
of cyclotomy (Greek: division of the circle) three domains of mathematics, namely geometry, 
algebra, and number theory, are in close interaction in a wonderful manner. 

Units. Algebraic integers of a special kind in an algebraic number field are the units $\varepsilon$, which
divide the number 1. For them the reciprocal $\varepsilon^{-1}$ is also an algebraic integer. In Q the only units
are ±1. But as a rule, an algebraic number field contains infinitely many units. They can all be
derived from finitely many among them (fundamental units) by multiplication and exponentiation
(Dirichlet's unit theorem). Besides Q the imaginary quadratic number fields (for which the generating
number $\vartheta$ is complex) have finitely many units. The study of algebraic number fields has led to
results of great interest.


Ideal theory. One could be tempted to assume on the basis of the laws of arithmetic in Q and Z 
that every algebraic integer can be split into a product of prime factors, which themselves are 
algebraic integers, uniquely apart from the order of the factors and from unit factors. But it was 
recognized in the 19th century that this assumption is false. For example, in the quadratic number
field $Q(\sqrt{-5})$ the number 6 has two distinct decompositions $6 = 2 \cdot 3 = (1 + \sqrt{-5})(1 - \sqrt{-5})$.
It must and can be shown that, apart from unit factors, the numbers 2, 3, $1 + \sqrt{-5}$, $1 - \sqrt{-5}$ cannot
be factored further in $Q(\sqrt{-5})$.
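One common way to show that these four numbers cannot be factored further uses the norm $N(a + b\sqrt{-5}) = a^2 + 5b^2$, which is multiplicative; this argument is supplied here as an illustration and is not spelled out in the text. The four numbers have norms 4, 9, 6, 6, so a proper factor would need norm 2 or 3, and the sketch below confirms that no algebraic integer $a + b\sqrt{-5}$ with rational integers a, b has such a norm. The function names are assumptions.

```python
def norm(a: int, b: int) -> int:
    """Norm of a + b*sqrt(-5), i.e. (a + b√-5)(a - b√-5) = a^2 + 5 b^2."""
    return a * a + 5 * b * b


def elements_with_norm(n: int) -> list:
    """All pairs (a, b) of integers with a^2 + 5 b^2 == n."""
    bound = int(n ** 0.5) + 1
    return [(a, b) for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1) if norm(a, b) == n]


def multiply(x, y):
    """Product of x = a + b√-5 and y = c + d√-5, each given as a pair."""
    (a, b), (c, d) = x, y
    return (a * c - 5 * b * d, a * d + b * c)


# The two decompositions of 6.
assert multiply((2, 0), (3, 0)) == (6, 0)
assert multiply((1, 1), (1, -1)) == (6, 0)

# Norms 4, 9, 6, 6: a proper factor would have norm 2 or 3, but none exists.
assert [norm(2, 0), norm(3, 0), norm(1, 1), norm(1, -1)] == [4, 9, 6, 6]
assert elements_with_norm(2) == [] and elements_with_norm(3) == []

# The only units (norm 1) are +1 and -1.
assert elements_with_norm(1) == [(-1, 0), (1, 0)]
```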

This appeared to indicate that no simple arithmetic theory of the algebraic integers could be 
possible. But then KUMMER found a way, which was later developed independently by KRONECKER 
and DEDEKIND. DEDEKIND created the theory of ideals. He replaced the algebraic integers by the 
ideals of the ring R of algebraic integers in an algebraic number field k and proved the main 
theorem: 


Every ideal in R, other than R itself and (0), can be represented as a product of prime ideals, uniquely 
apart from the order of the factors. 


For the ring Z of rational integers this means that every principal ideal (m) is uniquely, apart
from the order of the factors, a product of prime ideals $(p_1)(p_2) \cdots (p_n)$. But this is only another
way of stating the fundamental theorem of elementary number theory $m = \pm p_1 p_2 \cdots p_n$.

Ideal classes. Two ideals A and B of the ring R in k are said to be equivalent if there are two
principal ideals $(\alpha)$ and $(\beta)$ such that $(\alpha) A = (\beta) B$. On the basis of this concept of equivalence
one can split all the ideals in R into disjoint classes, and it turns out that the number h of resulting
classes is finite. The principal ideals form a class by themselves. In Z this is the only class, so that
here h = 1. The determination of h is a difficult task, but can be achieved with the help of analysis.
There is a transcendental formula for the class number, due to DIRICHLET. For special types of
field an arithmetic representation of h is known.


Transcendental numbers 


A number y that is not algebraic is called transcendental. It satisfies no algebraic equation with
integer coefficients. It is not immediately clear that transcendental numbers exist. LIOUVILLE
was the first to construct some explicitly, for example, $\sum_{n=1}^{\infty} 1/2^{n!}$. His method was based on theorems
which state that algebraic numbers cannot be approximated 'arbitrarily well' by rationals.
He proved, for example, that if $\alpha$ is an algebraic number of degree n, then a positive real
constant c can be found such that $|\alpha - r/s| > c \cdot s^{-n}$ for all rational numbers r/s (s > 0) different
from $\alpha$. The following theorem of Thue-Siegel-Roth is very deep: the inequality $|\alpha - r/s| < s^{-\mu}$,
where $\mu > 2$, $\alpha$ is any algebraic number, and r, s are rational integers (s > 0), has only finitely
many solutions r, s.

It is of great interest to know whether a particular number or a value of a given analytic function 
is transcendental or not. HERMITE was the first to prove (in 1873) that the base e of the natural
logarithms is transcendental. Shortly afterwards LINDEMANN succeeded in proving (in 1882) that
the area $\pi$ of the unit circle is also a transcendental number. This showed that the quadrature of
the circle is impossible, namely to draw with ruler and compass a square whose area is equal to 
that of a circle of given radius. 

More generally, if $\alpha \neq 0$, then $\alpha$ and $e^{\alpha}$ cannot both be algebraic. Consequently, the functions $e^x$
(for x ≠ 0) and ln x (x ≠ 0, 1) have transcendental values for algebraic arguments x. This result
is proved with the aid of complex analysis, which is also used in showing, for example, that $e^{\pi}$ is
transcendental. It is still not known whether $e^e$ or $e + \pi$ or Euler's constant $\gamma$ are transcendental.

Around the turn of the century David HILBERT (1862-1943) made a list of 23 important mathematical
problems. Among many others, his seventh problem has been solved: $\alpha^{\beta}$ is transcendental
whenever $\alpha \neq 0, 1$ is algebraic and $\beta$ is algebraic and irrational. This shows that the quotient of
the logarithms of two algebraic numbers is either rational or transcendental.

Many transcendency statements refer to elliptic integrals and elliptic functions. For example,
the perimeter of an ellipse for which the lengths of the axes are algebraic is transcendental. Questions
of algebraic independence of transcendental numbers play an important role in the theory of
transcendental numbers. Among them is the theorem of LINDEMANN.


If $\alpha_1, \alpha_2, \ldots, \alpha_n$ are algebraic numbers that are linearly independent over the field Q of rational
numbers, then there exists no relation $\beta_1 e^{\alpha_1} + \beta_2 e^{\alpha_2} + \cdots + \beta_n e^{\alpha_n} = 0$ with algebraic coefficients
$\beta_1, \beta_2, \ldots, \beta_n$, unless they all vanish.


32. Algebraic geometry 


Algebraic geometry developed from the theory of algebraic curves and surfaces and the n-dimensional
geometry of the Italian school. The first contributions to the theory of plane algebraic curves
were made by Isaac NEWTON (1643-1727), Colin MACLAURIN (1698-1746), Leonhard EULER
(1707-1783) and Gabriel CRAMER (1704-1752). The founder of algebraic geometry in the strict
sense was Max NOETHER (1844-1921). The Italian geometers, principally Corrado SEGRE (1863-1924),
Francesco SEVERI (1879-1961) and Federigo ENRIQUES (1871-1946) brought this discipline to complete
development. In this century an investigation of the foundations of the subject from the algebraic
point of view was undertaken by the German school, particularly by Emmy NOETHER (1882-1935),
the daughter of Max Noether, Bartel L. VAN DER WAERDEN (b. 1903) and Wolfgang GRÖBNER (b. 1899).


Algebraic curves and surfaces. The central concept of algebraic geometry is the algebraic variety
(AV) in n-dimensional projective space $S_n$, in which each point is given by the ratios $x_0 : x_1 : \cdots : x_n$
of n + 1 coordinates.

To explain this concept one first treats the case n = 2; then $S_2$ is a projective plane. A (plane)
algebraic curve of $S_2$ is defined by a homogeneous algebraic equation $F(x_0, x_1, x_2) = 0$, that is,
the curve is the totality of points $(\xi_0, \xi_1, \xi_2)$ that satisfy the equation. For example, the equation
of the parabola $y^2 = 2p(x - a)$ can be made homogeneous: $x_2^2/x_0^2 = 2p(x_1/x_0 - a)$ or
$2apx_0^2 - 2px_0x_1 + x_2^2 = 0$; on the left-hand side there is a homogeneous polynomial (form)
$F(x_0, x_1, x_2)$. From two homogeneous algebraic equations $F_1(x_0, x_1, x_2) = 0$ and $F_2(x_0, x_1, x_2) = 0$
the common points of the curves $F_1 = 0$ and $F_2 = 0$ are obtained, namely all points $(\xi_0, \xi_1, \xi_2)$
that satisfy both equations. If $F_1(x_0, x_1, x_2)$ and $F_2(x_0, x_1, x_2)$ have no common factor, then there
are only finitely many such points. Both cases (an algebraic curve; finitely many points) are examples
of algebraic varieties in the projective plane.

In the case n = 3, that is, in projective space $S_3$, a form $F(x_0, x_1, x_2, x_3)$ gives an algebraic
surface; two forms $F_1(x_0, x_1, x_2, x_3)$, $F_2(x_0, x_1, x_2, x_3)$ with no common factor give a planar
or spatial algebraic curve. Three or more forms $F_1, F_2, F_3, \ldots$ with no common factor can still
give a curve. The points $(\xi_0, \xi_1, \xi_2, \xi_3)$ of a curve of $S_3$ satisfy other equations, apart from $F_i = 0$
(i = 1, 2 or i = 1, 2, 3, ...); for example, $G F_i = 0$, where G can be a constant or, more generally,
an arbitrary form, and (if $F_1$ and $F_2$ have the same degree) $F_1 + F_2 = 0$.


Polynomial ideals, sets of zeros, and algebraic varieties. In order to survey all these equations 
it is convenient to use the concept of an ideal in a commutative ring R. 

Definition. A set $\mathfrak{a}$ of elements of a ring R is called an ideal if, given two elements a and b of $\mathfrak{a}$,
a − b is in $\mathfrak{a}$, and furthermore if a is any element of $\mathfrak{a}$, then every product ra is in $\mathfrak{a}$, where r is any
element of the ring. A simple example is the set of all elements ra and ar, where a is a fixed element
of R. An ideal of this kind is called a principal ideal and denoted by (a). The concept of ideal was
first used historically in number theory, in connection with Fermat's problem. In particular, if R
is a ring of polynomials f, g, ..., then its ideals are called polynomial ideals. If one is considering
only forms in R, and therefore in an ideal of R, then one has a homogeneous ideal (H-ideal). A point
$(\xi) = (\xi_0, \xi_1, \ldots, \xi_n)$ of $S_n$ is called a zero of the H-ideal $\mathfrak{a}$ if every form $F(x) = F(x_0, x_1, \ldots, x_n)$
of $\mathfrak{a}$ vanishes for $x_0 = \xi_0$, $x_1 = \xi_1$, ..., $x_n = \xi_n$. The totality of all zeros of $\mathfrak{a}$ is called the set of
zeros (SZ) of the H-ideal (see the examples above).

In what follows, further theorems on ideals in commutative rings R, which are important for
algebraic geometry, are quoted.

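The intersection $\mathfrak{a} \cap \mathfrak{b}$ of two ideals $\mathfrak{a}$ and $\mathfrak{b}$ of R is the totality of those elements a of R that belong
to both $\mathfrak{a}$ and $\mathfrak{b}$. The intersection of two ideals is again an ideal. The sum $(\mathfrak{a}, \mathfrak{b})$ of $\mathfrak{a}$ and $\mathfrak{b}$ is the
totality of all elements of the form a + b, where $a \in \mathfrak{a}$, $b \in \mathfrak{b}$. The sum of two ideals is again an ideal.
The intersection $\mathrm{SZ}(\mathfrak{a}) \cap \mathrm{SZ}(\mathfrak{b})$ of two sets of zeros is the set of those points $(\xi)$ that belong to both
$\mathrm{SZ}(\mathfrak{a})$ and $\mathrm{SZ}(\mathfrak{b})$. The join or sum


$\mathrm{SZ}(\mathfrak{a}) \cup \mathrm{SZ}(\mathfrak{b}) = \mathrm{SZ}(\mathfrak{a}) + \mathrm{SZ}(\mathfrak{b})$


is the set of all points that belong to at least one of the SZ. The following two theorems hold:
(1) $\mathrm{SZ}(\mathfrak{a} \cap \mathfrak{b}) = \mathrm{SZ}(\mathfrak{a}) + \mathrm{SZ}(\mathfrak{b})$, (2) $\mathrm{SZ}(\mathfrak{a}, \mathfrak{b}) = \mathrm{SZ}(\mathfrak{a}) \cap \mathrm{SZ}(\mathfrak{b})$. These definitions and theorems
can be extended to the case of finitely many ideals.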

An ideal $\mathfrak{a}$ is called reducible if it is the intersection of two ideals distinct from $\mathfrak{a}$; otherwise it is
called irreducible. Correspondingly, $\mathrm{SZ}(\mathfrak{a})$ is called reducible or irreducible according as $\mathfrak{a}$ is reducible
or irreducible.


Example: $\mathfrak{a} = (x_1^2 - x_2^2) = (x_1 + x_2) \cap (x_1 - x_2)$; the SZ of $\mathfrak{a}$ in $S_2$ is the pair of lines
$x_1 + x_2 = 0$, $x_1 - x_2 = 0$.


However, a polynomial ideal is not uniquely determined by its SZ. For example, the H-ideals
$\mathfrak{a} = (x_1 + x_2)$ and $\mathfrak{b} = ((x_1 + x_2)^2)$ have the same SZ in $S_2$, namely the points of the line $x_1 + x_2 = 0$.
Hence it is convenient to agree that an algebraic variety $\mathrm{AV}(\mathfrak{a})$ is determined not by $\mathrm{SZ}(\mathfrak{a})$ alone
but by further data. One could characterize $\mathrm{AV}(\mathfrak{a})$ by the H-ideal $\mathfrak{a}$ itself. In this sense, in the example
above $\mathrm{AV}(\mathfrak{a}) \neq \mathrm{AV}(\mathfrak{b})$. Now every polynomial ideal can be expressed as the intersection of finitely
many irreducible ideals (the Lasker-Noether theorem): (3) $\mathfrak{a} = \mathfrak{q}_1 \cap \mathfrak{q}_2 \cap \cdots \cap \mathfrak{q}_s$. By definition,
this gives the partition (4) $\mathrm{AV}(\mathfrak{a}) = \mathrm{AV}(\mathfrak{q}_1) + \mathrm{AV}(\mathfrak{q}_2) + \cdots + \mathrm{AV}(\mathfrak{q}_s)$. It should be specially
noted that the representation (3), and hence the partition (4), are not unique for all ideals $\mathfrak{a}$.

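For further investigation of irreducible ideals the following definitions are necessary. An ideal $\mathfrak{p}$
of the ring R is called a prime ideal if $ab \in \mathfrak{p}$ and $a \notin \mathfrak{p}$ imply that $b \in \mathfrak{p}$. An ideal $\mathfrak{q}$ of R is called
a primary ideal if $ab \in \mathfrak{q}$ and $a \notin \mathfrak{q}$, $b \notin \mathfrak{q}$ imply that $a^{\varrho} \in \mathfrak{q}$ and $b^{\sigma} \in \mathfrak{q}$, where $\varrho$ and $\sigma$ are suitable
natural numbers. For polynomial ideals the following theorems hold: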


(5) Every prime ideal is irreducible. 

(6) Every irreducible ideal is primary, but not every primary ideal is irreducible. 

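(7) For every primary ideal $\mathfrak{q}$ there is exactly one prime ideal $\mathfrak{p}$ such that $\mathrm{SZ}(\mathfrak{q}) = \mathrm{SZ}(\mathfrak{p})$.

(8) The prime ideals $\mathfrak{p}_1, \mathfrak{p}_2, \ldots, \mathfrak{p}_s$ belonging to the irreducible components $\mathfrak{q}_1, \mathfrak{q}_2, \ldots, \mathfrak{q}_s$ in (3)
are uniquely determined by $\mathfrak{a}$; they are called the prime ideals belonging to $\mathfrak{a}$.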


It follows (see (6), (7)) that to each SZ of an irreducible ideal $\mathfrak{q}$ there is exactly one prime ideal
$\mathfrak{p}$ such that $\mathrm{SZ}(\mathfrak{q}) = \mathrm{SZ}(\mathfrak{p})$. Accordingly a prime ideal is uniquely determined by its SZ. Hence,
by definition, $\mathrm{AV}(\mathfrak{p}) = \mathrm{SZ}(\mathfrak{p})$.


Generic zeros. According to VAN DER WAERDEN every $\mathrm{SZ}(\mathfrak{p})$ can be characterized in the following
way. Apart from the so-called special points $(\xi)$, whose coordinates are elements of an algebraic
extension of the coefficient field K, one can also consider points (9) $(\xi(t)) = (\xi_0(t_0, \ldots, t_r), \ldots, \xi_n(t_0, \ldots, t_r))$,
whose coordinates are elements of an algebraic extension of $K(t_0, \ldots, t_r)$, the field
of rational functions in the indeterminates $t_0, \ldots, t_r$ with coefficients in K.

Definition: $(\xi(t))$ is called a generic zero of the ideal $\mathfrak{a}$ or a generic point of $\mathrm{SZ}(\mathfrak{a})$ if $F(x) \in \mathfrak{a}$ is
necessary and sufficient for $F(\xi(t)) = 0$. For example, the circle (10) $x_1^2 + x_2^2 = x_0^2$ (in inhomogeneous
coordinates (11) $x^2 + y^2 = 1$) has the generic point (12) $x_0 = t_0(1 + t_1^2)$, $x_1 = t_0(1 - t_1^2)$,
$x_2 = 2t_0t_1$.


An ideal has exactly one generic zero if it is a prime ideal p. 


From (12) one can obtain the generic point for (11) by putting (13) $x = x_1/x_0 = (1 - t_1^2)/(1 + t_1^2)$,
$y = x_2/x_0 = 2t_1/(1 + t_1^2)$. Similarly, from (9) one can obtain the generic point in inhomogeneous
coordinates by dividing all the coordinates by the first non-zero coordinate. The maximal number
of algebraically independent non-homogeneous coordinates obtained in this way is called the dimension
of $\mathfrak{p}$ or of $\mathrm{SZ}(\mathfrak{p})$; s numbers $u_1, \ldots, u_s$ are called algebraically independent over the field K if
$f(u_1, \ldots, u_s) = 0$, where f is a polynomial with coefficients in K, implies that all the coefficients
vanish. From (9) one obtains all points of $\mathrm{SZ}(\mathfrak{p})$ (with the possible exception of components of
lower dimension) by substituting all possible special values for $t_0, \ldots, t_r$ (specialization of parameters)
in $(\xi(t))$. For example, the generic zero (12) of the circle (10) gives all its points except the point S
for which $x_0 : x_1 : x_2 = 1 : -1 : 0$. Similarly, (13) gives all points of (11) except S(x, y) = (−1, 0).

If in Fig. 32-1 one takes all the lines (14) y = m(x + 1), then each of them cuts the circle (11)
in S and one further point P, whose coordinates (x, y) are uniquely determined by the gradient m.
By substituting for y from (14) into (11), one obtains (1 + m²)x² + 2m²x − (1 − m²) = 0. This
quadratic equation has roots −1 and (1 − m²)/(1 + m²); in conjunction with (14) this gives the two
points S(−1, 0) and P((1 − m²)/(1 + m²), 2m/(1 + m²)). All the points P obtained in this way are
distinct from S, since the equation (1 − m²)/(1 + m²) = −1 is not satisfied by any m. This corresponds
to the fact that the line x = −1 cannot be expressed in the form (14). Conversely, any point P(x, y)
of the circle distinct from S corresponds to the value m = y/(x + 1). If the gradient m is chosen as
the parameter t_1, then the coordinates of P take the form (13). In contrast to x = cos t, y = sin t,
this parametric representation of the circle is rational.
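
The computation with the lines (14) can be checked in exact rational arithmetic. The following is a minimal sketch, not part of the original text; the function names circle_point and gradient are chosen here purely for illustration.

```python
from fractions import Fraction


def circle_point(m):
    """Second intersection P of the line y = m(x + 1) with the circle x^2 + y^2 = 1."""
    m = Fraction(m)
    denom = 1 + m * m
    return (1 - m * m) / denom, 2 * m / denom


def gradient(x, y):
    """Recover m = y / (x + 1) from a point of the circle other than S = (-1, 0)."""
    return Fraction(y) / (Fraction(x) + 1)


for m in (0, 1, Fraction(1, 2), -3):
    x, y = circle_point(m)
    assert x * x + y * y == 1    # P lies on the circle (11)
    assert y == m * (x + 1)      # P lies on the line (14)
    assert x != -1               # P is never the excluded point S
    assert gradient(x, y) == m   # the parameter is recovered from P
```

Because the parametrization is rational, rational values of m give points with rational coordinates, so the checks above are exact and involve no rounding.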


32-1 Rational parametric representation of the unit circle


Multiplicity. Instead of characterizing AV(a) by means of the ideal a, as above, an AV can be
uniquely defined if the prime ideals p_i belonging to a are given (see (8)) and a non-negative
number μ_i is associated with each of them as its multiplicity. This can be written symbolically as


AV(a) = μ_1 SZ(p_1) + ... + μ_s SZ(p_s).


For example, the ideal b = ((x_1 + x_2)²) considered above determines an AV consisting of the line
x_1 + x_2 = 0 with multiplicity 2. The concept of multiplicity is particularly important in the
investigation of the intersection of algebraic varieties. Two definitions of multiplicity have been given, which
coincide for S_2 and S_3, but can lead to different results in spaces of higher dimension. The first
definition is based on the principle of conservation of number, postulated by the earlier geometers,
by which the number of points of intersection in a special case is equal to the number in the general
case. An exact formulation of this principle, and the limitations on its applicability, were given by
VAN DER WAERDEN in 1927 by means of the concept of relation-true specialization. On the other
hand, the second, ideal-theoretic definition of multiplicity is given as the length of an ideal. If this
definition is used, then in contrast to the first, the generalized Bézout theorem no longer holds
without restriction; in other problems, however, the use of the length of an ideal is appropriate.


Recent methods. In order to fix the concept of variety, Oscar ZARISKI (b. 1899) first used in 1938 
the method of valuations, in addition to the SZ and multiplicity; this in turn has connections with 
Krull’s theory of local rings (Wolfgang KRULL, 1899-1971), the theory of functions and set-theoretical 
topology. 

Later, in 1946, André WEIL (b. 1906) gave a new foundation of algebraic geometry by using 
topological methods (sheaf theory, cohomology theory), which are often referred to in recent works. 




33. Further algebraic structures 


Lattices .............................. 678    Representation theory ................. 680
Rings and algebras .................... 679    Conclusion


The characteristic features of an algebraic structure have been discussed in detail in Chapter 16
and, with reference to a vector space, in Chapter 17. Since this concept is of central importance for
contemporary algebra, there now follows a brief account of further developments leading from
groups and fields to lattices, to rings and algebras, and to representation theory.


Lattices 


The concept of a lattice was formed with a view to generalizing and unifying certain relationships
that occur between subsets of a set, but also between substructures of certain structures, such as
groups, fields, topological spaces, and so on. The development of the theory of lattices started 
about 1930 and was influenced by the work of Garrett BIRKHOFF. 


Example 1: If H_1 and H_2 are arbitrary subgroups of a group G, then H_1 ∩ H_2 and
H_1 ∪ H_2 = ⟨H_1, H_2⟩ (the subgroup generated by H_1 and H_2) are both subgroups of G. With these
operations the set of all subgroups of G becomes a lattice.

Example 2: The non-negative integers form a lattice with intersection defined as the greatest 
common divisor, and union defined as the least common multiple. 

Example 3: Certain classes of logical propositions form a lattice under the operations and 
(intersection) and or (union). 

Example 4: The subsets of a set form a lattice under the ordinary intersection and union. 

Example 5: The intermediate fields between two given fields form a lattice, with the ordinary
intersection, and the union of two fields defined as the smallest field containing both of them
(see Chapter 16.).


A bijective mapping φ from a lattice L_1 onto a lattice L_2 is called an isomorphism if for arbitrary
elements a and b of L_1:


φ(a ∩ b) = φ(a) ∩ φ(b) and φ(a ∪ b) = φ(a) ∪ φ(b).

If the operations ∩ and ∪ are interchanged in a lattice L_2, one obtains a new lattice D(L_2), the
dual lattice of L_2. A bijective mapping φ from a lattice L_1 to a lattice L_2 is called a dual isomorphism
if it is an isomorphism of L_1 onto D(L_2).


The concepts of lattice theory allow the following reformulation of the fundamental theorem 
of Galois theory (see Chapter 16.). 


Fundamental theorem of Galois theory: The mapping associating with each intermediate field 
the corresponding Galois group is a dual isomorphism from the lattice of intermediate fields to the 
lattice of subgroups of the Galois group. 


Partially ordered sets. A set S with elements a, b, c, ... on which a relation a ⊆ b is defined is
called a partially ordered set (with respect to that relation) if the relation is reflexive, transitive, and
anti-symmetric (see Chapter 14.); anti-symmetric means that a ⊆ b and b ⊆ a together imply that
a = b.

It should be emphasized that the relation a ⊆ b or b ⊆ a need not hold for all pairs of elements
of S.


Example 6: The set of natural numbers 1, 2, 3, ... is partially ordered by the relation ‘a divides b’.

Example 7: The set of all subsets of a given set is partially ordered by the set-theoretic relation
‘A is a subset of B’.

Example 8: The set of all continuous real-valued functions on the interval [0, 1] is partially
ordered by the relation f ⊆ g, meaning that f(x) ≤ g(x) for all x in [0, 1].
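
For a finite set the three defining properties can be tested by brute force. The following sketch is an illustration only: the helper is_partial_order is a name introduced here, and the infinite sets of Examples 6 and 7 are replaced by small finite samples.

```python
from itertools import product


def is_partial_order(elements, rel):
    """Check reflexivity, anti-symmetry and transitivity of rel on a finite set."""
    elements = list(elements)
    reflexive = all(rel(a, a) for a in elements)
    antisymmetric = all(
        a == b for a, b in product(elements, repeat=2) if rel(a, b) and rel(b, a)
    )
    transitive = all(
        rel(a, c)
        for a, b, c in product(elements, repeat=3)
        if rel(a, b) and rel(b, c)
    )
    return reflexive and antisymmetric and transitive


def divides(a, b):
    """The relation 'a divides b' of Example 6 (a must be non-zero)."""
    return b % a == 0


numbers = range(1, 13)                                          # finite sample for Example 6
subsets = [frozenset(s) for s in ([], ["a"], ["b"], ["a", "b"])]  # Example 7 for a 2-element set

assert is_partial_order(numbers, divides)
assert is_partial_order(subsets, frozenset.issubset)
assert not is_partial_order(numbers, lambda a, b: a < b)        # '<' is not reflexive
assert not (divides(4, 6) or divides(6, 4))                     # 4 and 6 are not comparable
```

The last line illustrates the remark that two elements of a partially ordered set need not be comparable.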


It is usual to interpret the relation ⊆ as ‘being contained in’ and to represent partially ordered
sets by diagrams. Such a diagram is obtained by associating with each element a a small circle
K_a in the plane, and connecting the circles K_a and K_b by a line if a and b are comparable. If a ⊆ b,
then the circle K_a lies below the circle K_b.




Example 9: S consists of the subsets of the set M = {a, b, c}: M_0 = M, M_1 = {a, b},
M_2 = {a, c}, M_3 = {b, c}, M_4 = {a}, M_5 = {b}, M_6 = {c}, M_7 = ∅. It is represented by the
adjacent diagram (Fig.).

Example 10: The possible partially ordered sets of one, two, or three elements are given in the 
adjacent diagrams (Fig.). 


33-1 Diagram of the partially ordered set S (see Example 9)

33-2 All possible diagrams of partially ordered sets of a) one, b) two, and c) three elements


Any lattice L defines a partially ordered set S(L) with the same elements as L. One defines a ⊆ b
if a ∩ b = a. For certain partially ordered sets this statement can be inverted.
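
As an illustration, the lattice of Example 2 can be used to see how the lattice operations induce an order. The sketch below restricts itself to the positive integers 1, ..., 30; the names meet, join and leq are introduced here, and the two absorption laws it checks are standard lattice identities that are not stated in the text above.

```python
from math import gcd


def meet(a, b):
    """Intersection in the lattice of Example 2: greatest common divisor."""
    return gcd(a, b)


def join(a, b):
    """Union in the lattice of Example 2: least common multiple (positive a, b)."""
    return a * b // gcd(a, b)


def leq(a, b):
    """Order induced by the lattice: a is 'contained in' b if meet(a, b) = a."""
    return meet(a, b) == a


values = range(1, 31)
for a in values:
    for b in values:
        assert meet(a, join(a, b)) == a    # absorption law (assumed standard identity)
        assert join(a, meet(a, b)) == a    # absorption law (assumed standard identity)
        assert leq(a, b) == (b % a == 0)   # induced order is 'a divides b'
```

On this sample the induced order coincides with the divisibility relation of Example 6.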


Applications. The range of applications of lattice theory is extraordinarily wide on account of 
the generality of the theory. The most important examples are mathematical logic, foundations of 
mathematics, algebra, topology, and the theory of integration. Only by means of lattice-theoretical 
concepts was it possible to develop a sufficiently general concept of the integral for the needs of 
modern mathematics. 


Rings and algebras 


Rings. A set R is called a ring if two operations, addition and multiplication, are defined on it,
subject to the following conditions. Under addition R must be an Abelian group (the additive group
of R), and addition and multiplication must be linked by the two distributive laws:
a(b + c) = ab + ac and (a + b)c = ac + bc.

If multiplication is associative, the ring is called an associative ring and one speaks of its multi- 
plicative semigroup. If the multiplication is also commutative, the ring is called a commutative ring. 
Integral domains and fields are special kinds of commutative rings. 


Further examples of rings are 1. the non-associative ring of vectors in three-dimensional
Euclidean space, with vector addition and vector multiplication as the operations; 2. the associative,
but non-commutative ring of (n × n)-matrices under matrix addition and multiplication.
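
Both examples can be confirmed on small instances. The following sketch assumes that ‘vector multiplication’ means the vector (cross) product; the helpers cross and matmul and the particular vectors and matrices are chosen here for illustration.

```python
def cross(u, v):
    """Vector product of two vectors in three-dimensional space."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


e1, e2 = (1, 0, 0), (0, 1, 0)
left = cross(cross(e1, e1), e2)    # (e1 x e1) x e2 = 0 x e2 = 0
right = cross(e1, cross(e1, e2))   # e1 x (e1 x e2) = e1 x e3 = -e2
assert left == (0, 0, 0) and right == (0, -1, 0)   # the product is not associative

A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
assert matmul(A, B) == [[1, 0], [0, 0]]
assert matmul(B, A) == [[0, 0], [0, 1]]            # AB differs from BA
```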

The investigation of rings, a particularly important part of current research in algebra, was
decisive for the development of abstract algebra in our century. The analysis of algebraic structure,
which today is standard practice in algebraic research, was suggested by Emmy NOETHER (1882-1935)
and first put into practice by her and her pupils in important examples. Their investigations gave
algebra a completely new impulse and led to new areas of application.


Algebras. The concept of an algebra was developed in connection with rings that are also vector 
spaces. 

An algebra is an associative ring A whose additive group A⁺ is a vector space over a field K and
in which multiplication by scalars of the field commutes with ring multiplication: (αu)v = u(αv),
α ∈ K and u, v ∈ A.

The dimension of an algebra A as a vector space is called its rank. 


Examples: 1. The continuous and the differentiable real or complex valued functions on an inter- 
val form an algebra. 
2. The real or complex (n × n)-matrices form an algebra of rank n².


Structure constants. Since every element of an algebra of finite rank n can be represented as a
linear combination of basis elements u_1, ..., u_n, one can write the product of two elements
u = Σ_i α_i u_i and v = Σ_j β_j u_j as uv = Σ_{i,j} α_i β_j (u_i u_j). From this it follows that all
products uv can be computed if the products u_i u_j of the basis elements are known. These products
must themselves be linear combinations of the basis elements: u_i u_j = Σ_k γ_{ij}^k u_k.


The n³ constants γ_{ij}^k are called the structure constants of the algebra. They determine the
multiplication in the algebra completely by means of the equations for uv and u_i u_j.




Example: The algebra H of rank 4 with the basis elements 1, i, j, k, and the products 1² = 1,
1i = i, 1j = j, 1k = k, i² = j² = k² = −1, ij = k, jk = i, ki = j and ji = −k, kj = −i,
ik = −j is called Hamilton's quaternion algebra. It contains the complex numbers a + bi as a
subfield.
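
The way structure constants determine the multiplication can be made concrete for the quaternions. The sketch below is an illustration under stated assumptions: the basis is ordered as 1, i, j, k with indices 0 to 3, the array GAMMA[i][j][k] holds the constant for the product of basis elements i and j in the direction of basis element k, and 1 is assumed to act as a unit on both sides (the example above lists only the products with 1 on the left).

```python
def structure_constants():
    """Structure constants of the quaternions for the basis (1, i, j, k)."""
    gamma = [[[0] * 4 for _ in range(4)] for _ in range(4)]
    for a in range(4):
        gamma[0][a][a] = 1          # 1 * u_a = u_a
        gamma[a][0][a] = 1          # u_a * 1 = u_a (assumption: 1 is a two-sided unit)
    for a in (1, 2, 3):
        gamma[a][a][0] = -1         # i^2 = j^2 = k^2 = -1
    for a, b, c in ((1, 2, 3), (2, 3, 1), (3, 1, 2)):
        gamma[a][b][c] = 1          # ij = k, jk = i, ki = j
        gamma[b][a][c] = -1         # ji = -k, kj = -i, ik = -j
    return gamma


GAMMA = structure_constants()


def multiply(u, v):
    """Product of two quaternions given by their coordinates in the basis (1, i, j, k)."""
    return tuple(
        sum(u[i] * v[j] * GAMMA[i][j][k] for i in range(4) for j in range(4))
        for k in range(4)
    )


i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
assert multiply(i, j) == k
assert multiply(j, i) == (0, 0, 0, -1)
# Elements a + bi multiply like complex numbers: (1 + 2i)(3 + 4i) = -5 + 10i.
assert multiply((1, 2, 0, 0), (3, 4, 0, 0)) == (-5, 10, 0, 0)
```

Here n = 4, so the table GAMMA holds 4³ = 64 constants, most of which are zero.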


Applications. Apart from the applications mentioned under field theory, in which ring theory 
plays a decisive part, there have been important applications recently in functional analysis. By 
introducing a generalization of the absolute value (see Chapter 40.) one arrives at the concept of 
a normed or Hilbert algebra. The theory of normed algebras is an important tool in analysis and 
in topological algebra. 


Representation theory 


The theory of representations is closely connected with the theory of algebras. It deals with the 
problem of mapping a group, ring, or algebra homomorphically, that is, in a certain sense without
destroying its structure, into a group or ring of matrices or linear transformations of a vector space. 

This vector space is called the representation space. The determination of the representations of 
a group or algebra is of importance not only in analyzing its structure, but also for many applications 
in physics and chemistry, for instance, in quantum mechanics. Furthermore, representation theory 
can be regarded as an ordering principle in geometry and generalizes the theory of invariants, which 
flourished at the beginning of our century. 


Representation of a group. Consider as representation space a complex vector space V. A represen- 
tation of a group G over V is a homomorphism of G into the general linear group GL(V) of non- 
singular linear transformations of V. If the dimension of V is n, one speaks of an n-dimensional 
representation. In that case every linear transformation can be represented after choosing a basis
in V by an (n × n)-matrix, and one obtains a homomorphism of G into GL(n). Such a
homomorphism is called a matrix representation of the group G.
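
A small instance may make the definition concrete. The sketch below, which is not taken from the text, represents the group of permutations of three elements by permutation matrices and checks the homomorphism property; the convention that the matrix of a permutation s has entry 1 in row s(j) of column j, and all function names, are assumptions made for this illustration.

```python
from itertools import permutations


def compose(s, t):
    """Composition of permutations: first apply t, then s."""
    return tuple(s[t[j]] for j in range(len(t)))


def perm_matrix(s):
    """Matrix of the permutation s: column j has its entry 1 in row s[j]."""
    n = len(s)
    return [[1 if s[j] == i else 0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


group = list(permutations(range(3)))
for s in group:
    for t in group:
        # Homomorphism property: the matrix of a product is the product of the matrices.
        assert perm_matrix(compose(s, t)) == matmul(perm_matrix(s), perm_matrix(t))
```

Since every permutation has an inverse, each of these matrices is non-singular, so the map is a 3-dimensional matrix representation in the sense described above.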

The concrete description of representations can be particularly difficult. For certain important 
groups methods of finding all possible representations have therefore been specially developed; 
for example, for the group of all permutations of a fixed number of elements. For infinite groups, 
such as topological groups, the problems become particularly difficult. However, many questions 
have been solved for particular Lie groups, such as the rotation group and the Lorentz group. 


Applications. Apart from the applications already mentioned, representation theory is frequently 
used in analysis. Thus, the representations of the rotation group, that is, the group of rotations 
of the sphere in 3-dimensional space, lead to a deeper theory of spherical functions, while representa- 
tions of other groups are used to find important properties of the Bessel functions and others. 

Naturally, the representations of the Lorentz group are important in physics. 


Conclusion 


Summing up one can say that today algebra is the theory of algebraic structures. An algebraic 
structure is a set on which certain operations (addition, multiplication, intersection etc.) are defined, 
where the exact nature of the objects of the set is irrelevant to the investigation. This concept of 
structure, which was developed at the beginning of this century, has been of the greatest importance 
for algebra, and has moulded algebraic thinking for the past 50 years. It has been modified and 
extended to other areas of mathematics, such as topological, or differentiable structures. 

In recent years a series of new concepts have been developed in connection with other mathematical 
disciplines; however, their investigation is still in a state of flux, and their importance in many 
cases is not yet sufficiently clarified. 


34. Topology 
The topology of point sets ............ 680    Topological structures ................ 685
n-dimensional spaces .................. 684


The topology of point sets 


In several theorems of analysis and geometry the connectivity of a figure may play an essential 
part; for instance, the simple theorem that a function is one-to-one if its derivative is everywhere 
non-zero is only valid if the domain of definition of the function is connected. It is easy to find a
function that is defined on the open intervals (0, 1) and (2, 3), is not one-to-one, but has the derivative
1 at every point of the intervals (Fig.). The situation is more complicated when connectivity in the 
plane is discussed. The theorem that a vector field with zero curl has a potential holds, in general, 
only if the domain where the field is defined contains no holes; such a domain is called simply- 
connected. For example, one can choose the lengths of the vectors of a vector field in such a way 
that the curl becomes zero (Fig.), although it has no potential; this can be seen by integrating along 
the curve marked in the illustration. Investigations of the connectivity properties of figures suggested 
by these and similar examples form a small but characteristic part of topology. 
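
Both phenomena can be reproduced with concrete choices. The sketch below is only an illustration: the function f and the vector field (−y, x)/(x² + y²) on the plane without the origin are standard examples selected here, and need not coincide with the ones drawn in the figures.

```python
import math


def f(x):
    """Defined on (0, 1) and (2, 3); the derivative is 1 on both intervals."""
    if 0 < x < 1:
        return x
    if 2 < x < 3:
        return x - 2
    raise ValueError("x is outside the domain")


def field(x, y):
    """Vector field (-y, x)/(x^2 + y^2); its curl vanishes away from the origin."""
    r2 = x * x + y * y
    return -y / r2, x / r2


def circulation(radius=1.0, steps=10000):
    """Approximate the line integral of the field along a circle around the origin."""
    dt = 2 * math.pi / steps
    total = 0.0
    for n in range(steps):
        t = n * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        vx, vy = field(x, y)
        # Tangent vector of the circle at parameter t is (-r sin t, r cos t).
        total += (vx * -radius * math.sin(t) + vy * radius * math.cos(t)) * dt
    return total


assert f(0.5) == f(2.5)                            # f is not one-to-one
assert abs(circulation() - 2 * math.pi) < 1e-9     # non-zero integral around the hole
```

A field with a potential would have integral zero along every closed curve, so the non-zero value shows that this field has none on the punctured plane.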


34-1 The represented function is not one-to-one

34-2 Vector field with zero curl and no potential


The concept of a figure. To begin with, the figures to be investigated lie on a straight line, in a 
plane, or in three-dimensional Euclidean space E³. It is to be expected that the situation becomes
more complicated as the dimension increases. On the line all that matters is how many parts the 
figure has, but in the plane one must also consider how many holes each of the parts encloses; and 
in three dimensions there are, in fact, two different kinds of holes, cavities like the holes in a Swiss 
cheese and channels like the holes in a sieve. 

A figure is generally defined to be a set of points in the space under consideration. Therefore 
figures are also called point sets. The definition leads to very complicated examples, difficult to 
visualize, such as the set of all points in the plane for which one coordinate in a Cartesian coordinate 
system is rational, the other irrational. Although the following remarks are also valid for such com- 
plicated point sets, it is sufficient to focus attention on ‘sensible’ figures, for example, intervals on 
the line or surfaces in the plane or surfaces and solids in three dimensions. 


Homeomorphic point sets. Before statements can be made about the connectivity of figures, there
must be a precise definition of what it means for two figures to have the same connectivity; such
figures are called homeomorphic. Intuitively two figures X and Y are homeomorphic if X can be
bent and stretched and deformed into Y, without any tearing or pasting together of any parts of X
(Fig.). If X is transformed into Y in this manner, then with each point p of X there is associated a
unique point f(p) in Y and vice versa, that is, the map f that associates with each point p of X its
transform f(p) is a bijection of X onto Y. The condition that there should be no tears implies for f
that if two points p and q of X are sufficiently close together, then their images f(p) and f(q) are
also close together. This condition can be made precise by using the distance d(p, q) between the
points and then becomes analogous to the condition for continuity of functions of a real variable.


34-3 Tearing and pasting of surfaces


Continuity expresses the fact that the deformation f introduces no tears, but the idea that no
points are pasted together and that no cavities are filled in remains to be made precise. This can be
done by considering the inverse map f⁻¹, which associates with each point p′ of Y the unique
point p of X (which exists because f is a bijection) such that f(p) = p′. To say that f does not paste
anything together is the same as saying that f⁻¹ introduces no tears, or that f⁻¹ is continuous. Thus,
it is now possible to give a precise definition of homeomorphism.


34-4 Projection of a circle onto a diameter

34-5 Projection of a square onto a circle

34-6 Mapping of a circle onto a half-open interval


Examples: 1. The perpendicular projection of a circle onto a diameter is a continuous map, 
but not a homeomorphism, since it is not one-to-one (Fig.). 

2. The central projection of the boundary of a square onto a circle is a homeomorphism (Fig.). 

3. The map from a circle C to the half-open interval [0, 2π) that takes each point p of C to its
associated angle φ(p) is not continuous at the point p_0 with angle 0; for the images of points
near p_0 lie far apart, at opposite ends of the interval (Fig.).
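
The jump in Example 3 can be observed numerically. In the sketch below the circle is taken to be the unit circle about the origin and the angle is computed with atan2; both choices, and the function name angle, are assumptions made for the illustration.

```python
import math


def angle(x, y):
    """Angle of the point (x, y) of the unit circle, taken in [0, 2*pi)."""
    return math.atan2(y, x) % (2 * math.pi)


eps = 1e-6
above = (math.cos(eps), math.sin(eps))      # just past the point with angle 0
below = (math.cos(-eps), math.sin(-eps))    # just before the point with angle 0

gap_on_circle = math.dist(above, below)
gap_of_images = abs(angle(*above) - angle(*below))

assert gap_on_circle < 1e-5    # the two points are very close on the circle
assert gap_of_images > 6       # their angles lie at opposite ends of [0, 2*pi)
```

However small eps is chosen, the two images stay almost 2π apart, which is exactly the failure of continuity at the point with angle 0.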


The definition of a homeomorphism does not exactly correspond to the intuitive idea of deforming
one figure into another without tearing or pasting. It must also be permitted to cut the figure, on
condition that after the deformation has taken place the cut is pasted together again point for point,
exactly as it was initially. Of the four strips B_1, B_2, B_3, B_4 in the figure, B_3 and B_4 are
homeomorphic to B_1, but the Möbius strip B_2 is not. B_2 is obtained from B_1 by cutting, twisting
once, and pasting together again, B_3 is twisted twice, and in B_4 the strip is stretched and then
knotted. B_1 and B_2 are not homeomorphic, since B_1 has two boundary curves, whereas B_2 has
only one. B_1 and B_3 are homeomorphic, since the black boundary curve of B_1 can be mapped
to the black boundary curve of B_3 and the red boundary of B_1 to the red one of B_3 in such a way
that opposite points p and p′ are mapped to opposite points f(p) and f(p′). If then each straight-line
segment between opposite points p and p′ is mapped to the corresponding segment between f(p) and
f(p′), the result is a homeomorphism from B_1 onto B_3. Nevertheless B_1 cannot be deformed into
B_3; it must be cut along a segment p, p′, twisted, and then pasted together again. If B_1 is
transformed into B_2 in a similar manner, then points are pasted together that previously were not
close to each other.

34-7 Non-twisted, twisted and knotted strips

Homeomorphic maps actually occur in daily life, for example, in schematic maps for underground 
railway or tramway networks, showing points of transfer. 


Topological properties. Properties of sets that depend only on their connectivity are called
topological. Any such property of a set is shared by all its homeomorphic images. Thus, topological
theorems are statements about topological properties of point sets. An example is the Jordan curve
theorem.


Jordan curve theorem. Every simple closed plane curve divides the plane into two parts. 


This is a statement about the properties of being a closed curve without self-intersections or simple
closed curve and of dividing the plane into two parts. Both these properties are topological; for on
the one hand, every homeomorphic image of a simple closed curve must again be a simple closed
curve (indeed one can define a simple closed curve to be a homeomorphic image of the circle); on
the other hand, division into two parts means that the complement to the curve consists of two
disconnected parts, which is also a topological property. The theorem itself, which seems almost
obvious, is not at all easy to prove. One begins to appreciate this if one realizes that a curve C in a
plane can be extremely complicated, so that at first sight it is very difficult to decide whether a
given point P is inside it or not (Fig.).
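
For the special case of a closed polygon the question whether a point lies inside can be settled by counting crossings: a ray from the point meets the curve an odd number of times exactly when the point is inside. The following sketch of this even-odd rule is an addition for illustration; it assumes the curve is a simple polygon given by its vertices and that the point does not lie on the polygon itself.

```python
def is_inside(point, polygon):
    """Even-odd rule: count crossings of the horizontal ray from point to the right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # The edge crosses the horizontal line through the point ...
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # ... and does so to the right of the point.
            if x_cross > x:
                inside = not inside
    return inside


# A U-shaped polygon (hypothetical test curve): two arms joined by a base.
u_shape = [(0, 0), (3, 0), (3, 3), (2, 3), (2, 1), (1, 1), (1, 3), (0, 3)]

assert is_inside((0.5, 2), u_shape)        # inside the left arm
assert not is_inside((1.5, 2), u_shape)    # in the notch between the arms
assert is_inside((1.5, 0.5), u_shape)      # inside the base
assert not is_inside((4, 1), u_shape)      # far outside
```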

Apart from theorems that are statements about topological properties, other theorems are also
regarded as belonging to topology if they are concerned mainly with topological concepts, such as
continuous maps in Brouwer’s fixed point theorem.

34-8 Simple closed curve C with exterior point P


The Brouwer fixed point theorem. If K is a closed circular disc including the perimeter, then
every continuous map of K into itself has a fixed point, that is, a point that is mapped to itself.


Thus, if the disc is distorted in such a way that the resulting figure lies entirely inside it, then at 
least one point occupies its original position. 

One of the main tasks of topology is to decide whether two given figures X and Y are homeomorphic. 
If this is true, it can frequently be proved by exhibiting a specific homeomorphism from X to Y 
by trial and error. If they are not homeomorphic, the proof is usually more complicated. The basic 
method for such proofs is to find a topological property that is satisfied by one of the figures, but 
not by the other. If this is the case, then X and Y cannot be homeomorphic, for if a point set has a 
topological property, then so does every point set homeomorphic to it. This method requires famil- 
iarity with a large number of topological properties; here is a list of some of the simplest and most 
important. 

A point set Z is called connected (more accurately: path-connected) if any two points p and q 
in Z can be joined by a path in Z, that is, if there is a continuous map of an interval into Z that 
maps the end-points of the interval to p and q, respectively. Thus, figures are connected if they do not 
consist of several disjoint parts. Connected figures in the plane and in space can still have holes. 
Certain types of these are excluded by requiring the set to be simply-connected. A set Z is simply- 
connected if any closed curve in Z can be contracted inside Z to a point. It is intuitively clear that 
simply-connected plane figures can have no holes, since a curve going around such a hole cannot 
be contracted to a point inside Z. On the other hand, for three-dimensional figures the condition 
excludes channels, but not cavities; for instance, a hollow ball is simply-connected, even though 
it is hollow and thus has a cavity. A sieve, however, is not simply-connected. 

This type of topological property, relating to holes in figures and their connectivity, was the 
starting point of algebraic topology, in which the connectivity of figures of arbitrarily high dimension 
is described by certain algebraic structures associated with the figures, such as the homology and 
homotopy groups. 

With any point set X one can associate a natural number dim X, the so-called dimension of the 
point set. For familiar figures, such as a curve C, a surface S, or a body B, for which there is an 
intuitive notion of what the dimension should be, dim X has the expected value, namely dim C = 1, 
dim S = 2, and dim B = 3. In dimension theory a precise definition of dimension makes it possible 
to prove that homeomorphic point sets X and Y have the same dimension, dim X = dim Y. Thus,
the dimension of a point set is a topological property, and, for example, a curve can never be 
homeomorphic to a surface. 


Neighbourhood of a point. Apart from connectivity and dimension, topology includes the study 
of further properties of point sets, some of which are also important in other branches of mathematics 
such as the differential calculus. In explaining the concept of the derivative at a point x of a function 
defined on a closed interval J it matters very much whether x is an end-point or not. In the former 
case one can only talk of a left or a right derivative. The situation is similar for a function of two 
variables f(x, y) defined on a plane domain D. In defining the partial derivatives or the total dif- 
ferential of f at a point here also it is necessary to distinguish whether the point lies in the interior 
or on the boundary of D. Indeed, one frequently has to distinguish between interior points and bound- 
ary points of a figure. The first step in making these concepts precise is to introduce the idea of a 
neighbourhood of a point. One defines the ε-neighbourhood U_ε(p) of a point p as the set of all points
whose distance from p is less than ε, where ε is an arbitrary positive number. In this context it is
important to distinguish whether the point p is regarded as an element of a line, a plane, or
three-dimensional space. In the first case the ε-neighbourhood is an open interval (that is, without the
end-points) of length 2ε, in the second it is an open disc of radius ε, and in the third a solid ball
of the same radius. Note that the circumference of the disc and the surface of the ball are not counted
as belonging to them.

In any figure F one distinguishes between interior points p, which have an ε-neighbourhood entirely
contained in F, and boundary points q, for which every ε-neighbourhood contains some points not
in F (Fig.). One sees that the dimension of the space in which F lies plays a decisive role in these
definitions. For instance, in the plane the centre M of a disc K is an interior point, but in three
dimensions the disc K consists only of boundary points, since no ball of positive radius lies inside K.
Open sets are sets without boundary points; they consist only of interior points, for example, a
ball in space without its boundary sphere. In the theory of functions of a complex variable open
and connected sets in the plane play a special part; they are called domains.

34-9 Boundary point q and interior point p

A point p is said to be adherent to a set X if every ε-neighbourhood of p contains at least one
point of X. Intuitively, the set of points adherent to X consists of those points which, if not actually
contained in X, are at least ‘infinitely close’ to X; the end-points of an open interval, for example,
do not belong to the interval, but they are ‘infinitely close’ to it. The set of points adherent to an
open disc K of radius r about a point z consists, apart from the points p of the disc themselves, of
the points q on the perimeter of the disc, for which d(q, z) = r. If the only points adherent to a set F
are the points of F itself, then F is called closed. The open disc K above can be made into a closed
set by adjoining all the boundary points q with d(q, z) = r. The essential relationship between open
and closed sets is the fact that a set G on a line, in a plane, or in three-dimensional space is open if
and only if its complement, that is, the set of all points of the space in question that are not in G, is
closed.

For the purpose of later generalizations one defines relative ε-neighbourhoods U_ε(p) of a point p
and relative open sets with respect to a subset X. The relative ε-neighbourhood of p in X consists
of the points q in X with d(q, p) < ε, that is, it is X ∩ U_ε(p). Similarly, a subset G of X is called
open in X if every point p in G has a relative ε-neighbourhood in X entirely contained in G. The
empty set, which is a subset of every set and hence also of X, is also regarded as open in X.


The following statements hold for the system of subsets of X that are open in X: 1. X itself and
the empty set are both members of the system, that is, are open in X. 2. The union of arbitrarily
many sets that are open in X is also open in X. 3. The intersection of finitely many sets that are open
in X is open in X.
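
The three statements can be checked mechanically when the system of sets is finite. The sketch below uses a toy stand-in, a three-element set with a hand-picked system of subsets, rather than open sets of a Euclidean space; the function name and the sample systems are hypothetical. For a finite system it is enough to test unions and intersections of pairs, since all further cases follow by repetition.

```python
from itertools import combinations


def satisfies_open_set_properties(X, system):
    """Check statements 1-3 for a finite system of subsets of a finite set X."""
    X = frozenset(X)
    system = {frozenset(s) for s in system}
    # 1. X itself and the empty set belong to the system.
    if X not in system or frozenset() not in system:
        return False
    # 2. and 3. Pairwise unions and intersections stay in the system.
    return all(
        (a | b) in system and (a & b) in system for a, b in combinations(system, 2)
    )


X = {1, 2, 3}
nested = [set(), {1}, {1, 2}, {1, 2, 3}]
broken = [set(), {1}, {2}, {1, 2, 3}]     # the union {1, 2} is missing

assert satisfies_open_set_properties(X, nested)
assert not satisfies_open_set_properties(X, broken)
```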


With these concepts of relative e-neighbourhoods and open sets a new definition of continuity 
for maps f from one set X to another Y can be given. 


A map f from a point set X to another point set Y is continuous if and only if for every point
p in X and every (relative) ε-neighbourhood V of f(p) in Y one can find a δ-neighbourhood U of p
in X that is mapped by f to a subset of V (Fig.).

A map f from X to Y is continuous if and only if the inverse image of every open set in Y is
open in X.


34-10 Continuous map 


The inverse image A of a subset B of Y under a map f is the set of all points in X that are mapped
by f to points of B.


n-dimensional spaces


The exposition so far has been restricted to figures on a line, in a plane or in three dimensions,
that is, to the Euclidean spaces E¹, E² and E³. The starting point for their generalization to
n-dimensional Euclidean space Eⁿ is the observation that every point p in E³ can be described by a triple
of coordinates (x_1, x_2, x_3). The map so defined from E³ to the set of all triples of real numbers
is one-to-one and onto, and as long as a fixed Cartesian coordinate system is retained, the points
of E³ can be identified with their coordinate triples. The space Eⁿ is defined similarly.


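The distance between two points p = (x_1, ..., x_n) and q = (y_1, ..., y_n) of Eⁿ is defined by

d(p, q) = √((x_1 − y_1)² + ... + (x_n − y_n)²).

It has the essential properties, in particular the triangle inequality d(p, r) ≤ d(p, q) + d(q, r),
of the distance in E³, where the triangle inequality states that in a triangle with the vertices p, q
and r each side is at most as long as the other two together.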
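
The distance in Eⁿ and its triangle inequality can be tried out numerically. The sketch below assumes the usual Euclidean distance, the square root of the sum of the squared coordinate differences; the function name distance and the random test points are introduced here for illustration, and a small tolerance allows for rounding.

```python
import math
import random


def distance(p, q):
    """Euclidean distance of two points given as coordinate tuples of equal length."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))


assert distance((0, 0, 0, 0), (1, 1, 1, 1)) == 2.0   # a point pair in four dimensions

random.seed(0)  # fixed seed so that the check is reproducible
for _ in range(1000):
    p, q, r = ([random.uniform(-1, 1) for _ in range(5)] for _ in range(3))
    # Triangle inequality: one side is at most as long as the other two together.
    assert distance(p, r) <= distance(p, q) + distance(q, r) + 1e-12
```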


Using this distance d(p, q) it is now possible to define topological properties for subsets X and Y
of Eⁿ similarly to the way this has been done for Eⁿ with n = 1, 2, 3. A mapping f from a set X
to a set Y is continuous if for every point p in X and every positive number ε there exists a positive
number δ such that d(p, q) < δ implies that d(f(p), f(q)) < ε. Such a mapping is a homeomorphism
if it is bijective (that is, one-to-one and onto) and both f and f⁻¹ are continuous. Two sets X and Y
are called homeomorphic if there exists a homeomorphism from X to Y. Finally, the ε-neighbourhood
of a point in X can be defined in exactly the same way as above, so that all the basic concepts have
been extended to cover the more general sets now under consideration.

Higher-dimensional spaces are of use in the examination of objects or conditions that cannot 
be described by at most three coordinates, but can be described by finitely many. For instance, in 
physics an event is fixed not just by its three coordinates in space, but also by a time coordinate. 
Thus, every such event corresponds to a point in E^4. If it is desired to describe not just one event
but several, such as in a continuing process, then one obtains a subset of E^4. A similar situation
is encountered in physical systems with several degrees of freedom. But of course, this way of looking 
at things is only worthwhile if one succeeds in stating and proving interesting and practically useful 
geometrical or topological theorems in higher-dimensional spaces. Two examples of such theorems 
follow. 


The Jordan-Brouwer theorem. In generalizing to E^3 the Jordan curve theorem, which was stated
for E^2, the closed curve has to be replaced by a 2-dimensional topological sphere. This is any subset
of E^3 that is homeomorphic to the surface of a sphere, or intuitively speaking, a deformed sphere.
Every two-dimensional topological sphere divides E^3 into two parts.

In E^n an n-dimensional ball is defined by analogy to that in three dimensions as the set of points p
whose distance from a given fixed point z is at most a fixed number r, d(p, z) ≤ r. The surface of
the ball is the set of those points p for which d(p, z) = r. An (n - 1)-dimensional sphere is a subset
of E^n that is homeomorphic to this surface; again this can be described intuitively by saying that
it is the deformed surface of an n-dimensional ball. The Jordan-Brouwer theorem and the Brouwer
fixed point theorem can now be stated in full generality.


The Jordan-Brouwer theorem. Every (n - 1)-dimensional topological sphere divides E^n into
two parts.

The Brouwer fixed point theorem. If f is a continuous map of an n-dimensional ball into itself,
then f has a fixed point.


Topological structures 


Incomparably more far-reaching generalizations of the intuitive topological concepts can be made 
by introducing into topology the idea of a structure on a set, which was first developed in algebra. 
There structures such as rings and fields are obtained from number systems by a process of ab- 
straction in which properties of numbers that are inessential for algebra are ignored and only those 
are retained that are needed as foundation for algebraic investigations. If a similar course is to be 
followed in topology, then it is first necessary to determine what properties of point sets form the 
basis of topological arguments, and then to define a general structure for which the only requirement 
is that these prerequisites for topological investigations are realized. One of the essential properties 
of point sets leading to topological concepts is the possibility of defining continuity for maps between 
the point sets, because many other topological concepts, such as homeomorphisms and connectivity,
are defined in terms of this fundamental idea of continuity. 




A topological structure will therefore be defined as a set T with certain properties that make it 
possible to declare for a map f from T to another such structure T’ whether f is continuous or not. 
It is plausible to regard the metric spaces of functional analysis as a suitable structure, because 
continuity can be defined as soon as one has a concept of distance (see n-dimensional spaces), and 
metric spaces are by definition sets in which a distance d(p, q) is defined for all pairs of elements p, q. 
Although this is a possible course, for some topological investigations metric spaces have been 
found to be too restricted. The final definition is arrived at by observing that continuity of a map f 
from a point set X to a point set Y can be defined as soon as the open sets of X and Y (see Topological 
properties) are known. For f is continuous if and only if the inverse image of every open set in Y 
is open in X.

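Therefore the following definition of a topological space is a suitable structure for the study of
topology.

A topological space is a set T together with a system O of subsets of T, called the open sets, such that:
1. the empty set and T itself are open;
2. the union of any number of open sets is open;
3. the intersection of any two open sets is open.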

These three axioms correspond exactly to the properties of the system of open subsets of a point
set listed above, so that any point set together with its (relatively) open subsets forms a topological
space T. The definition of continuity repeated at the end of the previous paragraph makes sense
for maps from a topological space T to a topological space T’. A map f from T to T’ is called a
homeomorphism if it is bijective and f and f^-1 are both continuous. Two topological spaces T
and T’ are called homeomorphic if there is a homeomorphism mapping T to T’. A topological
space T is (path-)connected if to any pair (p, q) of elements of T there exists a continuous map of
an interval into T such that the end-points of the interval are mapped to p and q, respectively.
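
For a finite set the axioms and the open-set definition of continuity can be checked mechanically. The sketch below assumes the usual three axioms (the empty set and the whole set are open, unions of open sets are open, intersections of two open sets are open); the function names and the three-point example are illustrative assumptions only.

```python
from itertools import combinations

def is_topology(points, open_sets):
    """Check the three open-set axioms for a family of subsets of a finite set."""
    opens = {frozenset(s) for s in open_sets}
    whole = frozenset(points)
    # The empty set and the whole set must be open.
    if frozenset() not in opens or whole not in opens:
        return False
    # For a finite family, closure under pairwise unions and intersections suffices.
    return all(a | b in opens and a & b in opens for a, b in combinations(opens, 2))

def is_continuous(f, open_T, open_T2, points_T):
    """f (a dict) is continuous iff the inverse image of every open set is open."""
    opens = {frozenset(s) for s in open_T}
    for v in open_T2:
        preimage = frozenset(x for x in points_T if f[x] in v)
        if preimage not in opens:
            return False
    return True

T = {1, 2, 3}
O = [set(), {1}, {1, 2}, {1, 2, 3}]
not_O = [set(), {1}, {2}, {1, 2, 3}]   # the union {1} | {2} = {1, 2} is missing

identity = {1: 1, 2: 2, 3: 3}
swap = {1: 2, 2: 1, 3: 3}              # inverse image of {1} is {2}, which is not open
```

With these data `is_topology(T, O)` holds while `is_topology(T, not_O)` fails, and the map `swap` is bijective but not continuous from (T, O) to itself, so it is not a homeomorphism.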

These examples show how concepts defined for point sets can be extended to arbitrary topological 
spaces. However, general statements about topological spaces tend to be far less intuitive in geometric 
terms than those about point sets. Further, there are concepts for point sets that elude generalization 
to arbitrary topological spaces; for instance, it is not possible to give a satisfactory definition of 
dimension for all topological spaces. General or set-theoretical topology, which is the study of
arbitrary topological spaces, can in many of its parts hardly be regarded as belonging to geometry.
Rather, it assumes the character of a structure theory comparable to group theory in algebra. Just
as in group theory special classes of groups, such as Abelian groups, are examined, so in general
topology topological spaces are investigated that satisfy, apart from 1., 2. and 3. above, further
axioms, for example the Hausdorff separation axiom: To any two distinct elements p and q of T there
exist disjoint open subsets X and Y of T such that p ∈ X and q ∈ Y (Fig.).

34-11 The Hausdorff separation axiom

Topological spaces are more general than metric spaces, so that, in particular, every metric space
is a topological space. That is to say, one can distinguish in any metric space the system of open
sets. These open sets are defined just as in Euclidean spaces: Let M be a metric space, p an element
of M, and ε a positive number; the ε-neighbourhood of p in M is the set of all elements of M whose
distance from p is less than ε. A subset X of M is open if for every element p of X there is an
ε-neighbourhood of p entirely contained in X. It is not difficult to prove that the system of open sets so
defined has the properties 1., 2. and 3., and that in this way M indeed becomes a topological space.
This remark has the important consequence that theorems and concepts of general topology are 
applicable to metric spaces and, in particular, to the investigations of functional analysis. 

As an example of an important problem of general topology the metrization problem may be 
quoted: under what conditions on the system of open sets O of a topological space T is T metrizable, 
that is, when can a distance function d be defined on T that makes T into a metric space whose open 
sets are precisely the sets in O? — It is easy to see that a necessary condition is the Hausdorff separa- 
tion axiom, but this is not sufficient. The metrization theorem of NAGATA and SMIRNOV gives neces- 
sary and sufficient conditions for the existence of such a metric on a topological space. The 
precise formulation of these conditions, however, exceeds the scope of this chapter. 




35. Measure theory 


Measure theory deals with the determination of the content of geometrical configurations, or
more generally, of point sets. It is directly connected with the integral calculus and set theory and 
finds important applications in many branches of analysis and in the foundation of probability 
theory. In contrast to the calculation of the areas of triangles, rectangles and other figures bounded 
by straight lines, figures bounded by curved lines or even more complicated ones present difficulties. 
Even to explain what one understands by the content of a point set is a problem. Its first solution, 
the concept of Riemann content, leaning heavily on the concept of the Riemann integral, was given 
in 1890 by Giuseppe PEANO (1858-1932) and Marie Ennemond Camille JORDAN (1838-1922).

35-1 Approximations with respect to Peano-Jordan content
35-2 Configuration without Peano-Jordan content

In order to arrive at the content of a point set (for example, in a plane), a square grid is laid
over the plane and the given figure is approximated from within by a region consisting only of
squares of the grid (yellow in Fig. 35-1). An outer approximation region contains the figure in
its interior (yellow and blue). If by halving one proceeds to a finer grid, then the new inner
approximation contains the old one and is usually larger by a certain number of the new squares,
whilst the new outer region results by deleting new squares from the old region. Thus, the difference
between the areas of these approximate regions can only become smaller. If now with continuing
refinement the inner and outer contents approach one another arbitrarily closely, then their common
limiting value is called the Peano-Jordan content, or just the content, of the given figure.
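
The grid procedure can be tried out numerically. The following sketch applies it to the unit disc, a figure chosen here purely for illustration; the function name and the rule used to decide whether a grid square meets the disc are assumptions of this example.

```python
def inner_outer_content(k):
    """Inner and outer grid approximations to the area of the unit disc.

    The grid has mesh h = 2**-k; halving the mesh means increasing k by 1.
    """
    n = 2 ** k                      # squares per unit length
    h = 1.0 / n
    inner = outer = 0
    for i in range(-n, n):
        for j in range(-n, n):
            x0, x1 = i * h, (i + 1) * h
            y0, y1 = j * h, (j + 1) * h
            # Squared distances of the farthest and nearest points of the square
            # from the centre of the disc.
            far2 = max(x0 * x0, x1 * x1) + max(y0 * y0, y1 * y1)
            nx = 0.0 if x0 <= 0.0 <= x1 else min(abs(x0), abs(x1))
            ny = 0.0 if y0 <= 0.0 <= y1 else min(abs(y0), abs(y1))
            near2 = nx * nx + ny * ny
            if far2 <= 1.0:
                inner += 1          # square lies entirely inside the disc
            if near2 < 1.0:
                outer += 1          # square has points in common with the open disc
    return inner * h * h, outer * h * h

# Successive halvings of the grid: the inner values grow, the outer values shrink.
approximations = [inner_outer_content(k) for k in range(1, 7)]
```

The pairs in `approximations` enclose the area π of the disc ever more tightly, which is the behaviour the definition of content requires.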

This concept of content yields the well-known area formulae for figures bounded by straight 
lines, as well as for the circle and ellipse among others. However, there exist point sets to which 
one can ascribe no Peano-Jordan content. Fig. 35-2 represents a square ABCD on whose upper 
side CD a perpendicular of length equal to the side of the square is erected at every one of the in- 
finitely many points whose distance from the vertex C is a rational number. In this case all the outer 
contents are at least twice as large as the inner ones and the two do not tend to a common limit 
as the grid is refined, because the whole square CC’D’D always belongs to the outer approximation 
and only to that one. Every grid square lying in CC’D’D, no matter how small, contains both points 
that do and others that do not belong to the figure. 

Lebesgue measure. In modern mathematics precisely such point sets, which at first sight seemed 
exceptional, gained considerable significance. In very many cases a more comprehensive concept 
of content was successful, the Lebesgue measure, which was developed in 1902 by Henri Léon 
LEBESGUE (1875-1941). In contrast to the Peano-Jordan content, the approximation figures may 
also consist of infinitely many elementary areas of different magnitudes. 

Point sets that have a content are also measurable, and their measure is numerically equal to their 
content. On the other hand, there exist point sets to which no content can be ascribed, but which 
do have a measure; for example, the measure of the configuration in Fig. 35-2 is equal to the 
content of the square ABCD; the subset consisting of the perpendiculars is of measure zero. Sets 
of measure zero play a particularly important part both in pure mathematics and also generally 
in the mathematical description of natural processes; they characterize, so to speak, the inessential. 
The considerations in the case of spaces of other dimensions are completely analogous; for example, 
for dimension three one obtains the ordinary volume, or space measure. 

In the integral calculus the use of measure in place of content leads to the Lebesgue integral. It 
represents an extension of the concept of the Riemann integral, just as the Lebesgue measure is an 
extension of the concept of the Peano-Jordan content. 

In the process of further abstraction, in general measure theory one understands by a measure
on a set Ω a real-valued function m(A) whose argument runs through certain subsets A of Ω and
which has properties corresponding to the simplest geometrical interpretations. In the first place,
m(A) ≥ 0 and m(A ∪ B) = m(A) + m(B) for disjoint point sets A, B. The sets A belonging to the
domain of definition of m are said to be measurable with respect to m. This approach makes it
possible, for example, to apply measure-theoretical theorems directly to probability theory; in this
a random event is regarded as a subset A of the ‘point’ set Ω of all elementary events, and the measure
m(A) as the probability of the event A.
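
A finite example may make the probabilistic reading concrete. In the sketch below the set of elementary events is taken, as an assumption for illustration, to be the six outcomes of one throw of a fair die, and the measure of an event is its probability.

```python
from fractions import Fraction

# Hypothetical example: the six elementary events of one throw of a fair die.
omega = frozenset(range(1, 7))

def m(A):
    """Measure of a subset A of omega: here the probability of the event A."""
    A = frozenset(A)
    if not A <= omega:
        raise ValueError("A must be a subset of omega")
    return Fraction(len(A), len(omega))

even = {2, 4, 6}    # the event "an even number is thrown"
small = {1}         # the event "a one is thrown"; disjoint from `even`
```

Since `even` and `small` are disjoint, `m(even | small)` equals `m(even) + m(small)`, in line with the additivity property stated above.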




36. Graph theory 


Foundations
The four colour problem
Network techniques
Foundations 


Directed and undirected graphs. A graph G = [X, U, f] is a combination of two sets of elementary
figures, the set X of nodes x and the set U of edges u, and a function f defined on U, the incidence
function. This assigns to each directed edge u ∈ U exactly one ordered pair of nodes x_i, x_k ∈ X and
to an undirected edge one unordered pair (Fig.).

The nodes assigned to an edge need not be distinct. If x_i = x_k, then f(u) = (x_i, x_i) is called a
loop. The function f need not be uniquely reversible, that is, a pair (x_i, x_k) can be assigned to several
edges, which are then called parallel edges or multiple edges.

Each of the sets X and U can be finite or infinite. If X and U are finite, the graph is called finite.
All the following arguments refer to finite graphs. 
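
One possible way to store a finite graph G = [X, U, f] is sketched below: the incidence function is a dictionary from edges to pairs of nodes. All node and edge names, and the two helper functions, are illustrative assumptions.

```python
from collections import Counter

# A small directed graph G = [X, U, f]; all names are illustrative.
X = {"x1", "x2", "x3"}
U = {"u1", "u2", "u3", "u4"}
f = {                       # incidence function: edge -> ordered pair of nodes
    "u1": ("x1", "x2"),
    "u2": ("x1", "x2"),     # same pair as u1: parallel edges
    "u3": ("x2", "x3"),
    "u4": ("x3", "x3"),     # both nodes coincide: a loop
}

def loops(f):
    """Edges whose two assigned nodes coincide."""
    return {u for u, (a, b) in f.items() if a == b}

def parallel_edges(f, directed=True):
    """Edges that share their pair of nodes with at least one other edge."""
    # For undirected graphs the order within a pair is ignored.
    key = (lambda pair: pair) if directed else (lambda pair: frozenset(pair))
    counts = Counter(key(pair) for pair in f.values())
    return {u for u, pair in f.items() if counts[key(pair)] > 1}
```

Keeping f as a separate mapping, rather than identifying an edge with its pair of nodes, is what allows parallel edges to be represented at all.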

To represent the graph, nodes are drawn as points and edges as arcs of curves joining the nodes 
assigned to them. According as the order of the nodes in a pair matters or does not, one speaks 
of directed or undirected graphs (Fig.). 


36-1 a) Directed edge (ordered pair), b) undirected edge (unordered pair), c) loop
36-2 Directed graphs
36-3 Undirected connected graphs; the right-hand one is complete


Example: The street plan of a town is usually represented on a map by an undirected graph. 
This is completely sufficient for pedestrians. However, if there are many one-way streets, a car 
driver needs a representation of the street plan as a directed graph. 


Applications. The five Platonic solids (tetrahedron, cube, octahedron, dodecahedron, icosahedron), 
like all other polyhedra, represent graphs with their vertices and edges as the nodes and edges of 
the graph. The boundaries between countries on a geographical map form a graph, and so does 
the railway system, the shipping routes and the airlines. 

All communication networks, such as telephone or tele- 
printer networks, can be represented as graphs, and elec- 
trical, water and central heating networks can also be describ- 
ed by means of graphs. In systems theory and cybernetics, 
complex systems are considered whose structure can be repre- 
sented with the help of graphs, for example, block switch dia- 
grams, signal flow diagrams and the production structure of 
a business. A practical branch of graph theory consists of 
network techniques. The starting times and connections of 
the parts of a complete process are transferred to a network 
and can then be calculated and directed (see Network techniques).

Generally it can be said that graphs are a tool in solving 
combinatorial problems. 


Example: A ferryman F wishes to take a wolf W, a goat
G and a cabbage C from the left bank of a river to the
right by means of a boat which can only hold two out of
F, W, G, C. The wolf and the goat must never be left
together unguarded, nor the goat and the cabbage. How
can this be achieved?


36-4 Combinations on both banks for the transport by a ferryman F of a wolf W, a goat G and a
cabbage C; one of the possible sequences of connected edges is coloured




One first classifies the admissible combinations on the two banks. For example, one is (FWG/C),
which means that F, W and G are on the left bank and C on the right. A zero means that none
of the four things is on the corresponding bank. A combination in which the boat is on the right
bank is now connected to a combination in which the boat is on the left if the ferryman can get
from one to the other in one journey. The combinations and the connections are the nodes and
edges of a graph (Fig.). The problem can now be solved by looking for a connected sequence of
edges that begins at (FWGC/0) and ends at (0/FWGC). There are several of these.
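
The search for a connected sequence of edges can be left to a program. The sketch below builds the graph of admissible combinations implicitly and uses breadth-first search, a technique chosen here for illustration, to find a sequence with the fewest journeys; all function names are assumptions of this example.

```python
from collections import deque

ITEMS = frozenset("FWGC")

def admissible(left):
    """A combination is admissible if no unguarded bank holds W with G or G with C."""
    right = ITEMS - left
    for bank in (left, right):
        if "F" not in bank and ("G" in bank and ("W" in bank or "C" in bank)):
            return False
    return True

def neighbours(left):
    """Combinations reachable in one journey of the ferryman."""
    here = left if "F" in left else ITEMS - left      # bank where the boat is
    for cargo in [None] + sorted(here - {"F"}):       # alone, or with one passenger
        moved = {"F"} | ({cargo} if cargo else set())
        new_left = left - moved if "F" in left else left | moved
        if admissible(new_left):
            yield frozenset(new_left)

def shortest_sequence(start=ITEMS, goal=frozenset()):
    """Breadth-first search for a shortest connected sequence of edges."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in neighbours(state):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None

def label(left):
    """Write a combination in the (left/right) notation of the text."""
    side = lambda s: "".join(c for c in "FWGC" if c in s) or "0"
    return f"({side(left)}/{side(ITEMS - left)})"

solution = [label(s) for s in shortest_sequence()]
```

The list `solution` starts at (FWGC/0), ends at (0/FWGC) and needs seven journeys; the first journey necessarily takes the goat across.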


Special graphs. A graph in which any two distinct nodes are joined by exactly one edge is called 
complete. If the graph consists only of isolated nodes, that is, the set of edges is empty, it is called 
a null graph. If one can go from any node to any other node along the edges, the graph is called 
connected. A complete graph is connected. 

By repeatedly going from one edge to another, through a node common to the two edges, one
obtains a sequence of edges.

If each edge in a sequence of edges occurs only once, one speaks of a path. This is called closed
if the starting point and end-point coincide. A closed path in which no node except the starting
point occurs twice is called a (topological) circle. A connected non-empty graph is called a tree
if it contains no closed path (Fig.). Trees are used, for example, in structure formulae of chemistry
for chains of hydrocarbons. Finally, a graph is called planar if it can be drawn in a plane or, what
is topologically the same thing, embedded in the surface of a sphere, without the edges intersecting.
For example, trees are planar graphs.

36-5 Trees


Combinatorial structures. Graph theory is concerned with combinatorial matters. The object
of graph theory is not only the determination of numbers, as in elementary combinatorial theory,
but the combinatorial structure itself. Investigations of combinatorial structure, that is, graph-
theoretical considerations, were first used by Leonhard EULER in 1736. He started with the problem
of the seven bridges of Königsberg, that is, the problem whether one can take a walk in which each
of the seven bridges is crossed exactly once (Fig.). It can be seen immediately that the walk neither
begins nor ends in at least two of the four regions I to IV. These regions are entered and left again.
However, since an odd number of bridges leads to each region, such a walk is not possible. EULER
investigated more generally under what conditions a given connected graph can be described in a
closed path in such a way that each edge is covered exactly once. An Euler path of this kind exists
if and only if an even number of edges meet at each node.
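
Euler's criterion is easy to test by counting the edges at each node. In the sketch below the graph is assumed to be connected (this is not checked), and the assignment of the Roman numerals to particular regions of Königsberg is an assumption; only the number of bridges at each region matters.

```python
from collections import Counter

def closed_euler_path_exists(edges):
    """For a connected graph: a closed path covering each edge exactly once
    exists if and only if an even number of edges meet at every node."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1      # a loop (a == b) contributes 2 to its node
    return all(d % 2 == 0 for d in degree.values())

# The seven bridges as edges between the four regions. Which Roman numeral
# denotes which region is an assumption; only the degrees matter here.
bridges = [("I", "II"), ("I", "II"), ("I", "III"), ("I", "III"),
           ("I", "IV"), ("II", "IV"), ("III", "IV")]

# A simple closed circuit through four nodes, for comparison.
square = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
```

For `bridges` the regions have 5, 3, 3 and 3 bridges, so the test fails, whereas for `square` every node has two edges and the test succeeds.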

The graphs one comes across in practice often have a very general structure, and the main
question is that of an algorithm for the effective solution of an optimization problem connected
with the graph. This is exemplified by the problem of the cheapest telephone network.


36-6 The seven bridges of Königsberg
36-7 Minimal tree: distribution of nodes (null graph), first step, second step

Example: To connect n places by a telephone network at minimum cost, where the branch
points occur only at the places themselves. The costs of the direct connections between any two
places are known.

The required network is obviously a tree. A simple algorithm serves to construct it. First step:
one joins each node to the one for which the cost is least, thus obtaining a system of trees. Second
step: one contracts each tree to a point and repeats the process. One continues in this way and
breaks off the process when only one tree remains. This is the required minimal tree. In Fig. 36-7
the process is carried out for 10 places in such a way that the distances between any two points are
proportional to the costs.

If one wishes to solve the problem by trying all possibilities, one would have to test n^(n-2)
possibilities for n places, that is, 10^8 possibilities for n = 10 places.
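
The two-step procedure, commonly known as Borůvka's algorithm, can be sketched as follows. The data layout (a dictionary of costs for pairs i < j), the tie-breaking rule and the four-place example are assumptions made for this illustration.

```python
def minimal_tree(n, cost):
    """Minimal tree for places 0..n-1; cost[(i, j)] with i < j is the cost of
    the direct connection. Follows the two-step scheme of the text."""
    component = list(range(n))          # component[i]: label of the tree containing place i
    tree = set()
    while len(set(component)) > 1:
        # First step: every current tree picks its cheapest connection to another tree.
        cheapest = {}
        for (i, j), c in cost.items():
            ci, cj = component[i], component[j]
            if ci == cj:
                continue
            key = (c, i, j)             # ties are broken by the pair, which avoids cycles
            for comp in (ci, cj):
                if comp not in cheapest or key < cheapest[comp]:
                    cheapest[comp] = key
        if not cheapest:
            raise ValueError("the places cannot all be connected")
        # Second step: join the chosen trees (contract each new tree to one label).
        for c, i, j in set(cheapest.values()):
            ci, cj = component[i], component[j]
            if ci != cj:
                tree.add((i, j))
                component = [ci if x == cj else x for x in component]
    return tree

# Hypothetical costs for four places.
cost = {(0, 1): 1, (0, 2): 4, (0, 3): 3, (1, 2): 2, (1, 3): 5, (2, 3): 6}
tree = minimal_tree(4, cost)
total = sum(cost[e] for e in tree)
```

For these costs a single round already joins all four places, giving the connections (0, 1), (1, 2) and (0, 3) with total cost 6, instead of testing all 4^2 = 16 possible trees.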


The four colour problem 


Mapmakers know that any political map can be drawn with four colours in such a way that any 
two countries with a common frontier (not just a point) are coloured differently. It is fairly easy 
to prove that five colours are always sufficient to colour any map. The problem of whether four
colours are always sufficient proved to be very hard; it remained unsolved for about 100 years and
therefore had a stimulating influence on the development of graph theory.

In a map (1) (Fig.) the countries and their frontiers can always be represented by a graph in two
different ways: either the meeting-points of three or more frontiers are the nodes and the frontiers
between them are the edges (2a) or the countries are the nodes and the edges denote neighbouring
countries (2b). In the first case the areas are to be coloured, in the second case the nodes. A map
drawn on the surface of a sphere is called normal if exactly three frontiers meet at each node and
each country is bounded by a (topological) circle. The four-colour problem can be reduced to that
for normal maps. If there is a topological circle or Hamilton circuit (Fig.) containing all the nodes
of the graph of a normal map, then the countries of the map can be coloured with four colours,
since two colours are needed for the inside and two for the outside of the Hamilton circuit. For a
long time it was believed that a Hamilton circuit always exists. Counterexamples were given only
in 1965, so this method does not solve the four colour problem.


36-8 Map (1) with its graph (2a, 2b) 


36-9 Graph of a normal map with Hamilton 
circuit (see Fig. 36-8, 2a) 


The graph of a normal map is a cubic graph, that is, exactly three undirected edges meet at each 
node. Moreover, this graph contains no edges whose deletion would divide the graph into two 
separate parts, that is, the graph has no bridges. For a proof of the four colour problem it would 
be sufficient if one could show that any cubic graph can be described by several circles which all 
have an even number of edges. Without assuming planarity PETERSEN was able to prove that any 
cubic graph without bridges can be described by several circles in such a way that each node belongs 
to exactly one of the circles. Of course, some of these circles may have an odd number of edges. 

Further work on the four colour problem used topological or combinatorial methods. The final
solution which was reached in 1976 by APPEL and HAKEN worked with a method originally devised
by KEMPE in 1879 which uses a reduction procedure for graphs and finally leads to a very extended
distinction of cases. About 1800 cases had to be considered, each one the four colouring of a
graph which is “interior” to a circuit of length ≤ 14. This was done with a fast computer and needed
about 1200 hours of computing time, thus providing a new type of mathematical proof characterized
by the necessary use of a computer.


Network techniques 


Network techniques are applied to represent, analyze and optimize the progress of complicated 
processes, for example, the erection of large buildings, which are composed of several partial processes. 
The aims of network techniques are: the planning of the finishing time and intermediate times and 
the search for spare time for the partial processes; the determination of the most advantageous se- 
quence for the partial processes to shorten the total time, lower the cost and improve the utiliza- 
tion of capacity; the development of a control system and limitation of responsibility. 




Activities and events. A network is a directed graph whose elements are expressed as activities 
and events. Activities are partial processes or parts of the work; to them there correspond durations 
of time; events are the attainments of individual steps of the process or occurrences of individual 
stages of completion; to them there correspond points of time. Fictitious activities are those of zero 
duration, which only express the dependence of actual activities. If the activities and events are 
represented by the edges and nodes of the network, then it is an event-oriented network. Conversely, 
if the activities are represented by the nodes and the interdependence of the activities by the edges 
of the network, then it is an activity-oriented network. Here the edges essentially correspond to the
events, while the dependence of the activities usually consists of the fact that one activity must be 
finished before the next can begin. The following arguments refer to an event-oriented network. 


Example: For the building of a machine shop with access road and grounds, the following work
has to be done if u signifies a unit of time:

1.1. Laying the foundations             5 u
1.2. Brickwork                         11 u
1.3. Roof construction                  4 u
1.4. Interior work                     10 u
2.1. Building the access road           9 u
2.2. Laying out the grounds            10 u
3.1. Delivery time for the machines    24 u
3.2. Mounting the machines              3 u
3.3. Other equipment                    8 u


The items in the first group (1.1. to 1.4.) must be carried out in succession. The building of
the access road (2.1.) can start after laying the foundations (1.1.) and must be finished before
mounting the machines (3.2.) (fictitious activity 2-4), while the interior work (1.4.) follows
the mounting of the machines (fictitious activity 5-6) (Fig.). The fictitious activity 3-4 is
necessary because the mounting of the machines (3.2.) cannot begin before the brickwork (1.2.)
is finished.

36-10 Network of the example; the critical path is marked in red


Critical path. In a network one is interested in the duration of the whole process, that is, the time 
between the first event and the last. This can be determined, since there is at least one path along 
the edges from the first event to the last for which the sum of the durations of the activities is a 
maximum. A path of this kind is called a critical path, and the activities on it are called critical 
activities. In Fig. 36-10 the critical path has 37 units of time. Any lengthening of a critical activity 
leads to a lengthening of the total duration, while this is not the case if a non-critical activity is 
lengthened within certain limits. 

Activity times. For each activity in the network there is an earliest and latest starting time and an 
earliest and latest finishing time. The difference between the starting and finishing times is the 
duration of the activity. For critical activities the earliest and latest times coincide. 


Network matrix. To determine the critical path of a network and to obtain the activity times
one uses a network matrix (Fig.). Its rows and columns correspond to the events of the network,
and its elements give the number of units of time for the duration of the activities joining the events
(see the table in the example). To arrive at an algorithm for calculation, the successive events and
the successive numbers must be obtained. If t_n is the earliest and T_n the latest time for the
event n, then:

t_0 = T_0, where 0 denotes the first event;
t_e = T_e, where e denotes the last event;
t_n = max (t_ν + a_νn), ν < n,
T_n = min (T_ν - a_nν), ν > n, for all other events,

where a_νn is the duration of the activity that joins the events ν and n.

36-11 Network matrix

To calculate the earliest times t_n one has to work with the columns of the network matrix,
starting with the first event. To calculate t_4, for example, one forms the sums
t_0 + a_04 = 0 + 24, t_2 + a_24 = 14 + 0, t_3 + a_34 = 16 + 0, and takes the largest, t_4 = 24.
To calculate the latest times T_n, one begins with the last event and works with the rows of the
matrix. For example, T_5 = 27 is the minimum of the differences T_7 - a_57 = 37 - 8 and
T_6 - a_56 = 27 - 0. To obtain the critical path, one marks on the network those events for
which t_n = T_n in the network matrix, that is, n = 0, 4, 5, 6 and 7.
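
The forward and backward calculation can be sketched in a few lines. The event numbers and some of the durations in the table below are reconstructed from the worked calculations in the text (for instance t_2 + a_24 = 14 + 0 and T_7 - a_57 = 37 - 8) and should be read as assumptions; the function name is likewise illustrative.

```python
# Activities of the example as (start event, end event): duration. The event
# numbers and some of the durations are reconstructed from the worked
# calculations in the text and should be read as assumptions.
activities = {
    (0, 1): 5,    # 1.1 laying the foundations
    (1, 3): 11,   # 1.2 brickwork
    (3, 6): 4,    # 1.3 roof construction
    (6, 7): 10,   # 1.4 interior work
    (1, 2): 9,    # 2.1 building the access road
    (2, 7): 10,   # 2.2 laying out the grounds
    (0, 4): 24,   # 3.1 delivery time for the machines
    (4, 5): 3,    # 3.2 mounting the machines
    (5, 7): 8,    # 3.3 other equipment
    (2, 4): 0,    # fictitious activities
    (3, 4): 0,
    (5, 6): 0,
}

def event_times(activities):
    """Earliest times t_n and latest times T_n; events must be numbered so
    that every activity leads from a lower to a higher number."""
    last = max(m for _, m in activities)
    t = [0] * (last + 1)
    for n in range(1, last + 1):                       # forward pass (columns)
        t[n] = max(t[v] + a for (v, m), a in activities.items() if m == n)
    T = [t[last]] * (last + 1)
    for n in range(last - 1, -1, -1):                  # backward pass (rows)
        T[n] = min(T[v] - a for (m, v), a in activities.items() if m == n)
    return t, T

t, T = event_times(activities)
critical_events = [n for n in range(len(t)) if t[n] == T[n]]
```

With these data the total duration t[7] is 37 units and `critical_events` is [0, 4, 5, 6, 7], in agreement with the values quoted above.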


Buffer times. In a network it is possible, within certain limits, to move or extend non-critical
activities, without increasing the total time of the process, because of buffer times (Fig.).
The duration of the activity n → m is denoted by d_nm. By using the total buffer time
S_G = T_m - t_n - d_nm the buffer times of the preceding and following activities are reduced.
If the activity is critical, then t_n = T_n, t_m = T_m and T_m - t_n = d_nm, so S_G = 0.

Using the independent buffer time S_U = max {0, t_m - T_n - d_nm} has no effect on the buffer
times of the preceding and following activities. Using the free buffer time S_F = t_m - t_n - d_nm
only has an effect on the buffer times of the preceding activities, while using the conditional
buffer time S_B = T_m - t_m only has an effect on the buffer times of the following activities.
Since S_G ≥ S_F ≥ S_U ≥ 0 and S_G = S_F + S_B, all the buffer times are zero for the critical
activities.

36-12 Buffer times

In the example, the activity 2.2. that leads from event 2 to event 7 has buffer times S_G = 13,
S_U = 3, S_F = 13, S_B = 0. Since S_U = 3, 2.2. can be lengthened by three units of time without
setting back the completion of building. However, a lengthening of 13 units of time would mean
that the path 0 → 1 → 2 → 7 would be critical, and so the activities 1.1. and 2.1. would have to
begin at their earliest times and could not be lengthened, which would otherwise be possible,
since 1.1. and 2.1. have conditional buffer times 7 and 10, respectively.
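
The four buffer times follow directly from the earliest and latest event times. In the sketch below the lists t and T and the event numbers of the activities are assumed values, reconstructed from the figures quoted in the text; the function name and the dictionary keys are illustrative.

```python
def buffer_times(t, T, n, m, d):
    """Buffer times of an activity of duration d leading from event n to event m."""
    total = T[m] - t[n] - d                     # S_G, total buffer time
    free = t[m] - t[n] - d                      # S_F, free buffer time
    independent = max(0, t[m] - T[n] - d)       # S_U, independent buffer time
    conditional = T[m] - t[m]                   # S_B, conditional buffer time
    return {"S_G": total, "S_F": free, "S_U": independent, "S_B": conditional}

# Earliest and latest event times of the example network (assumed values,
# reconstructed from the figures quoted in the text).
t = [0, 5, 14, 16, 24, 27, 27, 37]
T = [0, 12, 24, 23, 24, 27, 27, 37]

grounds = buffer_times(t, T, 2, 7, 10)    # activity 2.2
mounting = buffer_times(t, T, 4, 5, 3)    # activity 3.2, a critical activity
```

For `grounds` this reproduces S_G = 13, S_F = 13, S_U = 3 and S_B = 0, and for the critical activity `mounting` all four values are zero.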

A manual calculation of the network matrix is possible only for a network with a small number 
of events and activities. For the networks of processes that arise in practice, electronic calculators 
must be used as a rule. 


Special methods in networks. Network techniques are mainly concerned with finding the critical 
path and buffer times of event-oriented networks; the following methods have been developed. 

The critical path method (CPM) uses both event-oriented and activity-oriented networks and 
evaluates only the activities by means of a (deterministically obtained) duration. Its aim is to obtain 
the critical path and to calculate the buffer times by means of a programme of calculation derived 
from the network matrix. 

In the metra-potential method (MPM) both the activities (nodes) and the dependencies (edges) 
are evaluated in an activity-oriented network. For the activities a duration is evaluated, while the 
evaluation of the dependence expresses a coupling distance, for example, the time interval between 
the starting times of the activities joined by the edge. Coupling distances can take negative values. 
According to the relation between the coupling distance and the duration of the activity, one distin- 
guishes between delayed completion (coupling distance greater than the duration of the activity), 
normal completion (coupling distance equal to the duration of the activity) and overlapping comple- 
tion (coupling distance less than the duration of the activity). If normal completion holds in the 
complete network, one has a CPM-network. 

The programme evaluation and review technique (PERT) mostly uses an event-oriented network,
but fixes the durations of activities not deterministically, but by means of stochastic statements.
For the duration d of an activity there is an optimistic estimate d_o, a pessimistic estimate d_p and a
most probable estimate d_m. The duration of the activity is given by d = (d_o + d_p + 4 d_m)/6. Apart
from the critical path, the expected values and the variance of the times are calculated.
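
The weighted mean of the three estimates is quickly computed. In the sketch below the variance formula ((d_p - d_o)/6)^2 is the one customarily used with PERT; it is not stated in the text and is included as an assumption, as are the function names and the sample estimates.

```python
def pert_duration(d_o, d_p, d_m):
    """Duration d = (d_o + d_p + 4*d_m) / 6 from the three estimates."""
    return (d_o + d_p + 4 * d_m) / 6

def pert_variance(d_o, d_p):
    """Customary variance estimate ((d_p - d_o) / 6)**2; not stated in the text."""
    return ((d_p - d_o) / 6) ** 2

expected = pert_duration(d_o=8, d_p=20, d_m=11)   # hypothetical estimates
variance = pert_variance(d_o=8, d_p=20)
```

With these hypothetical estimates the duration comes out as 12 units, slightly above the most probable estimate because the pessimistic estimate lies further from it than the optimistic one.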

The methods given up to now assume that the connectivity and dependence in the network, 
which is abstracted from the actual process, are essentially given, that is, these methods work with 
a prescribed topological structure of the network. In the combination network (CNW) the topological 
structure is no longer given, and the determination of an optimal structure is the aim of the method. 
It is now assumed that in the set of activities there is a relation such that $A \to B$ means that B must
be carried out after A; $C \sim D$ means that C and D must not proceed simultaneously. These conditions
must be formulated for all the activities that occur in the network, and a structure of the
network is to be found for which the critical path is minimal. 

The above methods are connected with calculations of resources. This implies an understanding 
and consideration of the resources with which the activities are carried out. As resources one counts 
machines, labour and materials, and also the prices and costs of the activities. One distinguishes 
two groups of problems: 1. In optimal distribution of resources, given the time for the whole process 
one is trying to achieve the most equable loading of resources by using the buffer times and splitting 
the activities into sections. 2. In time-optimal distribution of limited resources upper limits are given 
for the availability of resources, and one is trying to make the total duration of the process as small 
as possible allowing for the limitations on the resources. Complete solutions of the resource problem 
are still not known, but quite effective approximation methods have been applied. 

Finally, attempts are being made to produce network optimization algorithms concerned with 
complicated criteria for effectiveness and optimal working. 




37. Potential theory and partial differential equations 


Partial differential equations; Potential theory


Partial differential equations 


Order, linearity, homogeneity. Ordinary differential equations contain only functions of one
independent variable. By contrast, one speaks of a partial differential equation if the unknown
function $u = u(x_1, x_2, \ldots, x_n)$ depends on several independent variables $x_1, x_2, \ldots, x_n$ and the
equation contains partial derivatives $\partial u/\partial x_i$, $\partial^2 u/\partial x_i\,\partial x_j$, etc. for $i, j = 1, 2, \ldots, n$. The order of the
highest derivative that appears in the equation determines the order of the equation. The differential
equation is called linear if the unknown function and its derivatives occur linearly and are not
multiplied together. A linear partial differential equation is called homogeneous if it contains no
term free from the unknown function and its derivatives, otherwise inhomogeneous. For linear
homogeneous partial differential equations, as for ordinary ones, the principle of superposition holds: if $u_1$ and $u_2$
are solutions, then every linear combination $u = C_1u_1 + C_2u_2$, where $C_1$ and $C_2$ are constants,
is also a solution.


Partial differential equations of the first order. The integration of a partial differential equation
of the first order can always be reduced to the integration of a system of ordinary differential
equations, the characteristic system. For the differential equation $F(x_0, \ldots, x_n, u, p_0, \ldots, p_n) = 0$,
where $p_i = \partial u/\partial x_i$, this system has the form

$$x_i' = \frac{\partial F}{\partial p_i}, \qquad p_i' = -\frac{\partial F}{\partial x_i} - p_i\,\frac{\partial F}{\partial u}, \qquad u' = \sum_{i=0}^{n} p_i\,\frac{\partial F}{\partial p_i},$$

where the $x_i$, $p_i$ and $u$ are regarded as functions of a new parameter $t$ and the dash denotes the derivative
with respect to $t$.
If the differential equation does not depend explicitly on $u$, it can be brought to the form
$\partial u/\partial t + H(t, x_1, \ldots, x_n, p_1, \ldots, p_n) = 0$, where $x_0 = t$, $p_0 = \partial u/\partial t$ and the variables are possibly
renumbered. An equation of this form is called a Hamilton-Jacobi differential equation; the function $H$
in it is called the Hamiltonian. The characteristic system then has the canonical form

$$\frac{dx_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial x_i}.$$

The motions of point masses of certain mechanical systems are described by
such equations. In this case, the $x_i$ and $p_i$ are generalized coordinates of position and momentum, and
the Hamiltonian $H$ is equal to the total energy (see Chapter 38).


Partial differential equations of higher order. There is no corresponding closed theory of integration 
for partial differential equations of higher order. Even though the general integral cannot be derived, 
one can often find particular solutions by means of a suitable trial solution in the form of a product 
or a sum of functions, each of which depends only on part of the set of variables: this is called separa- 
tion of the variables. The given differential equation then splits into a number of simpler differential 
equations for these functions. 

The properties of a special linear partial differential equation of the second order are investigated
in potential theory. 


Potential theory 


Originally arising from problems in mechanics, potential theory has developed into an independent, 
extensive branch of mathematics. Its results are applied in numerous physical disciplines, particu- 
larly in the treatment of problems in mechanics, electrostatics, magnetostatics, electrodynamics, 
hydrodynamics and thermodynamics. Potential theory has also proved to be fruitful in the develop- 
ment of the theory of ordinary and partial differential equations, complex analysis, the theory of 
conformal mappings and differential geometry. 


The Newtonian potential. The simplest concept of potential is that discovered by Newton to explain
the mutual attraction of material bodies.

Potential of a point. Newton’s law of gravitation states that two bodies in three-dimensional space
exert an attraction on each other which is directly proportional to their masses and inversely proportional
to the square of the distance between them. If the bodies are regarded as idealized, so that
their whole mass is concentrated at a point, the mass-point, say the mass $m$ of one body at a point $P$
and the mass $\mu$ of the other at a point $Q$, then the magnitude of the force is $F = km\mu/r^2$. It becomes Coulomb’s
law if the masses are replaced by electric charges. Here $r$ is the distance between the points and $k$
is a factor of proportionality, for example, the constant of gravitation. To simplify the calculations
it will be assumed in the following that $km = 1$. If one supposes that the mass at $P$ is attracted by
the mass at $Q$, then the force $F$ is directed from $P$ to $Q$. If this direction of force makes angles $\alpha$, $\beta$, $\gamma$
with the axes of a Cartesian coordinate system, in which $P$ and $Q$ have the coordinates $(x, y, z)$
and $(\xi, \eta, \zeta)$, then, by the theorems of analytic geometry (Fig.), $r = \sqrt{(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2}$,
$\cos\alpha = (\xi - x)/r$, $\cos\beta = (\eta - y)/r$, $\cos\gamma = (\zeta - z)/r$, and so the components $X$, $Y$, $Z$ of the
force $F$ are given by

$$X = F\cos\alpha = \mu(\xi - x)/r^3, \qquad Y = F\cos\beta = \mu(\eta - y)/r^3, \qquad Z = F\cos\gamma = \mu(\zeta - z)/r^3.$$

However, as LAGRANGE discovered in 1773, these three components are the partial derivatives of a
function $U(x, y, z)$, which GAUSS in 1840 called the potential of the mass $\mu$ at $Q$ for the point $P(x, y, z)$.

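For if $U = \mu/r = \mu[(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2]^{-1/2}$,
and if $Q$ is regarded as fixed and $P$ as variable, then
the partial derivative of $U$ with respect to $x$ is

$$\frac{\partial U}{\partial x} = \mu\left(-\tfrac{1}{2}\right)[(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2]^{-3/2} \cdot 2(x - \xi) = \mu(\xi - x)/r^3 = X.$$

Similarly the partial derivatives of $U$ with respect to $y$ and $z$ are $\partial U/\partial y = Y$ and $\partial U/\partial z = Z$.
Here $P$ must not coincide with $Q$, for then
the denominator vanishes, and the expression $\mu/r$ is not defined for $r = 0$.

The value $-U(x, y, z)$ is equal to the potential energy of the system consisting of the two mass
points.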
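
That the force components are the partial derivatives of the point potential can be checked numerically. The following sketch compares central difference quotients of $U = \mu/r$ with the components $\mu(\xi - x)/r^3$, and so on; the function names, the step width `h` and the sample points are assumptions chosen for illustration.

```python
import math

def potential(p, q, mu=1.0):
    """Point potential U = mu / r of the mass mu at q, evaluated at p."""
    return mu / math.dist(p, q)

def force(p, q, mu=1.0):
    """Components X, Y, Z of the attraction at p (with k*m = 1)."""
    r = math.dist(p, q)
    return tuple(mu * (qi - pi) / r**3 for pi, qi in zip(p, q))

def numerical_gradient(p, q, mu=1.0, h=1e-5):
    """Central difference approximation of the partial derivatives of U at p."""
    grad = []
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        grad.append((potential(plus, q, mu) - potential(minus, q, mu)) / (2 * h))
    return tuple(grad)

# Hypothetical positions of P (variable point) and Q (fixed mass point).
P = (1.0, 2.0, 3.0)
Q = (0.5, -1.0, 2.0)
print(force(P, Q, mu=2.0))
print(numerical_gradient(P, Q, mu=2.0))
```

The two printed triples agree up to the discretization error of the difference quotient.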

Potential of finitely many points. If the mass at $P$ is attracted by finitely many mass-points $Q_s$
($s = 1, 2, \ldots, n$) with masses $\mu_s$, then the components $X$, $Y$, $Z$ of the total force acting at $P$ are the
sums of the components $X_s$, $Y_s$, $Z_s$ of the individual forces:

$$X_s = \mu_s(\xi_s - x)/r_s^3, \qquad Y_s = \mu_s(\eta_s - y)/r_s^3, \qquad Z_s = \mu_s(\zeta_s - z)/r_s^3;$$

$$X = \sum_{s=1}^{n} X_s, \qquad Y = \sum_{s=1}^{n} Y_s, \qquad Z = \sum_{s=1}^{n} Z_s.$$

Similarly the potential $U$ is the sum of the individual potentials, as long as $P$ does not coincide
with any of the points $Q_s$.

Potential of a continuously distributed mass. To generalize further, it is natural to abandon the
abstraction of a mass point and to investigate the attraction that a continuously distributed mass
exerts at a point $P$ lying outside this mass. One thinks of the mass, which fills a region $T$, as divided
into infinitesimal elements of volume $d\tau = d\xi\,d\eta\,d\zeta$ with mass $d\mu$ and density $\varrho = d\mu/d\tau$. The element
of volume at $Q(\xi, \eta, \zeta)$ exerts at $P(x, y, z)$ a force of attraction with components $dX$, $dY$, $dZ$. The
components of the force of attraction of the whole mass in $T$ are obtained by summing over all the
infinitely many elements of volume, that is, by integrating
over $T$ (Fig.):

$$dX = [(\xi - x)/r^3]\,d\mu, \qquad dY = [(\eta - y)/r^3]\,d\mu, \qquad dZ = [(\zeta - z)/r^3]\,d\mu;$$

$$X = \iiint_T [(\xi - x)/r^3]\,d\mu, \qquad Y = \iiint_T [(\eta - y)/r^3]\,d\mu, \qquad Z = \iiint_T [(\zeta - z)/r^3]\,d\mu.$$

These components are again partial derivatives of a
potential $U$, as one can show by partially differentiating
under the integral sign.

37-2 Derivation of the Newtonian potential



Equipotential surfaces. One way of giving a geometrical interpretation of potential is by means
of equipotential surfaces. The potentials are defined for each point $P$ of three-dimensional space,
as long as $P$ does not coincide with the attracting mass point or lie on or inside the attracting continuous
mass. If all the points $P$ at which the potential has the same value $a$ are joined, the equipotential
surface $U(x, y, z) = a$ is obtained. By varying $a$ the given equation represents a 1-parameter
family of surfaces.

In the case of the point potential $U = \mu/r$ the family is given by $\mu/r = a$. These are obviously
concentric spheres with centre at $Q$, whose radii decrease as $a$ increases. Figures 37-3 and 37-4
represent the family $\mu/r = a$ and show how the potential $U$ depends on the distance $r$.


37-3 Equipotential surfaces defined by $a = \mu/r$

37-4 Dependence of the point potential $U$ on the distance $r$


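Differential equation for potentials. If one partially differentiates

$$1/r = [(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2]^{-1/2}$$

twice with respect to $x$, one obtains

$$\frac{\partial}{\partial x}(1/r) = -[(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2]^{-3/2}\,(x - \xi)$$

and

$$\frac{\partial^2}{\partial x^2}(1/r) = 3[(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2]^{-5/2}\,(x - \xi)^2 - [(x - \xi)^2 + (y - \eta)^2 + (z - \zeta)^2]^{-3/2} = 3(x - \xi)^2/r^5 - 1/r^3.$$

Similarly for the partial derivatives with respect to $y$ and $z$:

$$\frac{\partial^2}{\partial y^2}(1/r) = 3(y - \eta)^2/r^5 - 1/r^3, \qquad \frac{\partial^2}{\partial z^2}(1/r) = 3(z - \zeta)^2/r^5 - 1/r^3.$$

If these three partial derivatives are added, the right-hand sides cancel:

$$\frac{\partial^2}{\partial x^2}(1/r) + \frac{\partial^2}{\partial y^2}(1/r) + \frac{\partial^2}{\partial z^2}(1/r) = 0.$$

The function $u = 1/r$ satisfies the differential equation $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 + \partial^2 u/\partial z^2 = 0$, which was first
given by Laplace in 1782 and named after him.
This linear homogeneous partial differential equation of the second order is written for short as

$$\Delta u = 0.$$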

Since the differential equation is homogeneous and linear, it remains true if it is multiplied by a
constant $\mu$. Since $\Delta(1/r) = 0$, it follows that $\Delta(\mu/r) = 0$. The point potential $U = \mu/r$ is therefore
a solution of Laplace’s equation. Since the equation $\Delta(\mu_s/r_s) = 0$ is satisfied for each term in the
sum $\sum_{s=1}^{n} \mu_s/r_s$, the differential equation is also
satisfied for the sum. Finally, by differentiation
under the integral sign one obtains

$$\Delta U = \iiint_T \Delta(1/r)\,\varrho\,d\tau = 0.$$


All three potentials considered up to now are therefore solutions of Laplace’s equation. This 
gives a new and interesting approach to potential theory: one takes Laplace’s equation as starting 
point and calls the solutions potentials. 
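
That the point potential satisfies Laplace’s equation away from the mass point can also be checked numerically. The sketch below approximates the Laplace operator by second difference quotients; the step width `h`, the function names and the test point are assumptions, and the result is zero only up to the discretization and rounding error.

```python
import math

def laplacian(f, x, y, z, h=1e-3):
    """Second-difference approximation of f_xx + f_yy + f_zz at (x, y, z)."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6 * f(x, y, z)) / h**2

def point_potential(x, y, z):
    """U = 1/r for a unit mass at the origin."""
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def squared_distance(x, y, z):
    """Comparison function that is not a potential: its Laplacian is 6."""
    return x * x + y * y + z * z

print(laplacian(point_potential, 1.0, 2.0, 2.0))
print(laplacian(squared_distance, 1.0, 2.0, 2.0))
```

The first value is close to zero, the second close to 6, so that only the first function is a potential in the sense just introduced.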

The case excluded so far, when $P$ lies inside the attracting mass, leads to a non-homogeneous
differential equation of the form $\Delta u = -4\pi\varrho$, where $\varrho$ is the density of the mass; this was discovered
by Poisson in 1813. Moreover, the Laplace operator occurs in further important partial differential
equations of theoretical physics. Examples are:
1. the Helmholtz oscillation equation $\Delta u + k^2u = 0$;
2. the heat conduction equation $\Delta u = (1/a^2)\,\partial u/\partial t$, which is also applied to diffusion problems;
3. the wave equation $\Delta u = (1/a^2)\,\partial^2 u/\partial t^2$ for electromagnetic and water waves, sound transmission
and oscillations of strings;
4. the telegraph equation $\Delta u = a\,\partial^2 u/\partial t^2 + b\,\partial u/\partial t + cu$ for the transmission of electromagnetic waves
in cables.


The general potential function. Any function $U(x, y, z)$ that is twice continuously differentiable
with respect to all three variables and satisfies the equation $\Delta U = 0$ in a certain region $T$ of space
is called a potential function or harmonic function in this region.


Potential theory is the theory of solutions of the potential equation $\Delta U = 0$.


Rather than collecting all solutions of this equation, it is more interesting to look for common 
properties of all potential functions or to find additional conditions that they satisfy. 


Properties of potential functions. Let $T$ be an open region of three-dimensional space bounded
by a smooth surface $S$, and let the volume element and surface element of it be denoted by $d\tau$ and
$d\sigma$ (Fig.). At every point of $S$, one marks off the direction perpendicular to $S$, that of the outward
normal $n$. Now if $V$ is any twice continuously differentiable function defined in $T$ and $S$, and $\partial V/\partial n$
is its partial derivative in the normal direction for all points of $S$, then, by Gauss’s integral theorem:

$$\iint_S \frac{\partial V}{\partial n}\,d\sigma = \iiint_T \Delta V\,d\tau.$$

If $V$ is a potential function $U$, then $\Delta U = 0$ everywhere in $T$, and

$$\iint_S \frac{\partial U}{\partial n}\,d\sigma = 0.$$

37-5 Surface element and normal


This statement characterizes potential functions; if it holds on the surface $S'$ of any region $T'$
lying in $T$, then $U$ is a potential function. According to another integral theorem, Green’s theorem,
for any two twice continuously differentiable functions $V$ and $W$, defined in $T$ and $S$,

$$\iint_S \left(W\,\frac{\partial V}{\partial n} - V\,\frac{\partial W}{\partial n}\right) d\sigma = \iiint_T (W\,\Delta V - V\,\Delta W)\,d\tau.$$

If one chooses $W = 1/r$, where $r$ is the distance of $P$ from a fixed point $P_0$, then $\Delta W = \Delta(1/r) = 0$
except at $P = P_0$. This point is excluded in the first instance from the region of integration $T$, since
$W$ has a singularity there. If one wishes to admit $P_0$ in $T$, one has to carry out a limiting process,
in which the term $\iiint_T V\,\Delta(1/r)\,d\tau$ is to be replaced by $-4\pi V(P_0)$ for $P_0 \in T$. If a potential function $U$ is chosen for $V$,
one obtains:

$$U(P_0) = \frac{1}{4\pi} \iint_S \left(\frac{1}{r}\,\frac{\partial U}{\partial n} - U\,\frac{\partial (1/r)}{\partial n}\right) d\sigma.$$

$U$ is therefore completely determined at every point $P_0$ of $T$ as long as the values of the function $U$
and its normal derivative $\partial U/\partial n$ are known on the boundary $S$.
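If one takes for $S$ a sphere $C$ with centre at $P_0$ and radius $R$, then

$$\frac{\partial (1/r)}{\partial n} = \frac{\partial (1/r)}{\partial r} = -\frac{1}{R^2} \quad \text{on } C.$$

Now $r = R = \text{const}$ on $C$, and by taking account of $\iint_C (\partial U/\partial n)\,d\sigma = 0$ one obtains:

$$U(P_0) = \frac{1}{4\pi R^2} \iint_C U\,d\sigma.$$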


The value of the function at the centre of the sphere is always equal to the average of the values of 
the function at points of the surface of the sphere. A potential function cannot therefore have a 
relative maximum or minimum at an interior point of T. 
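
The mean value property can be illustrated numerically. The sketch below averages the point potential of a mass lying outside a sphere over the surface of that sphere by a simple midpoint rule in spherical coordinates and compares the result with the value at the centre; the function name `sphere_average`, the grid size `n` and the chosen positions are assumptions.

```python
import math

def sphere_average(u, centre, radius, n=200):
    """Approximate average of u over a sphere (midpoint rule in theta, phi)."""
    cx, cy, cz = centre
    total = weight = 0.0
    for i in range(n):
        theta = math.pi * (i + 0.5) / n          # polar angle in (0, pi)
        w = math.sin(theta)                      # surface element ~ sin(theta)
        for j in range(2 * n):
            phi = math.pi * (j + 0.5) / n        # azimuth in (0, 2 pi)
            x = cx + radius * math.sin(theta) * math.cos(phi)
            y = cy + radius * math.sin(theta) * math.sin(phi)
            z = cz + radius * math.cos(theta)
            total += w * u(x, y, z)
            weight += w
    return total / weight

def potential(x, y, z):
    """Point potential 1/r of a unit mass at (3, 0, 0), outside the unit sphere."""
    return 1.0 / math.sqrt((x - 3.0) ** 2 + y * y + z * z)

average = sphere_average(potential, (0.0, 0.0, 0.0), 1.0)
print(average, potential(0.0, 0.0, 0.0))
```

Both printed numbers are close to 1/3, the value of the potential at the centre of the sphere.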


Boundary value problems. Green’s formula leads to the following question: under what assumptions
can one determine the potential function $U$ inside a region $T$ from given values of $U$ and
$\partial U/\partial n$ on the boundary $S$ of $T$? Problems of this kind are called boundary value problems. They
occur in many branches of physics, for example in electrostatics, hydrodynamics and the theory of
heat conduction.

Now $U$ and $\partial U/\partial n$ cannot both be chosen arbitrarily on $S$. If the boundary values of $U$ are given,
then the function is uniquely determined; the difference $U_1 - U_2$ of two potential functions with the same
boundary values is itself a potential function, is zero over the whole boundary and, since it can have
no extremum in the interior, is therefore zero in the interior, by Gauss’s mean
value property.

The problem of determining the potential function in the interior from given boundary values 
of U is called the first boundary value problem of potential theory or the Dirichlet problem, after the 
mathematician who first worked on it. 

The second boundary value problem, or Neumann problem, consists of finding a potential function
that has a given normal derivative $\partial U/\partial n$ at all boundary points. Naturally the boundary values must
be prescribed in such a way that the condition $\iint_S (\partial U/\partial n)\,d\sigma = 0$ is satisfied.

The third boundary value problem consists of finding solutions of the potential equation for which
a linear combination $\partial U/\partial n + hU$, where $h$ is a positive constant, takes prescribed values at the
boundary points.


Simple solutions of the potential equation. Potential functions $U$ in three-dimensional space are
functions of three independent variables and have the form $U = U(x, y, z)$ in Cartesian coordinates,
$U = U(\varrho, \varphi, z)$ in cylindrical coordinates and $U = U(r, \vartheta, \varphi)$ in spherical coordinates. One is often
interested in solutions for which $U$ can be written as a product of three functions of one variable:
$U(x, y, z) = X(x)\,Y(y)\,Z(z)$ or $U(\varrho, \varphi, z) = P(\varrho)\,\Phi(\varphi)\,Z(z)$ or $U(r, \vartheta, \varphi) = R(r)\,\Theta(\vartheta)\,\Phi(\varphi)$. In this
case one obtains from the partial differential equation $\Delta U = 0$ three ordinary differential equations,
which can usually be solved directly. This procedure is known as separation of the variables. For
example, if $U(x, y, z) = X(x)\,Y(y)\,Z(z)$, then the equation $\Delta U = \partial^2 U/\partial x^2 + \partial^2 U/\partial y^2 + \partial^2 U/\partial z^2 = 0$
becomes, after division by $XYZ$,

$$\frac{1}{X}\frac{d^2X}{dx^2} + \frac{1}{Y}\frac{d^2Y}{dy^2} + \frac{1}{Z}\frac{d^2Z}{dz^2} = 0.$$

Since each term depends on one variable only, each must be constant; this
gives the three differential equations

$$\frac{1}{X}\frac{d^2X}{dx^2} = k^2, \qquad \frac{1}{Y}\frac{d^2Y}{dy^2} = l^2, \qquad \frac{1}{Z}\frac{d^2Z}{dz^2} = -(k^2 + l^2),$$

and so the solution is $U_{klm}(x, y, z) = e^{kx}e^{ly}e^{mz}$; $m^2 = -(k^2 + l^2)$; $k$, $l$, $m$ complex.
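
The product solution can be checked numerically for particular constants. The sketch below builds $e^{kx}e^{ly}e^{mz}$ with $m^2 = -(k^2 + l^2)$ and evaluates a second-difference approximation of the Laplace operator; the values $k = 1$, $l = 2$, the step width and the function names are assumptions for illustration.

```python
import cmath

def separated_solution(k, l):
    """Return U(x, y, z) = exp(kx) exp(ly) exp(mz) with m^2 = -(k^2 + l^2)."""
    m = cmath.sqrt(-(k * k + l * l))

    def u(x, y, z):
        return cmath.exp(k * x) * cmath.exp(l * y) * cmath.exp(m * z)

    return u, m

def laplacian(f, x, y, z, h=1e-3):
    """Second-difference approximation of f_xx + f_yy + f_zz at (x, y, z)."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6 * f(x, y, z)) / h**2

u, m = separated_solution(1.0, 2.0)
print(m)                                   # purely imaginary for real k, l
print(abs(laplacian(u, 0.1, 0.2, 0.3)))    # small: only discretization error
```

For real $k$ and $l$ the constant $m$ is purely imaginary, so that the solution grows exponentially in $x$ and $y$ and oscillates in $z$.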


Transformations. Certain mappings of three-dimensional space leave the property of being a
potential function unaltered. These are inversions with respect to spheres and are called Thomson
transformations. For example, after inversion with respect to the unit sphere $|r| = 1$, $U(r, \vartheta, \varphi)$
becomes $(1/r)\,U(1/r, \vartheta, \varphi)$, which is also a potential
function. The Newtonian potential $U = 1/r$ gives the
constant potential $U = 1$, and conversely.


Potentials in the plane. Two-dimensional potentials
are solutions of the differential equation $\Delta U = \partial^2 U/\partial x^2 + \partial^2 U/\partial y^2 = 0$.
They often arise in physical problems when
the solution is independent of the third coordinate; for
example, the force of attraction of a very long uniform
rod in the direction of the $z$-axis is the same at
two points $P_1(x, y, z_1)$ and $P_2(x, y, z_2)$, as long as the
coordinates $z_1$, $z_2$ are small in comparison with the
length $2L$. For an approximate solution one may assume
that $U$ does not depend on $z$, and start with
$U = U(x, y)$ (Fig.).

37-6 The potential of a rod of length $2L$



The solutions of the two-dimensional potential equation are closely connected with complex
analysis: a function $w = u + iv$ of the complex variable $z = x + iy$ is analytic if and only if its
real part $u(x, y)$ and its imaginary part $v(x, y)$ satisfy the Cauchy-Riemann differential equations
$\partial u/\partial x = \partial v/\partial y$, $\partial u/\partial y = -\partial v/\partial x$. Consequently $\partial^2 u/\partial x^2 = \partial^2 v/\partial x\,\partial y$,
$\partial^2 u/\partial y^2 = -\partial^2 v/\partial y\,\partial x$, and so $\partial^2 u/\partial x^2 + \partial^2 u/\partial y^2 = 0$. The
same holds for $v$.


The real and imaginary parts of any analytic function are potential functions. They are said to 
be conjugate. 


Example: From $w = \ln z = \ln(re^{i\varphi}) = \ln r + i\varphi$ one obtains two conjugate potentials in the
plane. They are $u(r, \varphi) = \ln r$ and $v(r, \varphi) = \varphi$, or in Cartesian coordinates $U(x, y) = \ln\sqrt{x^2 + y^2}$,
$V(x, y) = \arctan(y/x)$.

The potential U = In r, the logarithmic potential, plays the same role in the plane as the Newtonian 
potential U = 1/r does in space. It is the potential of the field of force of an attracting point, only 
the force of attraction is proportional to 1/r instead of 1/r?. 


38. Calculus of variations 


Variation problem without side conditions, Euler’s differential equation
Necessary and sufficient conditions for the occurrence of an extremum
Variation problems with side conditions
Minimal principles of theoretical physics
Direct methods


The methods of the calculus of variations are applied in the solution of many problems of geometry, 
theoretical physics and technology. Questions that led to problems in the calculus of variations had 
already emerged in antiquity, for example, the problem of finding among all plane areas with equal 
perimeters that with the greatest area. ZENODOROS (about 180 B. C.) recognized the isoperimetric 
problem. The same problem confronted the peasant Pakhom in Tolstoy’s story ‘How much earth 
does a man need?’, when the Bashkiri Elder called to him: ‘As much land as you can walk around 
in a day is yours.’ 


Isoperimetric problem: Of all plane figures with equal perimeters (isoperimetric figures) the circle
has the greatest area (Fig.), and in space, of all bodies with equal surface areas the sphere has the
maximum volume.


38-1 Of all figures with equal perimeter the circle includes the greatest area


38-2 The Newtonian problem: Solids of revolution of the same effective cross-section generated by curves of equal length


Newton stumbles on a difficult problem. With the increasing knowledge of natural science and 
mathematics, mathematicians and physicists of the 17th century came up against related but deeper 
questions. NEWTON, in his principal work ‘Philosophiae Naturalis Principia Mathematica’
(commonly called the Principia) (1687), calculated the resistances of bodies such as cylinders or 
spheres when falling in a resisting medium. He tried to find that solid of revolution which 
presents the least resistance in falling (with the same speed in the direction of the axis). Under 
otherwise unchanged conditions, for the same length and the same effective cross section the bounding 
curve of the section through the axis is required. Although one can say with certainty that the body 
(a) similar to the hyperboloid of revolution is less suitable than the body (b) consisting of a hemisphere
and a cone joined together, Newton and his contemporaries were not yet able to give as the solution 
the streamline-shaped body (c) (Fig.). 




The brachystochrone problem. Still more fruitful and celebrated was the problem of the brachystochrone,
which was raised publicly by Johann BERNOULLI in 1696: If two points $P_1$ and $P_2$ are
given, at different heights but not lying one above the other, it is required to find among all
possible curves connecting them, that one along which a material point slides from $P_1$ to $P_2$
under the influence of gravity (neglecting friction) in the shortest possible time. This problem occupied
at the time the leading mathematicians in the whole of Europe: NEWTON, LEIBNIZ, Jakob
BERNOULLI, L’HOSPITAL, HUDDE, FATIO and others. From then on, the calculus of variations
developed as a special mathematical discipline.

In a suitably chosen coordinate system some such curves $y = f(x)$ joining the points $P_1$ and $P_2$
are drawn as possible curves for the fall (Fig.). Because the distance $s$, the time $t$ and the instantaneous
speed $v$ at every point of this curve are connected by
the relation $v = ds/dt$, because the speed $v$ under the
acceleration due to gravity $g$ has the value $v = \sqrt{2gy}$,
and finally because the element of arc $ds$ is the well-known
function $ds = \sqrt{1 + y'^2}\,dx$ of $y'$ and $x$, one
obtains the time $T$ required for the motion as a definite
integral between the limits $x_1$ and $x_2$; from

$$dt = \frac{ds}{\sqrt{2gy}} = \frac{\sqrt{1 + y'^2}}{\sqrt{2gy}}\,dx$$

it follows that

$$T = \frac{1}{\sqrt{2g}} \int_{x_1}^{x_2} \sqrt{\frac{1 + y'^2}{y}}\,dx.$$

38-3 The brachystochrone problem

This integral is to have the smallest value for the required function $y_0 = y_0(x)$, that is, its value is
larger for all functions $y(x)$ different from $y_0$. By the method sketched in what follows (solution
of the Euler differential equation) one obtains as solution, with $\alpha$ as a parameter and two constants
$C_1$ and $C_2$, the cycloid

$$x_0 = (C_1/2)(\alpha - \sin\alpha) + C_2, \qquad y_0 = (C_1/2)(1 - \cos\alpha).$$
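
The minimal property of the cycloid can be illustrated by comparing it with the straight line between the same two points. The sketch below evaluates the time integral along the cycloid in the parameter $\alpha$ (with $C_1/2 = a$, $C_2 = 0$, $y$ measured downwards and the point starting at rest) and uses the elementary formula for the time of descent on an inclined plane; the value of $g$, the function names and the end point are assumptions.

```python
import math

G = 9.81  # assumed acceleration due to gravity in m/s^2

def cycloid_time(a, alpha_end, n=10000):
    """Midpoint-rule value of T = (1/sqrt(2g)) * integral of sqrt((1+y'^2)/y) dx
    along x = a(alpha - sin alpha), y = a(1 - cos alpha), 0 <= alpha <= alpha_end."""
    h = alpha_end / n
    total = 0.0
    for i in range(n):
        alpha = (i + 0.5) * h
        dx = a * (1.0 - math.cos(alpha))   # dx/dalpha
        dy = a * math.sin(alpha)           # dy/dalpha
        y = a * (1.0 - math.cos(alpha))
        # sqrt((1 + y'^2)/y) dx = sqrt((dx^2 + dy^2)/y) dalpha
        total += math.sqrt((dx * dx + dy * dy) / y) * h
    return total / math.sqrt(2.0 * G)

def line_time(x1, y1):
    """Time of frictionless descent from rest along the straight line to (x1, y1)."""
    length = math.hypot(x1, y1)
    return length * math.sqrt(2.0 / (G * y1))

# End point reached by the cycloid with a = 1 at alpha = pi.
x1, y1 = math.pi, 2.0
print(cycloid_time(1.0, math.pi), line_time(x1, y1))
```

Along the cycloid the integrand is constant in $\alpha$, which gives $T = \alpha\sqrt{a/g}$; the time along the straight line to the same point is longer.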


Variation problem without side conditions, Euler’s differential equation 


The investigation of the brachystochrone has led to the problem of finding a function y(x) for 
which the integral of a second function f(x, y, y’) has a smallest or a largest value; the function 
f(x, y, y’) is determined by the geometrical, technological or physical situation and is called the 
basic function. In the brachystochrone problem $f(x, y, y') = \sqrt{(1 + y'^2)/y}$ is the basic function.
Characterizing the requirement of an extreme value by an exclamation mark, the condition

$$J = \int_{x_1}^{x_2} f(x, y, y')\,dx = \text{extreme value!}$$


is to hold. The basic function depends on the independent variable x and on the required function 
y(x) and its derivative y’(x). The required function yo(x) is called an extremal. 


The basic problem of the calculus of variations is a maximum or minimum problem, but of a more 
difficult kind than in the differential calculus. It is to find a function for which a certain integral 
assumes a greatest or a smallest value. 


Whereas the BERNOULLI brothers, NEWTON and others solved the brachystochrone problem by
means of special tricks, EULER, LAGRANGE, WEIERSTRASS, OSTROGRADSKII, CARATHEODORY
and others were able to develop from the 18th to the 20th century a method that always leads to a
solution.

EULER succeeded in reducing the variation problem to differential equations. He began by assuming
that all the admissible functions $y(x)$ must not differ from the required extremal $y_0(x)$ at
the points $P_1$ and $P_2$. He thought of $y(x)$ as being always a combination of the extremal $y_0(x)$ and
a variation function $\varepsilon\eta(x)$. Then $y(x) = y_0(x) + \varepsilon\eta(x)$ becomes the extremal for the value $\varepsilon = 0$
of the free parameter. At the same time $\eta(x)$ must have the value zero at the points $P_1$ and $P_2$,
that is, for the values $x_1$ and $x_2$, so that the boundary conditions $\eta(x_1) = \eta(x_2) = 0$ must hold.
For $y = y(x)$ the integral $J$ is a function of $\varepsilon$, where

$$J(\varepsilon) = \int_{x_1}^{x_2} f(x, y_0 + \varepsilon\eta, y_0' + \varepsilon\eta')\,dx.$$


This function $J(\varepsilon)$ has an extremum for $\varepsilon = 0$ and consequently, by the rules of the differential
calculus, its derivative must vanish for $\varepsilon = 0$. Since the limits of integration are fixed, one may
differentiate under the integral sign:

$$J'(\varepsilon) = \int_{x_1}^{x_2} [f_y(x, y_0 + \varepsilon\eta, y_0' + \varepsilon\eta')\,\eta + f_{y'}(x, y_0 + \varepsilon\eta, y_0' + \varepsilon\eta')\,\eta']\,dx.$$

For every function $\eta(x)$ that is continuously differentiable between $x_1$ and $x_2$ and vanishes at $x_1$
and $x_2$,

$$J'(0) = \int_{x_1}^{x_2} [f_y(x, y_0, y_0')\,\eta + f_{y'}(x, y_0, y_0')\,\eta']\,dx = 0.$$

Integrating the second term by parts one obtains

$$\int_{x_1}^{x_2} \left[f_y - \frac{d}{dx} f_{y'}\right] \eta\,dx + [f_{y'}\,\eta]_{x_1}^{x_2} = 0.$$

The right-hand term vanishes because of the boundary conditions. But the relation

$$\int_{x_1}^{x_2} \left[f_y - \frac{d}{dx} f_{y'}\right] \eta\,dx = 0$$

is satisfied for all functions $\eta(x)$ only if the expression in square brackets vanishes identically, that
is, $f_y - \frac{d}{dx} f_{y'} = 0$, or $\frac{\partial f}{\partial y} - \frac{d}{dx}\frac{\partial f}{\partial y'} = 0$. If one carries out the total differentiation with respect
to $x$ one obtains the celebrated Euler differential equation of the calculus of variations:

$$f_y - f_{y'x} - f_{y'y}\,y' - f_{y'y'}\,y'' = 0.$$

This represents an ordinary differential equation of the second order.


The requirement that a given integral shall assume an extremum for a function is replaced by a 
differential equation for this function. 
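
The idea of the derivation can be illustrated with the arc length as basic function, $f = \sqrt{1 + y'^2}$, whose extremals are straight lines. The sketch below takes $y_0(x) = x$ on the interval from 0 to 1 and, as an assumed variation function, $\eta(x) = \sin \pi x$, which vanishes at both end points; it evaluates $J(\varepsilon)$ by the midpoint rule and approximates $J'(0)$ by a difference quotient. The function names, grid size and step width are assumptions.

```python
import math

def arc_length_functional(eps, n=2000):
    """J(eps) = integral over [0, 1] of sqrt(1 + y'^2) dx for
    y = y0 + eps * eta with y0(x) = x and eta(x) = sin(pi x)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        slope = 1.0 + eps * math.pi * math.cos(math.pi * x)  # y0' + eps * eta'
        total += math.sqrt(1.0 + slope * slope) * h
    return total

def first_variation_rate(step=1e-4):
    """Central difference approximation of J'(0)."""
    return (arc_length_functional(step) - arc_length_functional(-step)) / (2.0 * step)

print(arc_length_functional(0.0))   # length of the straight line, sqrt(2)
print(first_variation_rate())       # close to zero for the extremal
print(arc_length_functional(0.1))   # every varied curve is longer
```

The derivative $J'(0)$ vanishes for the straight line, and the value of the integral increases as soon as $\varepsilon$ differs from zero.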


The first variation. The variation function is called the variation $\delta y$ of the function $y(x)$, according
to a notation introduced by LAGRANGE in 1755:

$$\delta y = \varepsilon\eta(x); \qquad y = y_0 + \delta y.$$

Because the integral $J$ has an extremum for the extremal, the difference $\Delta J = J(y) - J(y_0)$ can
never be positive for a maximum, and never negative for a minimum of the integral. For arbitrarily
small values of $\varepsilon$ the variation of the value of the integral can be expressed as the differential of the
function $J(\varepsilon)$ for $\varepsilon = 0$. The product $J'(0)\,\varepsilon$ is called the first variation of $J$ and is denoted by $\delta J$:

$$\delta J = J'(0)\,\varepsilon = [f_{y'}\,\delta y]_{x_1}^{x_2} + \int_{x_1}^{x_2} \left(f_y - \frac{d}{dx} f_{y'}\right) \delta y\,dx.$$

The necessary condition for the existence of an extremum of the integral that was found in the
derivation of the Euler differential equation can thus be expressed by saying that the first variation
must vanish: $\delta J = 0$.


Generalizations. The basic function $f(x, y, y')$, and consequently the integral $J$, can depend on
several (finitely many) functions $y_1(x), y_2(x), \ldots, y_n(x)$ and their derivatives, instead of on one function
$y(x)$. In place of one Euler differential equation there is then a system of $n$ differential equations.
On the other hand, with one function $y(x)$, higher derivatives of $y(x)$ can occur in the basic function,
besides $y'(x)$ also $y''(x), \ldots, y^{(n)}(x)$. The Euler differential equation is then of order $2n$. Finally, one
can enquire into the extremal properties of surfaces in space. A soap film stretched across a closed,
non-plane loop of wire, for example, always assumes the smallest possible surface area because
of the surface tension (see Plate 55). Such surfaces with minimum area are called minimal surfaces.
The integral $J$ is then a double integral, $x$ and $y$ are independent variables in the basic function,
and the function $z = z(x, y)$ determining the area is required:

$$\iint f(x, y, z, z_x, z_y)\,dx\,dy = \text{extreme value!}$$

As a condition for the occurrence of an extremum one obtains the Ostrogradskii differential equation,
a partial differential equation of the second order:

$$f_z - \frac{\partial}{\partial x} f_{z_x} - \frac{\partial}{\partial y} f_{z_y} = 0.$$



Necessary and sufficient conditions for the occurrence of an extremum 


Under the assumption that an extremum exists, the first derivative J’(0) must vanish and the 
Euler differential equation must hold. Such a condition, which must be satisfied for every solution, 
is called a necessary condition. Whether a solution exists cannot be decided by a necessary condition 
alone. WEIERSTRASS was able to find sufficient conditions for the existence of an extremum by means 
of arguments that were new in principle. It is not possible here, however, to go into this, nor into 
the proof of an existence theorem, which shows that a solution exists at all. In many cases one is 
content to investigate subsequently whether the solution of the Euler differential equation really 
has extremal properties under the given geometrical, technological or physical conditions; for 
example, it is intuitively obvious that between two points of the plane there is no shortest, but 
non-rectilinear, connecting curve, because for every connecting line one can imagine a shorter 
one, provided that the straight line is excluded by hypothesis (Fig.). 


38-4 Between two points $P_1$ and $P_2$ there is no shortest non-rectilinear connecting curve


38-5 The area bounded by the line segment $P_1P_2$ and a curve of prescribed length $l$ between these points is greatest when the curve is a circular arc


Variation problems with side conditions 


Frequently variation problems occur in which not only the integral $\int_{x_1}^{x_2} f(x, y, y')\,dx$ must be made
an extremum, but additional conditions must also be satisfied. Such conditions are called side
conditions.


Isoperimetric problem. If one wishes the area $\int_{x_1}^{x_2} y\,dx$, $x_2 > x_1$, under the required curve $y(x)$
to be an extremum, then, for example, the arc length $\int_{x_1}^{x_2} \sqrt{1 + y'^2}\,dx = l$ may be prescribed, where
$l$ cannot be less than $x_2 - x_1$, $l > x_2 - x_1$. The solution of this problem of enclosing the greatest
possible area by a given perimeter, which consists here of the line segment $P_1P_2$ and of $l$ (Fig.),
is a segment of a circle.
In general, in the isoperimetric problem, together with the requirement $\int_{x_1}^{x_2} f(x, y, y')\,dx = \text{extreme value!}$
there occurs a side condition in integral form, requiring that the integral of a function $g(x, y, y')$
depending on $y$ and $y'$ shall have a fixed, prescribed value $a$: $\int_{x_1}^{x_2} g(x, y, y')\,dx = a$.

This general isoperimetric problem can be reduced to a variation problem without side conditions 
by the method of undetermined multipliers developed by LAGRANGE. From the given functions 
$f(x, y, y')$ and $g(x, y, y')$ one forms with a constant multiplier $\lambda$ the extended basic function
$h(x, y, y') = f(x, y, y') + \lambda g(x, y, y')$ and solves the Euler differential equation for this function:

$h_y - \frac{d}{dx} h_{y'} = 0.$

In the example of the prescribed arc length $l$ one puts $h(x, y, y') = y + \lambda\sqrt{1 + y'^2}$ and obtains
by integrating the Euler differential equation $(x - \xi)^2 + (y - \eta)^2 = \lambda^2$; thus, the extremals are
indeed circular arcs of radius $|\lambda|$.
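
The integration step can be checked numerically. The following is a minimal sketch, assuming the lower arc of the circle and hypothetical values for the centre and the multiplier (none of these numbers come from the text). For $h = y + \lambda\sqrt{1 + y'^2}$ the Euler equation reads $1 - \frac{d}{dx}\bigl(\lambda y'/\sqrt{1 + y'^2}\bigr) = 0$, so one integration gives $\lambda y'/\sqrt{1 + y'^2} = x - \xi$ with an integration constant $\xi$; the code confirms that a circular arc of radius $\lambda$ satisfies this relation.

```python
import math

def first_integral_residual(x, xi, lam):
    """Residual of lam*y'/sqrt(1 + y'^2) - (x - xi) on the lower arc of the circle."""
    # Lower arc of (x - xi)^2 + (y - eta)^2 = lam^2:
    #   y = eta - sqrt(lam^2 - (x - xi)^2),  so  y' = (x - xi) / sqrt(lam^2 - (x - xi)^2)
    # (eta only shifts the curve vertically and drops out of y').
    root = math.sqrt(lam**2 - (x - xi) ** 2)
    dy = (x - xi) / root
    return lam * dy / math.sqrt(1.0 + dy**2) - (x - xi)

xi, lam = 0.3, 1.5  # hypothetical centre abscissa and multiplier (radius)
residuals = [first_integral_residual(xi + 0.1 * k, xi, lam) for k in range(-10, 11)]
max_residual = max(abs(r) for r in residuals)
print(max_residual)  # expected to be at rounding-error level
```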


Side conditions in equation form. In the isoperimetric problem, in spite of the formally complicated 
exterior form, there is basically only one prescribed number, for example a length. A side condition 
in the form of an equation, however, which may well appear simpler, leaves open many more pos- 
sibilities for the required extremals, for example, the arbitrary path of the given curve on a surface. 
How many possible ways are there of arriving at one point from another on the surface of a sphere 
alone! 

The geodesic lines on a surface are the shortest curves connecting two points of the surface. If 
one thinks of the coordinates of the surface point $x(t)$, $y(t)$, $z(t)$ as depending on a parameter $t$
($\dot{x} = dx/dt$, $\dot{y} = dy/dt$, $\dot{z} = dz/dt$), then the arc length, that is, the integral
$\int_{t_1}^{t_2} \sqrt{\dot{x}^2 + \dot{y}^2 + \dot{z}^2}\,dt$, has




to be made a minimum. At the same time, however, it must be guaranteed that the curves really 
lie on the surface; the coordinates x, y, z must satisfy the equation g(x, y, z)=g[x(t), y(t), z(t)] =0 
of the surface as a side condition. 
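
As an illustration, here is a small sketch for the unit sphere, with the side condition $g(x, y, z) = x^2 + y^2 + z^2 - 1 = 0$. The two end points (latitude 60 degrees, longitudes 0 and 90 degrees) and the helper names are hypothetical choices. Both comparison curves lie on the sphere, and the arc length integral is approximated by a sum of short chords; the great-circle arc turns out shorter than the path along the circle of latitude.

```python
import math

def on_sphere(lat, lon):
    """Point of the unit sphere with the given latitude and longitude (radians)."""
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))

def polyline_length(points):
    """Approximate the arc length integral by summing short chords."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def great_circle_path(p, q, steps):
    """Points on the great circle through p and q (spherical interpolation)."""
    omega = math.acos(sum(a * b for a, b in zip(p, q)))
    path = []
    for i in range(steps + 1):
        u = i / steps
        wp = math.sin((1.0 - u) * omega) / math.sin(omega)
        wq = math.sin(u * omega) / math.sin(omega)
        path.append(tuple(wp * a + wq * b for a, b in zip(p, q)))
    return path

def side_condition_error(path):
    """Largest violation of g(x, y, z) = x^2 + y^2 + z^2 - 1 = 0 along the path."""
    return max(abs(x * x + y * y + z * z - 1.0) for x, y, z in path)

steps = 2000
lat, lon_end = math.radians(60.0), math.radians(90.0)
latitude_path = [on_sphere(lat, lon_end * i / steps) for i in range(steps + 1)]
geodesic_path = great_circle_path(latitude_path[0], latitude_path[-1], steps)

length_latitude = polyline_length(latitude_path)
length_geodesic = polyline_length(geodesic_path)
print(length_latitude, length_geodesic)
```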


Minimal principles of theoretical physics 


It was noticed relatively early by mathematicians and physicists that a light ray travels from one 
point to another in space along a path that requires a shorter time to traverse than any neighbouring 
path (see Chapter 19.4. — Extreme values of functions). If one makes this principle of Fermat the 
basis of geometrical optics, the laws of refraction and reflection, for example, can be derived 
deductively from it. 
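
The derivation of the law of refraction can be sketched numerically. The geometry and the two speeds of light below are hypothetical values, and the simple ternary search is only one possible way of finding the minimum. A ray goes from $(0, a)$ in the upper medium to $(d, -b)$ in the lower medium and crosses the interface $y = 0$ at $(x, 0)$; minimizing the travel time over $x$ should reproduce $\sin\theta_1 / v_1 = \sin\theta_2 / v_2$.

```python
import math

def travel_time(x, a, b, d, v1, v2):
    """Time from (0, a) to (d, -b) when the ray crosses the interface y = 0 at (x, 0)."""
    return math.hypot(x, a) / v1 + math.hypot(d - x, b) / v2

def minimise(func, lo, hi, iterations=200):
    """Ternary search; adequate here because the travel time is convex in x."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

a, b, d = 1.0, 1.0, 2.0   # hypothetical geometry
v1, v2 = 1.0, 0.75        # hypothetical speeds of light in the two media

x_min = minimise(lambda x: travel_time(x, a, b, d, v1, v2), 0.0, d)
sin_incidence = x_min / math.hypot(x_min, a)
sin_refraction = (d - x_min) / math.hypot(d - x_min, b)
snell_defect = abs(sin_incidence / v1 - sin_refraction / v2)
print(x_min, snell_defect)
```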

Such minimal principles were interpreted teleologically in the 18th century; indeed, MAUPERTUIS 
even tried to construct a proof of the existence of God from his principle of least action! VOLTAIRE 
with his ironic story of Dr. Akakia (1752) exposed Maupertuis to the ridicule of Europe and so 
also demolished the idea of teleology. It turned out that the light path could also on occasion lead
to a maximum time and, what was more important, that the variational principles of mechanics
could be reduced to differential equations, which no longer sounded or looked teleological. This
procedure, which was followed by LAGRANGE, GAUSS, HAMILTON and JACOBI will now be indicated 
using the modern nomenclature of mathematical physics. 

The motion of a point-mass, for example, in the gravitational field of the earth, or of a charged 
particle in electric or magnetic fields, is determined not only by its instantaneous velocity, which 
depends on the external forces, but also on potentials. It thus depends not only on the kinetic energy T, 
but also on the potential energy U. According to LAGRANGE one considers the Lagrangian function 
$L = T - U$ as a function of the time $t$, and of the space coordinates $x, y, z$ and their derivatives
$\dot{x}, \dot{y}, \dot{z}$. If one has not a single point-mass, but a system of $N$ point-masses, then $L$ is a function of
the time $t$ and of $3N$ coordinates and $3N$ velocity components. For various physical problems
generalized coordinates $q_k$, $k = 1, 2, ..., 3N$, are introduced so that $L = L(t, q_k, \dot{q}_k)$, $k = 1, 2, ..., 3N$,
is a function of these coordinates. The motion of the point-masses follows from the Lagrangian
equations of motion of the second kind

$\frac{\partial L}{\partial q_k} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_k} = 0, \quad k = 1, ..., 3N.$

These equations can be derived from Hamilton’s principle. Among all imaginable conditions under
which the system could change during the time interval $t_2 - t_1$ from a state 1 which it occupied
at time $t_1$ to a given state 2, the motion that actually occurs is that for which the integral
$\int_{t_1}^{t_2} L(t, q_k, \dot{q}_k)\,dt$ is the smallest. This most important integral principle of classical mechanics
leads in this way to a variation problem, and the Euler differential equations belonging to it are the
Lagrangian equations of motion of the second kind.
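
A minimal sketch of the principle for one simple case: a point-mass moving vertically in a uniform gravitational field, with $L = T - U = \frac{1}{2}m\dot{z}^2 - mgz$. The mass, the time interval, the discretisation and the family of comparison paths are all hypothetical choices. The integral of $L$ is evaluated for the actual motion (which satisfies $\ddot{z} = -g$) and for several other paths between the same two states; for this particular Lagrangian the actual motion gives the smaller value in every case.

```python
import math

def action(path, dt, mass=1.0, g=9.81):
    """Discretised integral of L = T - U for a vertical motion z(t)."""
    total = 0.0
    for z0, z1 in zip(path, path[1:]):
        velocity = (z1 - z0) / dt
        kinetic = 0.5 * mass * velocity**2
        potential = mass * g * 0.5 * (z0 + z1)
        total += (kinetic - potential) * dt
    return total

n, t_end, g = 200, 1.0, 9.81
dt = t_end / n
times = [i * dt for i in range(n + 1)]

# Actual motion: thrown upwards from z = 0 at t = 0, back at z = 0 at t = t_end.
true_path = [0.5 * g * t * (t_end - t) for t in times]

def perturbed(eps, k):
    """Comparison path with the same initial and final state."""
    return [z + eps * math.sin(k * math.pi * t / t_end) for z, t in zip(true_path, times)]

s_true = action(true_path, dt)
s_others = [action(perturbed(eps, k), dt) for eps in (-0.2, 0.05, 0.3) for k in (1, 2, 3)]
print(s_true, min(s_others))
```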


Direct methods 


Elegant as the methods of the calculus of variations sketched above appear to be, considerable 
difficulties can stand in the way of their practical application. In particular, for many problems the 
exact solution of the Euler differential equation is difficult or not possible at all. For this reason 
approximation methods have been developed which, because they circumvent the Euler differential 
equation, are called direct methods of the calculus of variations. 


The method of Ritz (1909). For $J = \int_{x_1}^{x_2} f(x, y, y')\,dx$ = extreme value! one assumes for the required
function $y(x)$ the approximation

$y = c_1\varphi_1(x) + \cdots + c_n\varphi_n(x),$

where the $\varphi_i(x)$ must satisfy the boundary conditions. The problem consists in determining the
constant coefficients $c_i$. One substitutes for $y$ in $J$ and obtains $J(c_1, ..., c_n)$ = extreme value! The
$c_i$ are given by the necessary conditions $\partial J/\partial c_i = 0$, $i = 1, ..., n$, for the occurrence of an extremum.


Example: if $\int_0^1 (y'^2 - y^2 - 2xy)\,dx$ = extreme value! is to hold, where the solution satisfies
the boundary conditions $y(0) = y(1) = 0$, one makes the assumption, for example, that
$\varphi_1(x) = x(1 - x)$ and $\varphi_2(x) = x^2(1 - x)$. One then obtains the approximate solution
$y = -7x^3/41 - 8x^2/369 + 71x/369$. As a check one finds in this example, from the Euler differential
equation, the solution $y = \sin x/\sin 1 - x$. The differences between the exact and the approximate
solution are only of the order of magnitude $10^{-4}$.
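
The example can be reproduced with exact rational arithmetic. The sketch below assumes the two trial functions named above; the helper names are hypothetical. Substituting $y = c_1\varphi_1 + c_2\varphi_2$ and setting $\partial J/\partial c_i = 0$ gives the linear system $\sum_j c_j \int_0^1 (\varphi_i'\varphi_j' - \varphi_i\varphi_j)\,dx = \int_0^1 x\varphi_i\,dx$, which is solved here by Cramer's rule.

```python
from fractions import Fraction
import math

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    return [k * a for k, a in enumerate(p)][1:] or [Fraction(0)]

def integral_0_1(p):
    """Integral of the polynomial over [0, 1]."""
    return sum(a / (k + 1) for k, a in enumerate(p))

phi = [[Fraction(0), Fraction(1), Fraction(-1)],                # x(1 - x)
       [Fraction(0), Fraction(0), Fraction(1), Fraction(-1)]]   # x^2(1 - x)
x_poly = [Fraction(0), Fraction(1)]

def a(i, j):
    return (integral_0_1(poly_mul(poly_diff(phi[i]), poly_diff(phi[j])))
            - integral_0_1(poly_mul(phi[i], phi[j])))

def b(i):
    return integral_0_1(poly_mul(x_poly, phi[i]))

# Solve the 2 x 2 system by Cramer's rule.
det = a(0, 0) * a(1, 1) - a(0, 1) * a(1, 0)
c1 = (b(0) * a(1, 1) - a(0, 1) * b(1)) / det
c2 = (a(0, 0) * b(1) - a(1, 0) * b(0)) / det

def ritz(x):
    return float(c1) * x * (1 - x) + float(c2) * x**2 * (1 - x)

def exact(x):
    return math.sin(x) / math.sin(1.0) - x

max_difference = max(abs(ritz(i / 100) - exact(i / 100)) for i in range(101))
print(c1, c2, max_difference)
```

Expanding $c_1 x(1 - x) + c_2 x^2(1 - x)$ with the computed coefficients gives the cubic stated in the example.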


39. Integral equations 


An equation serving to determine a function is called an integral equation if the required function 
occurs in the integrand of an integral. A very simple example is the equation 


(1) $\int_a^s y(t)\,dt = f(s) - f(a)$, $a \le s \le b$.


The function $f(s)$ is given and it is required to find the function $y(t)$. Clearly the solution is $y(t) = \frac{df(t)}{dt}$.


39-1 Plucked string 


Integral equations frequently arise in the mathematical treatment of physical or technological 
problems; to this category there belong problems concerning elastic bending and oscillations (building 
of bridges) and concerning heat propagation processes. As an example consider a stretched string 
of length $l$, described by the interval $0 \le s \le l$. At the point with the coordinate $t$ it is loaded with
a force 1. Let the deflected string be represented by the function $E(s, t)$; in Fig. 39-1a (broken
line) the deflected curve, for example, $h(s) = E(s, 2/3)$ for $t = 2/3$ is given. If the force is of magnitude
$y$, the deflection is $h(s) = E(s, t)\,y$ (continuous line). If two forces $y_1$ and $y_2$ act at the points $t_1$
and $t_2$, then a total deflection $h(s) = E(s, t_1)\,y_1 + E(s, t_2)\,y_2$ arises which is composed of the
individual deflections $E(s, 1/4) \cdot 0.9$ (dotted line) and $E(s, 2/3) \cdot 0.6$ (broken line) shown in Fig. 39-1b.
Similarly, a loading with $n$ forces $y_1, y_2, ..., y_n$ at the points $t_1, t_2, ..., t_n$ leads to the deflection

$h(s) = E(s, t_1)\,y_1 + E(s, t_2)\,y_2 + \cdots + E(s, t_n)\,y_n$

(see Fig. 39-1c). In particular, the deflection at the $k$th point of application $s = t_k$ has the value

(2) $h_k = h(t_k) = E(t_k, t_1)\,y_1 + \cdots + E(t_k, t_n)\,y_n$

or $h_k = \sum_{i=1}^{n} E(t_k, t_i)\,y_i$ $(k = 1, 2, ..., n)$.

For a continuously distributed force acting along the whole string one obtains similarly

(3) $h(s) = \int_0^l E(s, t)\,y(t)\,dt.$


The function y(t) is the force density or force per unit length, so that y dt is the differential force 
acting on the element of length dt; correspondingly the summation sign in (2) goes over into the 
integral of (3). The deflected string forms a ‘smooth’ curve h = h(s) (see Fig. 39-1d). 

Conversely, if the form of the deflected string, that is, the function $h(s)$, is known and the loading
$y(t)$ of the string is to be found, then the relation (3) becomes a linear integral equation of the first
kind for the required function $y(t)$. The function $E(s, t)$ of two variables is called the kernel of the
integral equation. To solve the integral equation one can proceed the opposite way round: the equation
(3) is approximated by the equation (2), that is, the integral is approximated by a finite sum with
the desired accuracy, in which one imagines the force density $y(t)$ replaced by sufficiently many
individual forces $y_i$. The calculation of the $y_i$ in (2) is, however, nothing but the solution of a system
of linear equations, with $n$ equations in $n$ unknowns $y_i$.
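
The reduction to a linear system can be sketched as follows. The explicit influence function used here, $E(s, t) = s(l - t)/l$ for $s \le t$ and $t(l - s)/l$ for $s \ge t$ (a string of length $l = 1$ with unit tension), is an assumption; the text does not state it. The points of application and the forces are likewise hypothetical. The deflections $h_k$ are first computed from (2), and the forces are then recovered from them by Gaussian elimination.

```python
def influence(s, t, length=1.0):
    """Assumed influence function of a string with unit tension (not given in the text)."""
    return s * (length - t) / length if s <= t else t * (length - s) / length

def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for n equations in n unknowns."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution

points = [0.25, 0.5, 2.0 / 3.0]   # points of application t_i (hypothetical)
forces = [0.9, -0.3, 0.6]         # forces y_i that are to be recovered (hypothetical)

E = [[influence(tk, ti) for ti in points] for tk in points]
h = [sum(E[k][i] * forces[i] for i in range(len(points))) for k in range(len(points))]
recovered = solve_linear_system(E, h)
print(recovered)
```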

In fact, the theory of linear integral equations has much in common with that of systems of linear 
equations; one could regard integral equations as linear equations with infinitely many unknowns. 




FREDHOLM used this connection when he developed (about 1900) the first general theory. However, 
he considered a somewhat different type, the so-called linear integral equations of the second 
kind or Fredholm integral equations. These occur frequently, although mostly indirectly as the result 
of the rearrangement of differential equations; for example, the forced harmonic oscillation of 
a string is described by a boundary value problem for a differential equation of the second order: 


(4) $y'' + \lambda y = f(s)$, $y(0) = y(l) = 0$.

It is required to find the deflection $y = y(s)$ of the string at the point $s$ in a particular phase of the
oscillation; the constant $\lambda$ is given by the frequency of the oscillation, and the function $f(s)$ by the
external force acting with the same frequency and in the same phase. For the rearrangement one
substitutes $f(t) - \lambda y(t)$ for $y''$ from (4) in the Taylor expansion

$y(s) = y(0) + s\,y'(0) + \int_0^s (s - t)\,y''(t)\,dt.$

By writing down this expansion for the special case $s = l$ and using the boundary conditions
$y(0) = y(l) = 0$, $y'(0)$ can be eliminated; one then obtains the equation

(5) $y(s) - \lambda \int_0^l K(s, t)\,y(t)\,dt = h(s),$


a linear integral equation of the second kind for y(s). The kernel K is like the above influence function 
E(s, t); h(s) is calculated from f(s) and, in particular, is identically equal to zero if no external forces 
act (f = 0: free oscillation). Additional boundary conditions no longer occur. 

Integral equations of the second kind have been investigated particularly thoroughly, above all 
in the work of SCHMIDT. Some important properties of this type of equation will now be discussed. 


The Fredholm alternative. For a linear integral equation of the second kind, either there exists
a uniquely determined solution $y(s)$ for every given function $h(s)$ on the right-hand side, or solutions
exist for only certain right-hand sides, but then always infinitely many.


The two cases are distinguished according to whether the homogeneous equation

(6) $\bar{y}(s) - \lambda \int_0^l K(s, t)\,\bar{y}(t)\,dt = 0$

belonging to (5) has only the trivial solution $\bar{y} = 0$, or has non-trivial, so-called null solutions $\bar{y}(s)$.
The physical significance of the first case $\bar{y} = 0$ in the example considered is that no free oscillations,
called characteristic oscillations, are possible for the frequency determined by $\lambda$. Then there always
exists a well-defined form of oscillation, no matter how the external forces are distributed. In the
second case, with null solutions, on the other hand, characteristic oscillations exist. For a given
force distribution $h(s)$ an oscillation $y(s)$ of the string can then exist, but an arbitrary characteristic
oscillation $\bar{y}(s)$ can be superimposed on this: $y(s) + \bar{y}(s)$ is also a solution of the problem.
However, there need not be any solution at all; this case occurs physically if the applied force is
chosen in such a way that it ‘builds up’ a characteristic oscillation (resonance) and theoretically leads
to infinitely large deflections of the string.


Eigenvalues. In general, those values of the parameter $\lambda$ for which (6) has non-trivial solutions
are relatively rare exceptions. They are called eigenvalues and the corresponding null solutions
are called eigenfunctions; for example, for the oscillation of the string the eigenvalues are the numbers
$\lambda_n = (\pi^2/l^2) \cdot n^2$ $(n = 1, 2, ...)$ and the corresponding eigenfunctions are $y_n = \sin(s \cdot \sqrt{\lambda_n})$. Eigenvalues
and eigenfunctions play a significant role in the theory and practice of integral equations;
for example, series expansions of given functions in terms of eigenfunctions are important aids in 
the solution of differential and integral equations: the well-known Fourier series belong to this 
category. 
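
The eigenvalues of the string can be checked against the homogeneous equation (6) numerically. The kernel used below, $K(s, t) = t(l - s)/l$ for $t \le s$ and $s(l - t)/l$ for $t \ge s$, is what the elimination of $y'(0)$ described above yields; it is not printed in the text and should be read as an assumption, as should the length $l = 2$ and the midpoint rule. For each $n$ the residual $y_n(s) - \lambda_n \int_0^l K(s, t)\,y_n(t)\,dt$ should vanish up to the quadrature error.

```python
import math

def kernel(s, t, length):
    """Kernel obtained by eliminating y'(0); an assumption, the text does not print it."""
    return t * (length - s) / length if t <= s else s * (length - t) / length

def apply_kernel(func, s, length, steps=4000):
    """Midpoint-rule approximation of the integral of K(s, t) func(t) over [0, length]."""
    width = length / steps
    return sum(kernel(s, (j + 0.5) * width, length) * func((j + 0.5) * width)
               for j in range(steps)) * width

length = 2.0  # hypothetical length of the string
residuals = []
for n in (1, 2, 3):
    lam = (math.pi / length) ** 2 * n**2            # eigenvalue lambda_n
    y_n = lambda s, lam=lam: math.sin(s * math.sqrt(lam))
    for s in (0.3, 1.0, 1.7):
        residuals.append(abs(y_n(s) - lam * apply_kernel(y_n, s, length)))
print(max(residuals))
```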


The resolvent. If (5) has a unique solution, then the solution $y(s)$ can be represented with the
help of the solving kernel or the resolvent $\Gamma(s, t)$:

(7) $h(s) + \lambda \int_0^l \Gamma(s, t)\,h(t)\,dt = y(s).$

For sufficiently small $\lambda$ the resolvent can be calculated by an iteration procedure. For this purpose
one substitutes for $y(t)$ under the integral sign in (5) the value

$h(t) + \lambda \int_0^l K(t, r)\,y(r)\,dr$

given by (5) itself; this leads to an equation of the form

$y(s) - \lambda^2 \int_0^l K_2(s, r)\,y(r)\,dr = h(s) + \lambda \int_0^l K(s, t)\,h(t)\,dt.$




In this the $y$ under the integral sign is again replaced by the expression given above, and so on.
Finally comparison with (7) yields the expansion

(8) $\Gamma(s, t) = K(s, t) + \lambda K_2(s, t) + \lambda^2 K_3(s, t) + \cdots,$

the so-called Neumann series. The iteration kernels $K_2, K_3, ...$ are calculated from $K$ by repeated
integration.
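
The iteration procedure can be sketched on a grid. The kernel $K(s, t) = st$ on $[0, 1]$, the right-hand side $h(s) = s$ and the value $\lambda = 1$ are hypothetical test data chosen because the exact solution is then known, $y(s) = 3s/(3 - \lambda) = 1.5\,s$. Starting from $y = h$, each sweep $y \leftarrow h + \lambda \int K y\,dt$ adds one further term of the Neumann series applied to $h$.

```python
def iterate_second_kind(kernel, h, lam, steps=100, sweeps=40):
    """Successive substitution y <- h + lam * K y on a midpoint grid over [0, 1]."""
    width = 1.0 / steps
    nodes = [(j + 0.5) * width for j in range(steps)]
    h_values = [h(s) for s in nodes]
    y = h_values[:]                       # zeroth approximation y = h
    for _ in range(sweeps):
        y = [h_values[i] + lam * width * sum(kernel(s, t) * y_t for t, y_t in zip(nodes, y))
             for i, s in enumerate(nodes)]
    return nodes, y

# Hypothetical test problem: K(s, t) = s*t, h(s) = s, lambda = 1.
nodes, y = iterate_second_kind(lambda s, t: s * t, lambda s: s, lam=1.0)
max_error = max(abs(y_s - 1.5 * s) for s, y_s in zip(nodes, y))
print(max_error)
```

For this kernel the iteration converges only for $|\lambda| < 3$, which illustrates the restriction to sufficiently small $\lambda$.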


Other types of equation. Besides the equations (3) and (5) there are also linear integral equations
of the third kind:

(9) $g(s)\,y(s) - \lambda \int_0^l K(s, t)\,y(t)\,dt = h(s),$

in which $g(s)$ and $h(s)$ are given functions. Integral equations of the first and second kinds are special
cases of this type; they arise from (9) if $g(s)$ is a constant.

Further there is the extensive field of non-linear integral equations. Here there is no general 
theory yet. In these equations the required function y does not occur under the integral sign as a 
simple factor, but in a much more general, and usually more complicated, way. For example, 


(10) $y(s) - \int_0^l g(s, t)\,[y(t)]^2\,dt = h(s)$


is a non-linear integral equation. 


40. Functional analysis 


Abstract spaces
Operators
The use of functional-analytical methods in approximation theory


Functional analysis has been developed essentially in the last 40 years. Its beginnings lie in the 
recognition that widely different kinds of mathematical operations, from the basic operations of 
arithmetic to differentiation and integration, have strikingly many features in common and that 
the mathematical objects subjected to these operations exhibit the same or similar properties in 
relation to the operations, although they come from quite different fields of mathematics. The same 
rules of addition hold for addition of angles, of numbers, of vectors and so on. In this sense functional 
analysis formed originally a cross-section of certain branches of analysis, for example, of the theory 
of integral equations, of the calculus of variations, and of linear algebra. 

The search for and recognition of such deep-seated common properties, the striving for the most 
general possible statements, which are independent of special mathematical objects and are deter- 
mined only by abstract relations, have led to numerous new concepts, which have become the basis 
of functional analysis and are frequently used in modern mathematics. 


Abstract spaces 


Deviating from the usage of everyday language, the concept of space in functional analysis bears 
no direct relation to geometry or even to the space of our experience. Because of certain similarities 
to geometry, especially to analytical geometry and linear algebra, the word space was carried over 
to objects of functional analysis. Similarly other concepts also, such as distance or length, taken 
from the vocabulary of analytical geometry, have lost their original geometrical meaning. 


The concept of an abstract space. In functional analysis a set of elements is called an abstract
space if a limiting process is defined within the set, that is, if the statement ‘a sequence $x_1, x_2, x_3, ...$ of
elements of the space tends to a limit $x = \lim_{n \to \infty} x_n$’ has a well-defined meaning.
Example 1: The elements of the $k$-dimensional Euclidean space $R^k$ are the ordered $k$-tuples of
real numbers $x = (\xi_1, \xi_2, ..., \xi_k)$. A sequence of elements $x_n = (\xi_1^{(n)}, \xi_2^{(n)}, ..., \xi_k^{(n)})$, $n = 1, 2, ...$,
tends to the element $x = (\xi_1, \xi_2, ..., \xi_k)$ if for each $i = 1, 2, ..., k$ the number sequence $\{\xi_i^{(n)}\}$
tends to the corresponding $\xi_i$ as $n \to \infty$. For $k = 3$, when $R^3$ consists of the totality of triples
$x = (\xi_1, \xi_2, \xi_3)$ of real numbers, the $\xi_1, \xi_2, \xi_3$ can be regarded as coordinates and $x$ as a point
in the sense of solid analytical geometry.

Example 2: The space of polynomials of degree at most $m$ in a variable $t$ has the elements
$x = x(t) = \alpha_0 + \alpha_1 t + \alpha_2 t^2 + \cdots + \alpha_m t^m$, where the coefficients $\alpha_0, \alpha_1, ..., \alpha_m$ denote complex
numbers. The set of these polynomials represents a space, once a concept of limit is defined in it.




A polynomial of degree $m$ is determined uniquely by the value of the function and the values
of its first $m$ derivatives at a point $t = t_0$ according to Taylor’s formula:

$x(t) = x(t_0) + x'(t_0)(t - t_0) + \frac{x''(t_0)}{2!}(t - t_0)^2 + \cdots + \frac{x^{(m)}(t_0)}{m!}(t - t_0)^m.$

One therefore defines: A sequence of polynomials $x_n = x_n(t)$ is convergent to the polynomial
$x = x(t)$ if the values of the function $x_n$ and its derivatives converge individually to the values of
the function $x(t_0)$ and of its derivatives $x'(t_0), ..., x^{(m)}(t_0)$ at the point $t = t_0$, that is, if

$\lim_{n \to \infty} x_n(t_0) = x(t_0)$, $\lim_{n \to \infty} x_n'(t_0) = x'(t_0)$, ..., $\lim_{n \to \infty} x_n^{(m)}(t_0) = x^{(m)}(t_0)$.


Linear spaces. In linear spaces a multiplication of the elements $x, y, z, ...$ by real or complex
numbers $\lambda, \mu, ...$ and an addition of any pair of elements of the space must be defined. To each number
$\lambda$ and each element $x$ of the space there corresponds a unique element denoted by $\lambda x$, and likewise
to each pair of elements $(x, y)$ a unique element $x + y$ of the space. It is, however, left completely
undetermined what these multiplications and additions look like in special cases; they must simply 
satisfy the following conditions: 


Examples in which all the axioms are satisfied: 
1. The $k$-dimensional Euclidean space $R^k$ becomes a linear space if multiplication is defined by

$\lambda x = \lambda(\xi_1, \xi_2, ..., \xi_k) = (\lambda\xi_1, \lambda\xi_2, ..., \lambda\xi_k)$

and addition of two elements $x = (\xi_1, \xi_2, ..., \xi_k)$ and $y = (\eta_1, \eta_2, ..., \eta_k)$ by
$x + y = (\xi_1 + \eta_1, \xi_2 + \eta_2, ..., \xi_k + \eta_k)$. The zero element is $O = (0, 0, ..., 0)$.
2. The space of polynomials of degree at most $m$ becomes a linear space by the definitions
$\lambda x = \lambda\alpha_0 + \lambda\alpha_1 t + \cdots + \lambda\alpha_m t^m$ and $x + y = x(t) + y(t) = (\alpha_0 + \beta_0) + (\alpha_1 + \beta_1)t + \cdots + (\alpha_m + \beta_m)t^m$.
The zero element is the polynomial $O = O(t) = 0$.


The concept of a metric and of a metric space. Two points $P$ and $Q$ of the three-dimensional
geometrical space have a certain distance from one another, which is different from zero if they
do not coincide; this distance is measured by the length $|PQ|$ of the line
segment joining $P$ and $Q$, where $|PQ| > 0$ for $P \ne Q$. The distance between
$P$ and $Q$ is equal to the distance between $Q$ and $P$, that is, $|PQ| = |QP|$. If
one takes a third point $R$ not on the line passing through $P$ and $Q$, then
one obtains a triangle $PQR$. By a theorem of elementary geometry, any side
of a triangle, say $PQ$, is less than the sum of the other two sides $PR$ and
$QR$, that is, $|PQ| < |PR| + |QR|$. This relation is called the triangle inequality.
In the form $|PQ| \le |PR| + |QR|$ this relation holds without limitation,
even when $P$, $Q$ and $R$ no longer form a proper triangle, but lie on a straight
line (Fig.).

40-1 The triangle inequality

In analysis too one often has to measure distances, figuratively speaking, between the elements
$x, y, z, ...$ considered, to decide whether two elements $x, y$ are at a ‘great’ or ‘small’ distance from
one another. In order to measure distances, a distance function or metric must be defined, that is,
a real-valued function $d(x, y) \ge 0$ defined for all pairs of elements $x, y$.


In this the analytical form of the distance function remains completely open. A space for whose
pairs of elements a distance function is defined is called a metric space. The limiting process $\lim_{n \to \infty} x_n = x$,
which is assumed in an abstract space, is defined by $\lim_{n \to \infty} d(x, x_n) = 0$.




Examples of metrics in the k-dimensional Euclidean space: 


1. $d(x, y) = \sqrt{\sum_{i=1}^{k} |\xi_i - \eta_i|^2}$ is a generalization of the formula for the length in analytical geometry;

2. $d(x, y) = \max_{1 \le i \le k} |\xi_i - \eta_i|$; 3. $d(x, y) = \sum_{i=1}^{k} |\xi_i - \eta_i|$.


For each of these metrics it has to be shown that it satisfies the three given axioms. This is quite 
easy; only for the first metric it is a little difficult to establish the validity of the triangle inequality. 
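
A numerical spot check is no proof, but it can make the three metrics concrete. The sketch below assumes that the three axioms are the usual ones (positivity with $d(x, y) = 0$ only for $x = y$, symmetry, and the triangle inequality) and tests them on a hypothetical random sample of points of $R^4$.

```python
import math
import random

def d_euclid(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_max(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

def d_sum(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))

def satisfies_axioms(d, points, tol=1e-12):
    """Check positivity, symmetry and the triangle inequality on a finite sample."""
    for x in points:
        if d(x, x) != 0:
            return False
        for y in points:
            if x != y and d(x, y) <= 0:
                return False
            if abs(d(x, y) - d(y, x)) > tol:
                return False
            for z in points:
                if d(x, y) > d(x, z) + d(z, y) + tol:
                    return False
    return True

rng = random.Random(0)
sample = [tuple(rng.uniform(-5, 5) for _ in range(4)) for _ in range(12)]
results = {d.__name__: satisfies_axioms(d, sample) for d in (d_euclid, d_max, d_sum)}
print(results)
```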


Normed spaces. It is well known that to every complex number $\zeta = \xi + i\eta$ there corresponds
the non-negative real number $|\zeta| = \sqrt{\xi^2 + \eta^2}$ as absolute value or modulus of $\zeta$. In working with
functions, vectors, matrices and so on, the form of the problem often suggests a way of ascribing
a non-negative real number to each of the objects considered as a measure of its ‘magnitude’.
Such a numerical measure associated with the elements $x, y, ...$ of a space is called a norm and is
denoted by $\|x\|, \|y\|, ...$, provided that it has the following properties.


These properties hold, for example, for the absolute value or modulus of a complex number.
A space to whose elements a norm is ascribed is called a normed space.


From a norm a metric can be derived, by defining a distance function $d(x, y)$ between two elements
$x, y$ to be the norm of their difference: $d(x, y) = \|x - y\|$.


In the $k$-dimensional Euclidean space $R^k$ the following norms have all the required properties:

1. $\|x\| = \sqrt{\sum_{i=1}^{k} |\xi_i|^2}$; 2. $\|x\| = \max_{1 \le i \le k} |\xi_i|$; 3. $\|x\| = \sum_{i=1}^{k} |\xi_i|$.

From these the metrics on $R^k$ in the examples above are obtained. It often depends on the purpose
of the investigation which definition of the norm is appropriate to use for a particular space.


Complete metric spaces. If a sequence $x_1, x_2, ...$ of elements of a metric space $X$ converges to
an element $x$, then the distances $d(x_1, x), d(x_2, x), ...$ form a null sequence. By the triangle inequality
the distances $d(x_i, x_k)$ between two arbitrary elements $x_i, x_k$ of the sequence also tend to zero as the
indices $i$ and $k$ increase; they form a Cauchy sequence.


A sequence $\{x_n\}$ is called a Cauchy sequence if for every positive number $\varepsilon$ an index $n(\varepsilon)$ can
be found such that $d(x_i, x_k) < \varepsilon$ for all indices $i, k > n(\varepsilon)$.


Every convergent sequence is a Cauchy sequence, but not every Cauchy sequence is a convergent 
sequence. Spaces in which a limit element x can be found for every Cauchy sequence {x,} are said 
to be complete, and complete normed linear spaces are called Banach spaces, after Stefan BANACH 
(1892-1945), who was one of the founders of functional analysis. All finite-dimensional spaces, 
for example, the space of polynomials of degree at most $m$, are complete. The space $L_2(a, b)$ (see
Hilbert spaces) is likewise complete. In general, one requires a scalar product space to be complete 
before designating it as a Hilbert space. 
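
A standard illustration of a Cauchy sequence without a limit in its own space, not taken from the text, is the following: in the space of rational numbers with the metric $d(x, y) = |x - y|$, the Newton iterates for $x^2 = 2$ crowd together ever more closely, yet no rational number has the square 2. The sketch computes the first few iterates exactly; the choice of starting value and of seven terms is arbitrary.

```python
from fractions import Fraction

def newton_sqrt2(terms):
    """Rational Newton iterates for x^2 = 2, starting from x_1 = 1."""
    x = Fraction(1)
    sequence = [x]
    for _ in range(terms - 1):
        x = (x + 2 / x) / 2
        sequence.append(x)
    return sequence

seq = newton_sqrt2(7)
gaps = [abs(seq[i + 1] - seq[i]) for i in range(len(seq) - 1)]

# From the fourth element onwards all mutual distances are below 10^-5.
tail = seq[3:]
largest_tail_distance = max(abs(p - q) for p in tail for q in tail)

# No element of the sequence is an exact solution of x^2 = 2.
never_exact = all(x * x != 2 for x in seq)
print([float(x) for x in seq], float(largest_tail_distance))
```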


Hilbert spaces. These spaces, called after David HILBERT (1862-1943), are important special cases
of normed linear spaces. In these, for every pair of elements $x, y$ a complex-valued function $(x, y)$,
called a scalar product, is defined, having the following properties (the bars above denote the conjugate
complex number):

(1) $(x, y) = \overline{(y, x)}$;
(2) $(\lambda x, y) = \lambda(x, y)$, where $\lambda$ is an arbitrary complex number;
(3) $(x, x) \ge 0$, with equality if and only if $x = O$;
(4) $(x + y, z) = (x, z) + (y, z)$.

The space is normed; a norm is introduced with the help of the scalar product by the equation
$\|x\| = \sqrt{(x, x)}$.

Examples of Hilbert spaces: 1. The space $C^k$ of complex $k$-tuples with the scalar product
$(x, y) = \sum_{i=1}^{k} \xi_i \bar{\eta}_i$ and the corresponding norm $\|x\| = \sqrt{\sum_{i=1}^{k} |\xi_i|^2}$ ($\xi_i, \eta_i$ complex).
2. The space $L_2(a, b)$. Its elements consist of complex-valued functions $x = x(t)$, defined in
$a \le t \le b$, for which the integral $\int_a^b |x(t)|^2\,dt$ exists. The scalar product defined by $(x, y) = \int_a^b x(t)\,\overline{y(t)}\,dt$
has all the requisite properties.




Example of functional-analytical arguments. The insight that can be gained through functional-analytical
concepts is shown by some of the consequences of the Schwarz inequality
$|(x, y)|^2 \le \|x\|^2 \cdot \|y\|^2$, which holds for any two elements $x, y$ of a Hilbert space.


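The proof makes use first of all of the third property of the scalar product, according to which
$(x + \lambda y, x + \lambda y) \ge 0$ for every arbitrary complex number $\lambda$. Using the other properties one obtains

$(x + \lambda y, x + \lambda y) = (x, x + \lambda y) + \lambda(y, x + \lambda y) = \overline{(x + \lambda y, x)} + \lambda\,\overline{(x + \lambda y, y)}$
$= \overline{(x, x)} + \bar{\lambda}\,\overline{(y, x)} + \lambda\,\overline{(x, y)} + \lambda\bar{\lambda}\,\overline{(y, y)} = (x, x) + \bar{\lambda}(x, y) + \lambda(y, x) + \lambda\bar{\lambda}(y, y) \ge 0.$

This holds, in particular, for $\lambda = -(x, y)/(y, y)$ (with $y \ne O$); consequently

$(x, x) - \overline{(x, y)}\,(x, y)/(y, y) - (x, y)\,(y, x)/(y, y) + (x, y)\,\overline{(x, y)}/(y, y) \ge 0.$

From this it follows that
$(x, y)\,(y, x) = (x, y)\,\overline{(x, y)} = |(x, y)|^2 \le \|x\|^2 \cdot \|y\|^2$, as required.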
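
The inequality can be tried out in the space $C^k$ with the scalar product $(x, y) = \sum \xi_i \bar{\eta}_i$. This is a sketch on hypothetical random data (dimension 5, 200 pairs); it also checks property (1) and the case of equality, which occurs when $y$ is a multiple of $x$.

```python
import math
import random

def scalar_product(x, y):
    """(x, y) = sum of xi_i * conj(eta_i), linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return math.sqrt(scalar_product(x, x).real)

rng = random.Random(1)

def random_vector(k):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(k)]

pairs = [(random_vector(5), random_vector(5)) for _ in range(200)]

schwarz_holds = all(abs(scalar_product(x, y)) ** 2 <= (norm(x) * norm(y)) ** 2 + 1e-12
                    for x, y in pairs)
hermitian = all(abs(scalar_product(x, y) - scalar_product(y, x).conjugate()) < 1e-12
                for x, y in pairs)

# Equality occurs when y is a multiple of x.
x0 = random_vector(5)
y0 = [(2 - 3j) * c for c in x0]
equality_gap = abs(abs(scalar_product(x0, y0)) - norm(x0) * norm(y0))
print(schwarz_holds, hermitian, equality_gap)
```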


The usefulness of this general result lies in the following. Once one recognizes that a numerical 
operation on arbitrary pairs of elements x and y of a space satisfies the conditions required of a
scalar product for Hilbert spaces, then from the validity of the Schwarz inequality for this special 
space one immediately obtains important relations; for example, the inequalities 


These are relations which, besides a number of similar formulae, had been found earlier and in- 
dependently for the individual spaces by CAUCHY, BUNYAKOVSKII, SCHWARZ, and others. By intro- 
ducing functional-analytical concepts in this way essential properties common to different branches 
of analysis are discovered and worked out. 

Similarly many relations already known are obtained quite simply as interpretations of a theorem
of functional analysis; for example, the triangle inequality $\|x + y\| \le \|x\| + \|y\|$ follows from the Schwarz
inequality. Because $\|x\|^2 = (x, x)$,

$\|x + y\|^2 = (x + y, x + y) = (x, x) + (x, y) + (y, x) + (y, y)$
$\le \|x\|^2 + |(x, y)| + |(y, x)| + \|y\|^2 \le \|x\|^2 + \|y\|^2 + 2\|x\| \cdot \|y\| = (\|x\| + \|y\|)^2$

and the result is proved.


For the spaces $R^k$ and $L_2(a, b)$ this yields the Cauchy and the Minkowski inequalities, which were
known earlier.


Operators 


Whilst by means of the space concept the objects of a mathematical investigation are essentially 
only typified, an operator characterizes a definite mathematical operation that can be performed 
on the elements of the space. Almost every mathematical operation can be regarded as a correspon- 
dence determined by a definite rule of calculation, mapping every element x of an abstract space X 
uniquely to an element $y$ of a space $Y$, which may, but need not, be different from $X$. The correspondence
is also called a mapping of $X$ into $Y$, and the law of correspondence is called an operator
$A, B, ...$ or $F$; the correspondence is written in the form $y = Ax$ or $y = A(x)$ (Fig.).


For example, real functions $F$ of a real variable $x$ are special operators;
they map the space of the real numbers $R^1$ or a subspace $X$ of
it into $Y = R^1$.


If one assigns to each polynomial $x(t)$ of the space $X$ of
polynomials of degree at most $m$ the polynomial

$y = Ax = x'(t) - 3x(t) - \alpha x^2(t),$

then $y = Ax$ is a mapping from $X$ into the space $Y$ of polynomials of degree at most $2m$.

40-2 Illustration of an operator; mapping from $X$ into $Y$ by $A$ and from $Y$ into $Z$ by $C$




Linear operators. As far as applications are concerned, these operators form the most important 
class. They are defined by the properties (1) $A(\lambda x) = \lambda Ax$ for every arbitrary number $\lambda$, and
(2) $A(x + y) = Ax + Ay$. As an example, the operator given above is a linear operator if $\alpha = 0$,
and is otherwise non-linear. 
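
The two defining properties can be tested on the polynomial operator $Ax = x'(t) - 3x(t) - \alpha x^2(t)$ given above. In this sketch polynomials are represented by integer coefficient lists (lowest power first); the helper names and the two test polynomials are hypothetical, and a check on one pair of elements can only refute linearity, not prove it.

```python
def add(p, q):
    """Sum of two coefficient lists (lowest power first)."""
    size = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(size)]

def scale(number, p):
    return [number * a for a in p]

def multiply(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative(p):
    return [k * a for k, a in enumerate(p)][1:] or [0]

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def operator(x, alpha):
    """A x = x' - 3x - alpha * x^2 for a polynomial x given by its coefficients."""
    return trim(add(add(derivative(x), scale(-3, x)), scale(-alpha, multiply(x, x))))

def is_linear_on(x, y, number, alpha):
    additive = operator(add(x, y), alpha) == trim(add(operator(x, alpha), operator(y, alpha)))
    homogeneous = operator(scale(number, x), alpha) == trim(scale(number, operator(x, alpha)))
    return additive and homogeneous

x = [1, 2, 0, 1]   # 1 + 2t + t^3
y = [0, -1, 4]     # -t + 4t^2
linear_when_alpha_zero = is_linear_on(x, y, 5, alpha=0)
linear_when_alpha_one = is_linear_on(x, y, 5, alpha=1)
print(linear_when_alpha_zero, linear_when_alpha_one)
```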

Composition of operators. If $A$ and $B$ are two operators, each being a mapping from $X$ into $Y$,
and $\lambda$ is an arbitrary number, then by the product $\lambda A$ one understands the operator that maps $x$
into $\lambda(Ax)$, so that $(\lambda A)x = \lambda(Ax)$; the sum $A + B$, on the other hand, maps $x$ into $Ax + Bx$,
so that $(A + B)x = Ax + Bx$.

Finally, if a third operator C is given mapping the space Y into a space Z, then the operator CA 
maps each element x into that element C(Ax) in Z into which x goes over by successive applications 
of the operators A and C; this is expressed by the formula (CA) x = C(Ax). 


Bounded linear operators. A linear operator A mapping a normed linear space X into a normed 
linear space Y is said to be bounded if an inequality of the form 


$\|y\| = \|Ax\| \le K\|x\|$ with $K > 0$

holds for all $x$ in $X$; the smallest number $K$ having this property is called the norm of the operator $A$
and is denoted by $\|A\|$. It goes without saying that the norm $\|y\|$ in the space $Y$ may be different
from the norm $\|x\|$ in $X$.


Example: Let $X$ be $R^3$ with the norm $\|x\| = \max |\xi_i|$ and let $Y$ be $R^2$ with the norm
$\|y\| = \max |\eta_i|$ for $i = 1, 2$. By the equations $\eta_1 = a_{11}\xi_1 + a_{12}\xi_2 + a_{13}\xi_3$, $\eta_2 = a_{21}\xi_1 + a_{22}\xi_2 + a_{23}\xi_3$,
a certain $y = (\eta_1, \eta_2)$ is made to correspond to each $x = (\xi_1, \xi_2, \xi_3)$. The operator
defined by this is bounded, because

$|\eta_i| \le |a_{i1}| \cdot |\xi_1| + |a_{i2}| \cdot |\xi_2| + |a_{i3}| \cdot |\xi_3| \le \sum_{k=1}^{3} |a_{ik}| \cdot \|x\|,$

that is,

$\|y\| = \max_{i=1,2} |\eta_i| \le \left(\max_{i=1,2} \sum_{k=1}^{3} |a_{ik}|\right) \|x\|.$

A closer investigation shows that $\max_{i=1,2} \sum_{k=1}^{3} |a_{ik}|$ is precisely the smallest number $K$, and therefore
represents the norm of the operator $A$.
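
The 'closer investigation' can be imitated numerically for one concrete operator. The coefficients $a_{ik}$ below are hypothetical. The sketch uses the fact that on the unit ball of the maximum norm the ratio $\|Ax\|/\|x\|$ is largest at a point whose coordinates are all $\pm 1$, so it is enough to try the eight sign combinations; a random sample then confirms that the bound is never exceeded.

```python
from itertools import product
import random

def max_norm(v):
    return max(abs(c) for c in v)

def apply_operator(matrix, x):
    return [sum(a * c for a, c in zip(row, x)) for row in matrix]

A = [[1.0, -2.0, 0.5],
     [3.0, 0.25, -1.0]]   # hypothetical coefficients a_ik

# Candidate for the operator norm: the largest row sum of absolute values.
row_sum_bound = max(sum(abs(a) for a in row) for row in A)

# The bound is attained at a suitable vector with coordinates +1 or -1.
vertices = list(product((-1.0, 1.0), repeat=3))
largest_ratio = max(max_norm(apply_operator(A, x)) / max_norm(x) for x in vertices)

# It is never exceeded on a random sample of vectors.
rng = random.Random(2)
samples = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(500)]
bound_respected = all(max_norm(apply_operator(A, x)) <= row_sum_bound * max_norm(x) + 1e-12
                      for x in samples)
print(row_sum_bound, largest_ratio, bound_respected)
```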


With the definitions given above of the addition of operators, the multiplication of an operator 
by a number, and the norm of an operator, the totality of bounded linear operators A, B, ..., that 
map a space X into a space Y, themselves form a normed linear space. This is of unusual signifi- 
cance for functional analysis and the application of functional-analytical methods. The circle of 
consideration again closes as it were: classes of operators, which as intermediaries between two 
spaces appear to stand outside the theory of spaces, themselves fall into the category of abstract 
spaces. 

Functionals. Among the mappings of a space the numerical functions occupy a special place; 
these are mappings into the set of the real or of the complex numbers. They are called functionals 
and have given their name to functional analysis. 

In normed linear spaces the norm, for example, is already a functional. For the sake of simplicity 
only normed linear spaces will be considered in the following. 

An exceptional place is again occupied by the linear functionals $f$, which assign to every element
$x, y, ...$ of the space $X$ a real or complex number $f(x), f(y), ...$, so that the linearity conditions
$f(x + y) = f(x) + f(y)$, $f(\alpha x) = \alpha f(x)$ hold for all elements $x, y$ of the space $X$ and all admissible
real or complex numbers $\alpha$. A linear functional is said to be bounded, or also continuous, if the
norm $\|f\|$ of $f$ satisfies the condition $\|f\| = \sup_{x \in X} (|f(x)|/\|x\|) < \infty$ (where $x \ne O$).


The totality of all continuous linear functionals defined on X forms the dual space X*. If X is a 
normed linear space, this is likewise a normed linear space. 

An important problem of functional analysis consists of determining the properties of continuous 
linear functionals, or of representing them and their values f(x), x € X, as a sum or an integral, 
and of characterizing sets and mappings of the original space X by elements and mappings of 
elements of the dual space X*. From the point of view of this problem functional analysis is a 
further development of a geometrical discipline, linear geometry.

The theory of continuous linear functionals plays a significant role, for example, in the theory 
of linear operator equations or integral equations, in the theory of approximate integration, in the 
theory of distributions or generalized functions, and in the theory of the Lagrange method of undeter- 
mined multipliers. Here are some examples of results for specific spaces. 

1. In $R^k$, the space of $k$-tuples $x = (x_1, ..., x_k)$ of real numbers $x_i$, $i = 1, ..., k$, corresponding
to each linear functional $f$ there exist $k$ real numbers $f_1, ..., f_k$ such that the value $f(x)$ of the functional
can be represented in the form $f(x) = f_1 \cdot x_1 + f_2 \cdot x_2 + \cdots + f_k \cdot x_k$.




One speaks of a representation of f by means of this relation. Conversely, corresponding to each
arbitrary k-tuple of real numbers $f_1, \dots, f_k$, a continuous linear functional can be defined in this
way. Depending on the norm (see Normed spaces) by which the elements $x \in R^k$ are normed, one
finds:

1. $\|f\| = \left( \sum_{i=1}^{k} f_i^2 \right)^{1/2}$, 2. $\|f\| = \sum_{i=1}^{k} |f_i|$ or 3. $\|f\| = \max_{1 \le i \le k} |f_i|$.
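The three functional norms can be checked numerically. The following is a minimal sketch, assuming the usual pairing: the Euclidean norm on x gives the Euclidean norm of the coefficients, the maximum norm on x gives the sum of the |f_i|, and the sum norm on x gives the maximum of the |f_i|. The function names and the sample coefficients are illustrative and do not come from the text.

```python
import math

def make_functional(coeffs):
    """Return the linear functional x -> f_1*x_1 + ... + f_k*x_k."""
    def f(x):
        return sum(fi * xi for fi, xi in zip(coeffs, x))
    return f

def euclidean_norm(v):
    return math.sqrt(sum(vi * vi for vi in v))

def sum_norm(v):
    return sum(abs(vi) for vi in v)

def max_norm(v):
    return max(abs(vi) for vi in v)

coeffs = [3.0, -4.0, 0.0]  # illustrative values f_1, f_2, f_3
f = make_functional(coeffs)

# Euclidean norm on x: the quotient |f(x)|/||x|| is largest for x = coeffs.
x_euclid = coeffs
ratio_euclid = abs(f(x_euclid)) / euclidean_norm(x_euclid)

# Maximum norm on x: the quotient is largest for x_i = sign(f_i).
x_max = [1.0 if c >= 0 else -1.0 for c in coeffs]
ratio_max = abs(f(x_max)) / max_norm(x_max)

# Sum norm on x: the quotient is largest for the unit vector at the
# index where |f_i| is largest.
j = max(range(len(coeffs)), key=lambda i: abs(coeffs[i]))
x_sum = [1.0 if i == j else 0.0 for i in range(len(coeffs))]
ratio_sum = abs(f(x_sum)) / sum_norm(x_sum)
```

In each case the extremal vector attains the supremum in the definition of the norm of f, so the quotient coincides with the corresponding closed formula.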


2. In the space $L_2(a, b)$ of all functions of Lebesgue integrable square over the interval [a, b],
the Riesz representation theorem states: Corresponding to each continuous linear functional f there
exists a uniquely determined function g in $L_2(a, b)$ such that the value f(x) of the functional for
$x \in L_2(a, b)$ can be represented in the form $f(x) = \int_a^b x(t)\, g(t)\, dt = (x, g)$.


More generally, in every (complete) Hilbert space X the values of the functional f(x) can be
represented as a scalar product (x, g). Conversely, by means of an arbitrary element $g \in X$ one can
define a continuous linear functional f(x) = (x, g). It can be shown, moreover, that the norm $\|f\|$
of the functional f is equal to the norm $\|g\|$ of the generating element.

3. The space X of all polynomials in one variable of degree at most m can be regarded as a normed
linear space with the norm $\|x\| = \sum_{i=0}^{m} |x^{(i)}(t_0)|$.

A linear functional f on X must assign to the polynomials $1, t, t^2, \dots, t^m$ certain complex numerical
values $f_0, f_1, f_2, \dots, f_m$ and to the polynomial x with the function values

$x(t) = x(t_0) + x'(t_0)\, t + \dots + \frac{x^{(m)}(t_0)}{m!}\, t^m$

the numerical values

$f(x) = x(t_0)\, f_0 + x'(t_0)\, f_1 + \dots + \frac{x^{(m)}(t_0)}{m!}\, f_m$.

Conversely, by means of this relation with arbitrary numbers $f_0, \dots, f_m$ one can define a linear
functional. This is also continuous, because $\|f\| = \max_{0 \le i \le m} (|f_i| / i!)$.


4. Hyperplanes. A linear equation in the variables $x_1, x_2, x_3$ determines a subset, namely a plane,
in the three-dimensional space $R^3$. As an extension of this situation, the totality of elements x of
a linear space X that satisfy an equation $f(x) = \alpha$ is called a hyperplane H; f(x) denotes a continuous
linear functional and $\alpha$ a number. The distance d(y, H) of an element $y \in X$ from the hyperplane H
is defined to be the greatest lower bound of all distances $\|y - x\|$, $x \in H$, that is, $d(y, H) = \inf_{x \in H} \|y - x\|$.


In three-dimensional geometry, in which the absolute value of the distance measurement is used, 
an expression for the distance is given by the Hesse normal form. Similarly, in a general normed 
linear space X one has $d(y, H) = |f(y) - \alpha| / \|f\|$. If X is a complete space, then, as in the three-
dimensional space, there exists an element $x_0 \in H$ whose distance $\|y - x_0\|$ is equal to the distance
of the element y from the hyperplane H.
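For the familiar case of three-dimensional space with the Euclidean norm, the distance formula can be tried out directly. The sketch below assumes that norm, for which the norm of the functional is the Euclidean length of its coefficient vector; the function names and the sample plane are hypothetical.

```python
import math

def distance_to_hyperplane(coeffs, alpha, y):
    """d(y, H) = |f(y) - alpha| / ||f|| for H = {x : f(x) = alpha}, Euclidean norm."""
    f_y = sum(c * yi for c, yi in zip(coeffs, y))
    norm_f = math.sqrt(sum(c * c for c in coeffs))
    return abs(f_y - alpha) / norm_f

def nearest_point(coeffs, alpha, y):
    """Foot of the perpendicular from y to H (Euclidean case only)."""
    f_y = sum(c * yi for c, yi in zip(coeffs, y))
    lam = (f_y - alpha) / sum(c * c for c in coeffs)
    return [yi - lam * c for c, yi in zip(coeffs, y)]

coeffs, alpha = [1.0, 2.0, 2.0], 3.0   # the plane x1 + 2*x2 + 2*x3 = 3
y = [4.0, 4.0, 4.0]

d = distance_to_hyperplane(coeffs, alpha, y)
x0 = nearest_point(coeffs, alpha, y)
# x0 lies on H and realizes the distance d.
d_check = math.sqrt(sum((yi - xi) ** 2 for yi, xi in zip(y, x0)))
```

Here the formula reduces to the Hesse normal form, and the computed foot point plays the role of the element of H at minimal distance from y.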

Optimization problems of control theory frequently involve the determination of the distance 
of a given element y from a hyperplane. 

5. Corresponding to an element u of a normed linear space there exists a continuous linear functional
f of norm 1 such that $\|u\| = f(u)$.


The norm of an element u, which is sometimes difficult to handle, can therefore be represented
as the value of a linear functional. For example, if $x(t) = [\xi_1(t), \dots, \xi_k(t)]$ is a k-dimensional vector-
valued function with real differentiable components $\xi_i(t)$, where t is a real variable, and if s is a
further value of the independent variable, then an upper estimate can be given for the norm
$\|x(t) - x(s)\| := \sum_{i=1}^{k} |\xi_i(t) - \xi_i(s)|$ in terms of the argument difference (s − t). By the theorem
mentioned, one imagines a functional f on the space $R^k$ chosen in such a way that $f(x(t) - x(s))
= \|x(t) - x(s)\|$ and $\|f\| = 1$; that is, one imagines k real numbers $f_1, \dots, f_k$ satisfying the conditions
$\|x(t) - x(s)\| = \sum_{i=1}^{k} f_i\, [\xi_i(t) - \xi_i(s)]$ and $\max_i |f_i| = 1$. Taking $g(t) = \sum_{i=1}^{k} f_i\, \xi_i(t)$ and using
the first mean value theorem of the differential calculus one finds that $\|x(t) - x(s)\| = g(t) - g(s)
= g'(\tau) \cdot (t - s) \le \sum_{i=1}^{k} |\xi_i'(\tau)| \cdot |t - s|$ for some point $\tau$ between s and t.


6. Extension of functionals. Occasionally a continuous linear functional is defined only on a linear 
subspace, and the problem then arises of defining the functional on the rest of the space in such 
a way that it remains linear and continuous and, if possible, also preserves the norm. Theorems 
concerning such an extension have been proved, for example, by HAHN, BANACH, KREIN and 
RUTMAN. 




The use of functional-analytical methods in approximation theory 


Approximation theory is concerned with the problem of giving methods for the approximate 
solution of equations of widely differing kinds, for example, of differential or integral equations. 
In an abstract scheme for such equations an operator A is given transforming the element x of a 
complete normed space X into the element y of a normed space Y, and an element x* in X is sought
that is mapped onto the zero element O of Y, and therefore satisfies A(x*) = O. In many cases a
method for determining x* approximately proceeds by an iteration process: one rearranges the
equation A(x) = O in the equivalent form x = B(x), chooses a first approximation $x_0$ more or less
arbitrarily and then forms the further approximations $x_1 = B(x_0)$, $x_2 = B(x_1)$, $x_3 = B(x_2)$, ...,
$x_n = B(x_{n-1})$, ... By the Banach fixed point theorem it can be decided, as a rule, whether the sequence
$\{x_n\}$ converges to x*.


Banach fixed point theorem: If B is a mapping of a subset M of a Banach space into itself and
if B satisfies, for all elements $x, y \in M$, a Lipschitz condition $\|B(x) - B(y)\| \le L \|x - y\|$ with a
Lipschitz constant L < 1, then for every arbitrary initial approximation $x_0$ in M, the sequence of
approximations $x_1 = B(x_0)$, $x_2 = B(x_1)$, ..., $x_n = B(x_{n-1})$, ... converges to the unique solution
x* in M of the equation x = B(x), and an estimate of the error is given by

$\|x^* - x_n\| \le [L/(1 - L)]\, \|x_n - x_{n-1}\| \le [L^n/(1 - L)]\, \|x_1 - x_0\|$.


Example 1: Let X = Y be the set of the real numbers, $A(x) = x - \sin x - 1$, $B(x) = \sin x + 1$,
the subset M the interval $\pi/2 \le x \le 2$. M is mapped by B onto the interval $2 \ge x \ge 1 + \sin 2$ $(> \pi/2)$.
For all $x, y \in M$ the Lipschitz condition $|B(x) - B(y)| = |\sin x - \sin y| \le L \cdot |x - y|$ holds
with $L = |\cos 2|$. Starting from $x_0 = \pi/2 \approx 1.571$ one obtains $x_1 = 2$, $x_2 = \sin 2 + 1 = 1.909\ldots$
The estimate of the error gives $|x^* - x_2| \le 0.066$. The exact value, correct to three decimal
places, is $x^* = 1.935\ldots$
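The iteration of Example 1 and the two error estimates of the fixed point theorem can be reproduced with a few lines of code. This is only a sketch: the helper names are hypothetical, and the value taken for x* is simply a late iterate, assumed here to be accurate far beyond the three decimals quoted.

```python
import math

def fixed_point_iterates(B, x0, n):
    """Return [x_0, x_1, ..., x_n] with x_{i+1} = B(x_i)."""
    xs = [x0]
    for _ in range(n):
        xs.append(B(xs[-1]))
    return xs

def a_posteriori_bound(L, x_prev, x_curr):
    """Bound L/(1 - L) * |x_n - x_{n-1}| on |x* - x_n|."""
    return L / (1.0 - L) * abs(x_curr - x_prev)

def a_priori_bound(L, x0, x1, n):
    """Bound L**n/(1 - L) * |x_1 - x_0| on |x* - x_n|."""
    return L ** n / (1.0 - L) * abs(x1 - x0)

def B(x):
    return math.sin(x) + 1.0

L = abs(math.cos(2.0))                     # Lipschitz constant on [pi/2, 2]
xs = fixed_point_iterates(B, math.pi / 2, 30)

x_star = xs[-1]                            # late iterate, used as a stand-in for x*
bound_2 = a_posteriori_bound(L, xs[1], xs[2])
bound_2_prior = a_priori_bound(L, xs[0], xs[1], 2)
```

The a posteriori bound for x_2 is the sharper of the two, and the actual error of x_2 stays well below it.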


Example 2: Let $X = Y = L_2(0, 1)$, $[A(x)](s) = x(s) - (1/2) \int_0^1 \frac{x(t)}{1 + s + t}\, dt - 2$ (= 0 for
$0 \le s \le 1$), $[B(x)](s) = (1/2) \int_0^1 \frac{x(t)}{1 + s + t}\, dt + 2$, $M = L_2(0, 1)$. A square integrable func-
tion x is mapped by B again into a square integrable function. For all $x, y \in L_2(0, 1)$ holds

$\|B(x) - B(y)\|^2 = \int_0^1 \left| \frac{1}{2} \int_0^1 \frac{x(t) - y(t)}{1 + s + t}\, dt \right|^2 ds \le \frac{1}{4} \int_0^1 |x(t) - y(t)|^2\, dt = \frac{1}{4} \|x - y\|^2$.

The conditions of the Banach fixed point theorem are therefore satisfied with L = 1/2 and an
arbitrary square integrable initial function, say $x_0(s) = 0$. One then obtains the sequence of ap-
proximations $x_1(s) = 2$, $x_2(s) = \ln[(2 + s)/(1 + s)] + 2$, ... The error estimation gives for the
function $x_2$ a mean square error $\|x^* - x_2\| \le \left( \int_0^1 \{\ln[(2 + s)/(1 + s)]\}^2\, ds \right)^{1/2} = 0.48$. The exact
solution x*(s) is not known.


As can be seen, functional-analytical methods are of great value even in numerical problems, 
such as arise every day in engineering practice. 


41. Foundations of geometry — Euclidean and non-Euclidean geometry 


Foundations of geometry
Non-Euclidean geometry


Euclidean geometry is the oldest and historically most important example of a deductive scientific 
discipline. Down to modern times it has been a model of an exact science and it became the starting 
point for a systematic development of the foundations of geometry. This development began at 
the turn of the 19th century with the discovery of non-Euclidean geometry, reached its zenith in 
the investigations of HILBERT, and now covers a wide field of inquiry. 


Foundations of geometry 


Euclid’s Elements. In his Elements (the Stoicheia), EUCLID of Alexandria (c. 365-300 B.C.) gave 
a synopsis of the mathematical knowledge of his time. They contain propositions from number
theory, for example, the Euclidean algorithm and a proof of the existence of infinitely many prime
numbers, from solid geometry the theory of regular polyhedra, also the theory of proportion and
of similarity together with a discussion of incommensurable quantities, and problems of plane 
geometry. The importance of the work lies in the fact that in it the theorems of geometry — with 
certain restrictions, according to present-day knowledge — are proved without recourse to the real 
world, but purely by logical deductions from a set of axioms. 

ARISTOTLE [c. 384-322 B.C.] regarded axioms as statements that are self-evident, stemming 
directly from experience, and containing only concepts about whose meaning there can be no doubt. 
For this reason EUCLID gave definitions of his basic concepts, for example, a point is what has no 
parts. But in the subsequent deductions no use is made of these definitions. 

It was only realized in the 19th century that EUCLID tacitly uses properties of order without
having stated them as axioms. These inadequacies of Euclid’s system of axioms were removed by 
HILBERT in 1899 in his book Grundlagen der Geometrie (Foundations of geometry), which at the 
same time answered scientific questions of a new and fundamental character. According to Hilbert’s 
axiomatics, questions concerning the nature of the basic concepts or their relation to real objects 
do not belong to the mathematical theory concerned, but to its metatheory. The axioms only lay 
down certain relationships between the fundamental concepts. Many of Euclid’s theorems are in 
principle just as easily verifiable as the axioms themselves; thus, to the modern way of thinking 
it is merely a matter of convenience in the systematic development of the theory that a particular 
axiomatic development is used. 


The parallel axiom. Euclid’s own formulation of this axiom or, as he calls it, postulate already 
makes it seem less self-evident than the others. 


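Euclid’s version of the parallel axiom: If a line cutting two other lines makes the interior angles
on the same side of this line together less than two right angles, then the two lines, if produced
indefinitely, meet on that side.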


A whole series of apparently self-evident statements therefore led to fallacious ‘proofs’
of this axiom. The majority of these statements turned out to be equivalent to the
axiom itself.


41-1 The parallel postulate of EUCLID


41-2 Statement of LEGENDRE


onius c ines are es oa 2, Procts(, 500 AD.) If 


The importance of this axiom was made clear, and the foundation was laid for both the modern 
interpretation of geometric axioms and of axiomatic systems in general, when GAUSS (1777-1855),
LOBACHEVSKII (1792-1856) and János BOLYAI (1802-1850) constructed, independently of one another,
a geometry in which the parallel postulate does not hold. This non-Euclidean geometry showed a
similar high degree of internal harmony and consistency as Euclidean geometry. Ten years after
the death of LOBACHEVSKII, BELTRAMI succeeded in finding the first realization of the essential parts
of this geometry on a curved surface, and later KLEIN fitted both Euclidean and non-Euclidean 
geometry into the larger framework of projective geometry. 


Axiomatic characterization of Euclidean geometry. Euclidean geometry is a categorical theory, 
in which every statement is either true or disprovable in the sense that the assumption of its truth 
leads to a contradiction. On the other hand, it is intuitively related to direct experience, and many 
theorems are the result of experiments with ruler and compass and similar instruments. This highly 
empirical aspect declined with the growing importance of analytical geometry, in which the Euclidean 
plane is identified with the set of pairs of real numbers. HILBERT clarified this connection completely 
and proved that his system of axioms is categorical, by showing that any two of its models are
isomorphic; the isomorphism type is that of the Euclidean plane over the field of real numbers. 
By extending his axiomatic interpretation to other number systems, HILBERT was able to use an 
axiomatic characterization of the real numbers, essentially based on discoveries of DEDEKIND, to 
prove that his system of axioms characterizing Euclidean geometry is complete. 




The basic concepts of Hilbert’s system of axioms for the Euclidean plane are point, line, incidence 
as a relation between points and lines, betweenness as a relation of triples of points, and congruence 
of line-segments and angles. The axioms are grouped in four sections: A axioms of incidence, B axioms 
of order, C axioms of congruence, D axioms of continuity, and in each section the concepts of the 
previous sections are used. For solid geometry a further concept, that of a plane must be introduced 
and the axioms must be modified. The notion of a categorical system of axioms was considerably 
deepened by TARSKI’s result on the completeness and decidability of Euclidean geometry, provided 
that one regards it as an elementary theory in which set variables do not occur, or are at least avoid- 
able in principle. Apart from certain considerations connected with continuity this can be done. 
The full Dedekind axiom of continuity has to be replaced by a continuity scheme that requires 
the existence of intersections only for those Dedekind cuts of lines that can be defined by expressions 
in the basic geometric concepts (see Chapter 15). TARSKI’s result is that every statement of ele- 
mentary geometry can be proved or disproved from these axioms by formal logic, and that there 
exists an algorithm to decide whether a given statement can be proved from the axioms and thus 
whether it is true or false. Such an algorithm, which could be realized on a machine, could even have 
practical uses, since there are many non-trivial problems of elementary geometry, such as tessellating 
problems or partition of polygons into simpler ones, that could then be solved mechanically. Just 
as decidability of elementary geometry reduces to that of the arithmetic of the real numbers by way 
of analytic geometry, so decidability has also been proved for other geometric theories, for example, 
for non-Euclidean (hyperbolic) geometry. 

Since the selection of the basic concepts and axioms of Euclidean geometry is to a large extent 
arbitrary and a matter of convenience or even personal taste, the question arose whether HILBERT’S 
choice could be simplified. 

If one expresses the concepts of line and incidence in terms of a three-variable relation, collinearity 
of points, col (A, B, C), whose intuitive meaning is that A, B and C lie on a single line, then the 
parallel axiom can be rephrased in the following manner. 


Parallel axiom in terms of collinearity: If A, B and P are non-collinear distinct points, then there 
exists a point Q such that col (A, B, R) and col (P,Q, R) cannot both hold for any point R; further, 
if Q’ is any other such point, then col (P, Q, Q’) holds.


Collinearity itself can easily be reduced to the relation of betweenness, because three points are 
obviously collinear if and only if one of them lies between the other two. It is somewhat harder 
to show that betweenness can be expressed in terms of collinearity. The metrical relations can also 
be reduced; indeed, all of HILBERT’s basic concepts can be formulated in terms of a single three- 
argument relation, for example, cir (A, B; C), which means that A and B are equidistant from C. 
On the other hand, it can be shown that it is impossible to base Euclidean geometry on a single 
two-argument point relation. 
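In the analytic model, where points are pairs of numbers, the relations mentioned above become simple computations. The following sketch is one possible coordinate realization, not the axiomatic treatment itself; the function names `col`, `between` and `cir` follow the text, while their implementation by determinants and scalar products is an assumption of this example.

```python
def col(a, b, c):
    """Collinearity: the determinant of the vectors b - a and c - a vanishes."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0]) == 0

def between(p, r, q):
    """True if r lies strictly between p and q on a common line."""
    if not col(p, q, r):
        return False
    # For collinear points, (r - p).(q - r) > 0 exactly when r is an
    # interior point of the segment pq.
    return (r[0] - p[0]) * (q[0] - r[0]) + (r[1] - p[1]) * (q[1] - r[1]) > 0

def col_from_between(a, b, c):
    """Collinearity of three distinct points, reduced to betweenness."""
    return between(a, b, c) or between(b, c, a) or between(c, a, b)

def cir(a, b, c):
    """cir(A, B; C): A and B are equidistant from C (squared distances compared)."""
    dist2 = lambda p, q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    return dist2(a, c) == dist2(b, c)
```

With integer or rational coordinates all comparisons are exact; with floating-point coordinates a tolerance would be needed.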


System of axioms for plane geometry based on motions. The group-theoretical foundation of 
Euclidean geometry, which is based on the concept of motion, differs from other axiomatic systems 
of plane geometry more or less closely related to Hilbert’s system in that the axioms of congruence 
are replaced by statements about motions. In proving the congruence propositions a number of 
metatheoretical statements are used. The basic concepts are point, line (as a distinguished set of 
points), betweenness, and motion. The axioms of incidence and of betweenness are retained initially, 
but the axioms of congruence are replaced by statements about properties of the group of motions. 
The following system of axioms leads to absolute geometry if the parallel axiom is omitted, and to 
non-Euclidean geometry if it is replaced by its negation. The symbol P | l means that the point P
and the line l are incident, in other words: ‘l goes through P’, ‘P lies on l’, or ‘l contains P’.


Axioms of incidence: $I_1$. To any two points there exists exactly one line passing through both of
them; every line contains at least two points. $I_2$. Not all points lie on a single line. $I_3$. Parallel
postulate: To any line l and any point P not on l there exists exactly one line through P that has
no point in common with l.


41-3 Axioms of betweenness $B_1$ and $B_3$


Axioms of betweenness: $B_1$. If R lies between P and Q, then R also lies between Q and P, and
P, Q, R are distinct points on a single line (Fig.). $B_2$. Of three distinct points on a line exactly
one lies between the other two. $B_3$. If R lies between P and Q, and Q lies between P and S, then
R lies between P and S (Fig.). $B_4$. If P, Q, and R are not collinear and if the line l intersects PQ
in a point Z between P and Q, then l contains R or a point S between R and P or between R and Q (Fig.).


41-4 Axiom of betweenness $B_4$




41-5 The half-planes bounded by l


41-6 The flag defined by the triple (P, h, H)


It is easy to show that under order automorphisms lines, half-lines and half-planes are mapped to 
lines, half-lines, and half-planes, respectively. In particular, flags are mapped to flags. Certain 
automorphisms are called motions, and these are required to satisfy the following axioms. 


Axioms of motion: $M_1$. If $\alpha$ and $\beta$ are motions, then the combination $\alpha \cdot \beta$ (first $\alpha$, then $\beta$) is also
a motion. $M_2$. The identity map 1 is a motion. $M_3$. If F and F’ are two flags, then there exists
exactly one motion taking F to F’. $M_4$. For any two points P and Q, there exists a motion inter-
changing them; for any two half-lines with a common vertex there exists a motion interchanging them.


A consequence of these axioms is that the motions form a group, and in particular, that if $\alpha$ is a
motion, then so is $\alpha^{-1}$; for if $\alpha$ maps an arbitrary flag F to F’, then by $M_3$ there exists a motion $\beta$
mapping F’ to F. The map $\gamma = \alpha \cdot \beta$ maps F to itself, but then $\gamma$ must be the identity, because
otherwise $\gamma^2 \neq \gamma$ would be a second motion mapping F onto itself. Another consequence is that a
motion is uniquely determined by three non-collinear points and their images.




The main theorem on reflections states that the product of three reflections in lines having a point
or a perpendicular in common, is again a reflection (Fig.). These and all other statements can be
proved without the use of the parallel postulate, and are therefore also valid in non-Euclidean
geometry. For proofs avoiding the use of the parallel postulate it is particularly convenient to introduce
a calculus of reflections. Instead of the partly group-theoretical system of concepts introduced above,
one now starts out exclusively from the system J’ of reflections, which is regarded as a generating
set consisting of involutions for a group (the group of motions). Lines are identified with elements
of J’, so that the elements of J’ are also called lines. Points are those products $l \cdot m$ of two lines l
and m that are again involutions, that is, for which $(l \cdot m)(l \cdot m) = 1$. This is equivalent to
$l \cdot m = m \cdot l$. Thus, intuitively, points are identified with the reflections in them. A point P is
incident with a line l if $P \cdot l = l \cdot P$. When all the other basic concepts are defined in the same
manner, then geometric theorems simply become calculating rules for the group generated by J’.
This calculus of reflections represents a new process of transforming geometry into algebra apart
from classical analytical geometry.


41-7 Three successive reflections whose axes pass through a common point or have a common
perpendicular give together a reflection
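The rules of the calculus of reflections can be tried out in the Euclidean coordinate plane. The sketch below assumes that particular model: a reflection in the line a·x + b·y = c is written as a 3×3 homogeneous matrix with exact rational entries. All function names are hypothetical, and `compose(first, then)` follows the convention "first the one, then the other" used for motions.

```python
from fractions import Fraction

IDENTITY = tuple(
    tuple(Fraction(int(i == j)) for j in range(3)) for i in range(3)
)

def reflection(a, b, c):
    """Homogeneous matrix of the reflection in the line a*x + b*y = c."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    n2 = a * a + b * b
    return (
        (1 - 2 * a * a / n2, -2 * a * b / n2, 2 * a * c / n2),
        (-2 * a * b / n2, 1 - 2 * b * b / n2, 2 * b * c / n2),
        (Fraction(0), Fraction(0), Fraction(1)),
    )

def matmul(m1, m2):
    return tuple(
        tuple(sum(m1[i][k] * m2[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )

def compose(first, then):
    """The motion 'first, then'; as matrices this is then * first."""
    return matmul(then, first)

def is_involution(m):
    return m != IDENTITY and matmul(m, m) == IDENTITY

def is_line_reflection(m):
    """An involution whose linear part has determinant -1."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return is_involution(m) and det == -1

l = reflection(1, 0, 0)    # the line x = 0
m = reflection(0, 1, 0)    # the line y = 0, perpendicular to l
n = reflection(1, -1, 0)   # the line x = y, through the origin
k = reflection(1, 0, 1)    # the line x = 1, not through the origin

P = compose(l, m)          # the point (0, 0), i.e. the reflection in it
```

In this model `P` commutes with `n` but not with `k`, which mirrors the incidence rule; products such as `compose(compose(l, n), m)` illustrate the main theorem on reflections for three lines through a common point.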


The introduction of coordinates. Apart from the complete axiomatic characterization of a single 
specific model of a geometry, the relations between the classes of models for geometric statements 
and their algebraic characterizations are also important. This kind of problem leads to questions 
about minimal axioms necessary to ensure the existence of certain standard procedures from model 
theory or mapping theory, for example, for the introduction of coordinates. The theories that have 
been created to tackle such problems have a geometrical terminology, but in their methodological 
development they resemble those of group theory or lattice theory; examples are the theory of vector 
spaces, or those of affine or projective planes. 

The models of the axioms $I_1$, $I_2$ and $I_3$ form the class of affine planes, which have not yet been
completely classified. Among them the translation planes are distinguished by the fact that to any 
two points there exists at least one translation, or parallel shift, carrying one point to the other. 
They also are required to satisfy the minor theorem d of Desargues. 


Minor theorem d of Desargues: If the corresponding pairs of vertices of two triangles lie on parallel 
lines and two pairs of corresponding sides are parallel, then so is the third pair (Fig.). 


It can be shown that the translations of a translation plane T form a vector space over a skew
field K (the scalar field of T), whose dimension over K is even or infinite. The most elegant way
of defining the scalars is as special linear mappings of the group of translations.

The dimension of the vector space of translations is precisely 2 if the plane T is a Desarguesian plane,
that is, if the major theorem D of Desargues holds.


Theorem D of Desargues: If the lines through the corresponding vertices of a pair of triangles
intersect in a single point, and two corresponding pairs of sides are parallel, then so is the third
pair (Fig.).


41-8 Desargues’ theorem D and the minor version d 


41-9 Pappus’ theorem P 


This statement D is required only for the proof that for all $a, b \in T$ with $a \neq o$ and $a \parallel b$ there is
at least one linear mapping $\alpha$ of the vector group that takes a to b. Thus, if an origin O and two
coordinate axes l and m through O are chosen, any point P can be represented uniquely by its
position vector, and this vector can be decomposed into its components relative to l and m. It is
clear that one obtains coordinates by associating elements of the skew field (scalars) with the vectors
on a line, and thus with the points on it. If one chooses an arbitrary vector $e \neq o$ on the line, then
to every parallel vector a is associated that scalar which takes e into a.

The propositions d and D are examples of so-called closure theorems. Another pair of closure 
theorems are the theorems P and p of Pappus. The validity of P is a necessary and sufficient condition 
for the field of scalars K to be commutative. 

Major theorem P of Pappus: If alternate vertices of a closed hexagon lie on two lines $l_1$ and $l_2$
and if none of them lies on both lines, and if the lines of two pairs of opposite sides are parallel, then
the lines of the third pair are also parallel (Fig.). The minor theorem p of Pappus makes the same
statement under the added condition that $l_1$ and $l_2$ are parallel.
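Over a commutative field the major theorem of Pappus can be verified by direct calculation. The sketch below assumes the coordinate plane over the rational numbers, with $l_1$ taken as the x-axis and $l_2$ as the y-axis; the hexagon is $A_1 B_2 A_3 B_1 A_2 B_3$, and all names and sample coordinates are illustrative.

```python
from fractions import Fraction

def parallel(p, q, r, s):
    """True if the line pq is parallel to the line rs."""
    return (q[0] - p[0]) * (s[1] - r[1]) - (q[1] - p[1]) * (s[0] - r[0]) == 0

def on_y_axis_parallel_to(through, p, q):
    """Point where the line through `through` parallel to pq meets the y-axis."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    t = -through[0] / dx
    return (Fraction(0), through[1] + t * dy)

# Vertices A1, A2, A3 on the x-axis and B2 on the y-axis, none at the origin.
a1 = (Fraction(2), Fraction(0))
a2 = (Fraction(3), Fraction(0))
a3 = (Fraction(5), Fraction(0))
b2 = (Fraction(0), Fraction(7))

# Choose B1 and B3 so that two pairs of opposite sides are parallel.
b1 = on_y_axis_parallel_to(a2, a1, b2)   # side B1A2 parallel to side A1B2
b3 = on_y_axis_parallel_to(a2, a3, b2)   # side A2B3 parallel to side B2A3

# The theorem asserts that the third pair, A3B1 and B3A1, is then parallel.
third_pair_parallel = parallel(a3, b1, b3, a1)
```

The check succeeds because the products of the coordinates involved commute, which is the algebraic content of the connection between P and the commutativity of the scalar field.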


If the axioms of incidence hold, then these closure theorems are logically dependent on one
another in the following order: $P \Rightarrow D \Rightarrow d \Rightarrow p$. It is an open problem whether the last arrow
can be reversed; the first two are not reversible. However, for finite affine planes one has $P \Leftrightarrow D$,
because by Wedderburn’s theorem every finite skew field is commutative. A geometrical proof of this
fact has not yet been found.




Non-Euclidean geometry 


Decisive connections between Euclidean, non-Euclidean and projective geometry were discovered 
around 1860 by Arthur CAYLEY (1821-1895) and were further developed a decade later by Felix 
KLEIN (1849-1925). They justify Cayley’s statement: ‘Projective geometry is all geometry’. 

Projective geometry can be described by a system of axioms of incidence, order and continuity, 
that differs from the axioms of Euclidean geometry principally in the following points: any two 
lines intersect; the axioms of betweenness are replaced by axioms of a four-argument separation 
relation between pairs of points, because a projective line must always be regarded as cyclically 
closed. 

The transition from projective to Euclidean or non-Euclidean geometry is effected by the introduc- 
tion of the concepts of parallelism and orthogonality. 

Two lines or planes are called parallel if they intersect in the improper or ideal plane of the space. 
Thus, in Euclidean geometry there is only one parallel to a given line through a point not on the 
line, because there is only one line through the point and the improper point on the given line. 
If an absolute polarity, that is, a polarity without a fundamental curve, is introduced in the improper 
plane, orthogonality can be defined for Euclidean space. 

Now instead of a plane, an arbitrary set can be declared to be improper, for example, a quadric 
in projective space, which then splits into the inside and outside of this surface, for example, relative 
to a sphere or ellipsoid. A polarity of the whole space with the improper surface as its distinguished 
quadric defines ‘orthogonality’ for lines and planes in the interior of the surface, by defining a 
line to be orthogonal to a plane if it goes through the pole of the plane. One sees further that in a 
plane, which now consists only of the part on the interior of the fundamental surface, there are 
several parallels to a given line through a point not on the line. In this way one obtains the hyperbolic 
geometry discovered by GAuss, BOLYAI and LOBACHEVSKII. 

If no surface at all is distinguished in projective space and orthogonality is defined by an arbitrary 
polarity on the whole space without a fundamental surface, then the resulting non-Euclidean geometry 
is elliptic geometry, which was first investigated by Bernhard RIEMANN (1826-1866). 


Hyperbolic geometry. Plane hyperbolic geometry is obtained most easily if in the system of axioms 
for Euclidean geometry the parallel axiom is replaced by the following axiom: to any given line 
and any point not on that line there are at least two lines through the point that do not intersect the 
given line. 

To obtain a model for this geometry one attempts to define orthogonality by a polarity with a
fundamental curve (see Chapter 25.). In the case of a plane one chooses a circle as the funda-
mental curve (Fig.). The points on the circumference are distinguished as the improper points of
the model. The points and chords in the interior are the proper points and lines of the hyperbolic
geometry (in the following they are called h-points and h-lines). It is easy to check that the axioms
of incidence and order are satisfied by h-points and h-lines, except for the parallel postulate. For
example, given an h-line PQ and an h-point R, both RU and RV are ‘parallel’ to PQ; U and V are
not h-points, but one could also take the h-lines a, b and c. To define h-orthogonality one has
recourse to the polarity in the Euclidean plane for which the circle is the fundamental curve. The
h-line ST is h-orthogonal to the h-line P’Q’ if the pole A of the Euclidean line P’Q’ lies on the
Euclidean line ST, and the pole B of the Euclidean line ST lies on the Euclidean line P’Q’. Since the
polar of any point on the Euclidean line P’Q’ passes through A, one need only draw a line through S
and A or through T and A in order to drop an h-perpendicular to P’Q’ or to erect an h-perpendicular
in S. The h-congruence is defined as follows: two h-segments PQ and P’Q’ are called h-congruent
if the absolute values of the logarithms of the cross-ratios formed by the intersections of their
lines with the fundamental circle are equal: PQ is h-congruent to P’Q’ if $|\ln D(P, Q; U, V)| =
|\ln D(P’, Q’; U’, V’)|$. This definition also determines, in the presence of continuity, the way of
applying h-congruent segments. Having developed on this basis a definition for the h-congruence
of h-angles one can show that the sum of the angles in an h-triangle is less than two right angles
and that two h-triangles are h-congruent if they agree in the three h-angles.
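The congruence criterion by cross-ratios is easy to evaluate in coordinates. The following sketch assumes the unit circle as fundamental curve and measures an h-segment by the quantity |ln D(P, Q; U, V)| itself, without any normalizing factor; the function names are hypothetical, and P and Q are assumed to be distinct interior points.

```python
import math

def circle_parameters(p, q):
    """Parameters t_u < t_v at which the line p + t*(q - p) meets the unit circle."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    a = dx * dx + dy * dy
    b = 2.0 * (p[0] * dx + p[1] * dy)
    c = p[0] ** 2 + p[1] ** 2 - 1.0
    root = math.sqrt(b * b - 4.0 * a * c)
    return (-b - root) / (2.0 * a), (-b + root) / (2.0 * a)

def h_measure(p, q):
    """|ln D(P, Q; U, V)| for the h-segment PQ inside the unit circle."""
    t_u, t_v = circle_parameters(p, q)
    # On the line, P has parameter 0 and Q has parameter 1, so the
    # cross-ratio (PU/QU) : (PV/QV) can be formed from the parameters.
    cross_ratio = (t_u * (t_v - 1.0)) / ((t_u - 1.0) * t_v)
    return abs(math.log(cross_ratio))

def h_congruent(p, q, p2, q2, tol=1e-12):
    return abs(h_measure(p, q) - h_measure(p2, q2)) < tol
```

For instance, the segment from the centre to (0.5, 0) has measure ln 3, it is h-congruent to its rotated copy from the centre to (0, 0.5), and the measure is additive along an h-line.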

The preceding arguments give essential information on the group of transformations which 
underlies hyperbolic geometry. According to KLEIN the question is: under what transformation 
groups are the concepts just defined invariant? — To answer this, the model is again interpreted as 
an object in the Euclidean plane. First of all it is clear that the relevant mappings must be collineations 
that carry both the circle and its interior onto itself. Examples of such mappings are rotations of 
the circle about its centre and reflections in a diameter; but translations or dilations and contractions 
are to be excluded. Now the collineations just characterized form a group. It is enough to note, in 
addition, that the mappings of this group preserve congruence of segments and angles. The in- 
variance of congruence of segments is just a matter of definition by means of the cross ratio D. 
A pair of h-orthogonal lines goes into another such pair, because polarity is preserved. In these 
automorphic collineations of the circle one has the congruence group of the model in question. 




41-10 Model of a hyperbolic geometry 


41-11 Model of an elliptic 
geometry 


Elliptic geometry. This geometry cannot be obtained from Euclidean geometry in the same simple 
way as hyperbolic geometry by omitting or modifying a single axiom. For it follows from the axioms 
of incidence, order and congruence of plane Euclidean geometry that to any line there exists another 
one that does not intersect it. Thus, these sets of axioms must be changed; in particular, any two 
lines should always intersect. Elliptic geometry is essentially identical with spherical geometry. 
One regards the surface of a sphere as the elliptic plane, the el-lines are the great circles of the
sphere and an el-point is a pair of diametrically opposite points of the sphere (Fig.).

Two el-lines always have an el-point in common, because any two great circles on the sphere
intersect in two diametrically opposite points. The el-points (N, S) and (M, T) determine a unique
el-line, namely (NMST), but one can drop infinitely many el-perpendiculars from the el-point
(N, S) to the el-line (MOTP). If one chooses as elliptical distance between two points the length
of the shorter arc on the great circle connecting them, then $\pi/2$ is the greatest distance in elliptic
geometry. The sum of the angles in an el-triangle is always greater than two right angles.
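The elliptic distance can be computed from any representatives of the two el-points. The sketch below assumes a sphere of radius 1 and represents an el-point by one of its two diametrically opposite points, given as a vector; the function name is hypothetical.

```python
import math

def el_distance(u, v):
    """Length of the shorter great-circle arc between the el-points of u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Taking |cos| identifies each point with its antipode, so the result
    # is min(theta, pi - theta) and never exceeds pi/2.
    cos_abs = min(1.0, abs(dot) / (norm_u * norm_v))
    return math.acos(cos_abs)

north, south = (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)
equator_point = (1.0, 0.0, 0.0)
```

Here `north` and `south` represent the same el-point, and an el-point on the equator has the greatest possible distance from it.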

In Chapter 17. it was shown that the rotations of a sphere about its centre form a group. In the 
present case this group of transformations of the sphere is the group of congruence transforma- 
tions of elliptic plane geometry. 


42. Foundations of mathematics 


The traditional trends of thought in the philosophy of mathematics
Some main results in the foundation of mathematics


The modern development of the foundations of mathematics began, together with that of mathema- 
tical logic, towards the end of the last century. Today the two fields of study are still closely con- 
nected. 

The problems of the foundations of mathematics or ‘meta-mathematics’ include a broad spectrum 
of questions, from the scientific investigation of special mathematical disciplines to philosophical 
questions on the nature of mathematical statements and mathematical knowledge. As far as the 
kind of question permits, the analysis and clarification of the relevant problems proceed with the 
same precision and rigour as is necessary for the successful treatment of mathematical problems. 
Of course, mathematical tools cannot always be used here without restriction, since the matter 
under discussion is often precisely the clarification of the foundations of the very tools. The fol- 
lowing presentation can give only a first introduction to this type of problem, and can only take 
account of certain points of view. The reader especially interested in questions on the foundations 
of mathematics is referred to the comprehensive report by Andrzej MOSTOWSKI (1913-1975) ‘Thirty
years of foundational studies’ in Acta Philosophica Fennica (Fasc. XVII, 1965). 

It may be asserted that meta-mathematics presents the most advanced scientific theory of a specific 
scientific discipline. This is due to the fact that mathematics, to a greater extent than other sciences, 
was faced at an early stage with the problem of a critical analysis of its foundations, above all owing 
to the high degree of abstraction from concrete objectivity and the resulting demand for precision
in mathematical definitions and deductive reasoning. The system of concepts of a mathematical
theory through which certain properties of objective reality are reflected is almost always an idealiza- 
tion or abstraction on a set-theoretical basis, notwithstanding the fact that the names of these 
abstract concepts are usually borrowed from their narrower concrete usage, for example, the abstract 
concepts of a set, measure, point, algorithm and automaton. 

Mathematics of the 20th century has considerably changed in character compared with mathematics 
up to the turn of the century. Almost all mathematical disciplines are treated in an axiomatic- 
deductive way, so that the admissible rules of mathematical inference are precisely laid down 
(see Chapter 15.). However, in the presentation of their substance most theories use not only 
pure logic, but also certain parts of set theory or elementary arithmetic, as long as this is not 
expressly forbidden. Among other things, for reasons of economy of thought in discovering 
knowledge and carrying out the technicalities of proofs, it is quite common in the practical treat- 
ment of a special mathematical discipline to go beyond the narrow framework of this discipline 
and its characteristic language and thereby to develop de facto part of the metatheory of this 
discipline. 

One of the most striking problems of meta-mathematics is the truth problem for mathematical 
statements. Is it a fact that each mathematical proposition stated to be true is really the description 
of an objective feature, or are at least some such statements, for example, the theorem on the pos- 
sibility of well-ordering the real numbers, mere nominalistic constructions, which can be justified 
only by a number of useful applications? — This problem will later be treated in more detail. 


The traditional trends of thought in the philosophy of mathematics 


To a certain extent, mathematical investigations have always been connected with a critical 
analysis of their foundations, corresponding to the state of knowledge at the time; this applies not 
only to the Greek period, but also to the mathematics of the Middle Ages and the early bourgeois 
society. Since it is customary nowadays to refer mainly to the results of the 19th century in the 
foundations of mathematics, these will now be sketched briefly. 

After a period of rapid development, about the middle of the last century a critical scrutiny of 
the foundations of analysis took place. Apart from A. CAUCHY (1789-1857), C. F. GAUSS (1777-1855),
K. WEIERSTRASS (1815-1897) and B. BOLZANO (1781-1848), the names of R. DEDEKIND
(1831-1916) and G. CANTOR (1845-1918) must be mentioned. One main problem was an exact 
definition of the concept of a real number. The clarification of the number concept given by 
DEDEKIND, WEIERSTRASS, and CANTOR is part of the stock in trade of mathematicians nowadays. 


1. Logicism. Independently of each other, G. FREGE (1848-1925) and DEDEKIND founded 
their theories of the natural numbers ‘logically’, or as we should say today, on a set-theoretical 
basis. Above all, FREGE aimed at founding the theory of natural numbers, and then successively 
the whole of mathematics, on the laws of pure thought and therefore on those of logic. In the ter- 
minology of KANT this would mean proving the analytical character of mathematical propositions. 
FREGE established the connection between logic and mathematics. He laid the foundation of a 
programme which is usually known as logicism. B. RUSSELL (1872-1970) observed that Frege’s 
structure was inconsistent and developed an improved system, which is set forth in the well- 
known work Principia Mathematica (jointly with A. N. WHITEHEAD). Its essential content is 
ultimately the proof that the whole of mathematics can be developed on the basis of the set-theoretical 
(ramified) theory of types. 

A variant of logicism, a radical interpretation which aims at regarding mathematics, as it were, 
as a result of rational thought, is mathematical Platonism; it appears, for example, in the ideas of 
CANTOR on the foundation of set theory. In this interpretation sets are ideal objects, which exist 
independently of intellectual activity. The task of the mathematical investigator is to track down 
the laws that prevail in this very general world of objects, the Cantor universe. 

In a certain sense logicism is absorbed in the set-theoretical foundation of mathematics, which 
will later be described in more detail. 


2. Formalism. The formalistic interpretation arose as the answer to the difficulties in the theory 
of knowledge expressed in logicism. A decisive step in this direction was the book, published in 1899, 
Grundlagen der Geometrie (Foundations of geometry) by D. HILBERT (1862-1943). Here it was 
shown for the first time by the example of geometry what is to be understood by formal axiomatics 
and its metalogical analysis. The programme of a formalistic foundation of mathematics was finally 
formulated by HILBERT in 1920 and taken in hand by him and his school. According to this programme, 
even such mathematical domains as number theory, analysis, and set theory which, by their nature, 
are at first sight specified by their contents, are to be understood as formal theories. The first problem 
for a student of the foundations is to establish formally that the system is consistent, that is, to
show conclusively that a statement and its negation are not both derivable from the axioms 
by means of the laws of logic. This task is to be accomplished by methods whose reliability is beyond
all doubt. To these methods, which HILBERT called finite, there belong elementary combinatorial
methods, in particular, the principle of proof by mathematical induction, but not those of transfinite 
set theory, the so-called infinitistic methods. The results of these endeavours (up to 1938) were 
recorded in the two-volume work Grundlagen der Mathematik (Foundations of Mathematics) by 
HILBERT and P. BERNAYS, which next to Principia Mathematica ranks among the most significant
works on the foundations of mathematics of this century. 

HILBERT’s original attempt had to be revised on account of the results of K. GÖDEL (1906-1978).
One of these results states that any proof that a formal system is free from contradictions necessarily 
requires methods beyond those provided by the system itself; accordingly, one cannot prove that 
number theory is free from contradictions by means of finite methods in the strict sense. 

Today there is still no clear agreement on the type and the range of admissible extensions of 
finite methods; one of these possibilities is the incorporation of recursive functionals. It is not 
possible to go further into these questions here, except to say that finite investigations are not limited 
to the question of freedom from contradiction, but refer, for example, to decision problems and 
generally to the analysis of the finite core in fundamental mathematical and meta-mathematical 
results of infinitistic nature. 


3. Intuitionism. This point of view, which is totally opposite to the logicistic and formalistic
interpretations, was founded by L. E. J. BROUWER (1881-1966). Similar ideas had been put forward
earlier by L. KRONECKER (1823-1891) and H. POINCARÉ (1854-1912). The following points of
view are characteristic of intuitionism: 1. Rejection of the actual infinite. 2. The postulate of 
effective constructability as the only means of defining mathematical objects. 3. The original material 
of the construction consists of the natural numbers, and these are to be regarded only as a potentially 
(uncompleted) given aggregate. 4. Limitation of the classical logical principles in their application 
to infinite aggregates. 

To explain at least one of these points of view, define 


$$a_n = \begin{cases} 0 & \text{if } 2(n + 1) \text{ is the sum of two prime numbers,} \\ 1 & \text{otherwise.} \end{cases}$$


The real number $g = 0.a_1 a_2 \ldots$ could be called the Goldbach number; it can be calculated to any
degree of accuracy, yet it is not known whether $g = 0$ or not. Since one cannot be certain whether
Goldbach’s problem (see Chapter 31.) will ever be solved, there is some justification for the
argument that it makes no sense to say that either $g = 0$ or $g \neq 0$. However, this obviously
implies a certain limitation on the tertium non datur, the law of the excluded middle. 
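
That the Goldbach number can be calculated to any degree of accuracy may be illustrated by a short program. This is only a sketch: the function names are chosen here for illustration, and simple trial division is assumed to be adequate for the small numbers involved.

```python
def is_prime(k):
    """Trial-division primality test, adequate for small k."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

def goldbach_digit(n):
    """Return a_n: 0 if 2(n + 1) is the sum of two primes, 1 otherwise."""
    even = 2 * (n + 1)
    for p in range(2, even // 2 + 1):
        if is_prime(p) and is_prime(even - p):
            return 0
    return 1

def goldbach_prefix(num_digits):
    """The digits a_1 a_2 ... a_num_digits of g, as a string."""
    return "".join(str(goldbach_digit(n)) for n in range(1, num_digits + 1))

print("g = 0." + goldbach_prefix(30) + "...")
```

Each digit is obtained by a finite search, so any prefix of g is effectively computable; what no such computation can settle is whether a digit 1 ever occurs.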


4. The present situation. None of the schools of thought mentioned above has been able to achieve 
its original aim. In spite of this, the treatment of meta-mathematical questions from different points 
of view has brought to light valuable insights and results, which were not originally intended. The 
questions of decidability posed by meta-mathematics have led at an early stage to a precise formulation
of computability and the concept of an algorithm. The formal mathematical languages, which were 
made precise by HILBERT and his school, are fundamental for the construction of algorithmic 
languages (for example, ALGOL or FORTRAN) and these examples could be increased indefinitely. 
Today an individual scholar can rarely be classified as belonging to a definite direction, rather he 
follows a dialectic course, by studying questions and results from different, partially contradictory, 
standpoints. The observance of certain differentiated constructivity postulates in the course of an 
investigation is not so much the expression of a certain philosophical position as of the methodological 
principle of not unnecessarily overstepping the bounds of secure knowledge. 


Some main results in the foundation of mathematics 


1. The set-theoretical foundation of mathematics. What remains of the original programme of 
logicism is the knowledge that the whole of mathematics can be built up on the basis of axiomatic 
set theory. This means that today every existing mathematical theory, irrespective of whether it 
has an axiomatic character or concerns a definite domain of objects, can be regarded as a partial 
domain of axiomatic set theory, suitably limited by the aims of the relevant theory. 

The essential contributions to the axiomatic foundation of set theory are due to B. RUSSELL (1872-1970),
E. ZERMELO (1871-1953), J. VON NEUMANN (1903-1957), A. FRAENKEL (1891-1965),
P. BERNAYS (1888-1977) and K. GÖDEL (1906-1978). The Bourbaki school has undertaken a well
thought-out methodical arrangement of mathematics from the set-theoretical point of view and 
has thereby essentially contributed to its popularization. 

The language of axiomatic set theory in, say, the Zermelo-Fraenkel system is a very simple predicate 
language with the single predicate sign ∈. The axioms correspond to the few so-called principles
of the contents of set theory listed in the chapter on Set Theory. The rules of deduction are the formal
rules of natural inference given in the chapter on Elements of Mathematical Logic, or rules reducible
to them. The definition of concepts proceeds only by the rules of explicit definition; other, recursive 
or implicit, definitions are reducible to explicit definitions within the framework of axiomatic set 
theory. 




The mathematician is not obliged on principle to go beyond the frame of formal set theory, but 
this naturally applies only to mathematical investigations as such and not to their application to 
physical or other extra-mathematical processes. The question of a suitable mathematical model for 
processes of this kind is not, strictly speaking, a mathematical question. 

It should be observed that many mathematical theories have a complicated formal underlying 
apparatus; it is not always clear how they can be reduced to set theory and how their language 
can be adequately ‘coded’ in the simple language of set theory. This applies as a rule even to the 
presentation of the contents of set theory itself. Compare, for instance, the examples of forming 
sets in the chapter on Set Theory, most of which need to be made precise within the frame of axiomatic 
set theory. To form the set of all subsets of the set of real numbers, one has to have a definition of 
the concept of real number inside axiomatic set theory. This in turn assumes a definition of the
concept of the set of natural numbers and leads first and foremost to an axiomatic analysis of the concept
of finiteness within the frame of general set theory. 

In addition, all the concepts used in the semantics of formal languages can be defined set-theoretic- 
ally; this holds, in particular, for the linguistic objects themselves. These have been explained as 
finite sequences of certain ‘symbols’. Just as in set-theoretical topology the elements of a set struc- 
tured in a certain way are called points, so the elements of an arbitrary (usually denumerable) set 
can be designated as symbols. 


2. Criticism of the set-theoretical foundation based on results in the foundations of mathematics. 
The question arises whether the possibility of a set-theoretical basis for mathematics as a whole 
can be regarded as throwing sufficient light on the problem of a meta-mathematical foundation of 
mathematics. Although by the reduction of the whole of mathematics to set theory the meta-mathema- 
tical problems appear to be reduced to a large extent to those of axiomatic set theory with its 
simple language and easily comprehensible axioms, this would in fact be a fallacy, for various 
reasons some of which will now be briefly discussed here. 

(i) One immediate objection is the question of consistency of the formal system of set theory. In 
respect of actual experience the set-theoretical axiom system is on too high a level of abstraction 
to be able to speak of a direct verification. A kind of empirical control exists at best for certain 
consequences of these axioms, for example, for existence statements on solutions of differential 
equations with certain boundary conditions. 

Hence it is not very surprising that set theory, just at the beginning of its development, had to
eliminate a number of serious paradoxes in its system of concepts, which despite their removal
here caused a permanent mistrust on the part of many mathematicians in too free a use of infinitistic
methods.

(ii) A further objection concerns the incompleteness in principle of the set-theoretical axiom system,
to be discussed in the next section, in the sense that in any far-reaching (recursive) axiom system
there are statements that are independent of this axiom system. There is therefore no hope of
completely grasping the intuitive universe of sets even approximately, through a chosen fixed axiom system.

(iii) Despite the facts mentioned above it could still be assumed that a certain object domain U 
(the intuitive set universe) corresponds to an accepted set-theoretical axiom system A, and that any 
set-theoretical statement is either true (that is, valid in U) or not. It is then possible to speak of the 
structure <A, U> within a suitable meta-language L*. In L* a well-known result of T. SKOLEM 
(1887-1963) — known as the Skolem paradox — can now be formulated; it states that there are several 
non-isomorphic models not only of the axiom system A, but even of the syntactically complete 
system of all statements valid in U (see Chapter 15.). But then the idea of a standard model, that 
is, a model of the axiomatic set theory that is distinguished in a certain way, becomes completely 
doubtful. 

These objections make it clear that the aim of a logical-empirical foundation of mathematics, 
particularly in the classical form of Cantor’s Platonism, is unattainably far removed. One is therefore 
justified in asking whether a universal set-theoretical foundation of mathematics is a factual require- 
ment, or whether principles of a more constructive character are perhaps sufficient for this purpose; 
as regards the actual application of mathematical methods in extra-mathematical fields, on closer 
inspection essentially only constructive methods prove practicable. In the present state of things 
it must certainly be said that the infinitistic methods of set theory cannot be abandoned. Metaphorically 
speaking, the guns of infinitistic set theory have so far an unsurpassed range. This applies particularly 
to the capture of the constructive methods of the applications of mathematics. Furthermore, it 
can be said that — despite the fact that the Cantor universe proves to be a fiction on closer analysis — 
the mathematician, particularly one who studies the foundations, obtains his results, as a rule, 
only on the basis of a certain intuitive idea of an abstract mathematical reality. 


3. Incompleteness of axiomatic theories and the indefinability of the concept of truth. In judging 
a mathematical theory T created with the object of providing a model for a certain domain of ob- 
jects, for example, physical space or certain physical or economic processes, the only significant 
thing is the success. Since T represents only one consciously chosen idealization of a real process,
the question of the truth of statements in T is of secondary importance. However, the truth problem
becomes relevant for the whole of mathematics, being a closed science. The same is true for any 
branch, for example, the theory of natural numbers or set theory, that by its origins is not an axiomatic 
theory, but the description of a certain possibly abstract domain of objects. 

To be sure, a great many mathematical statements, in spite of their abstract character, have an 
immediate relationship with reality. Consider, for example, the following theorem, whose validity 
is evident: ‘if there exists a division of a finite set S into n disjoint classes $C_1, \ldots, C_n$ in which each
class contains exactly m elements, then there also exists a division of S into m disjoint classes with n
elements each’.
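
The evident character of this theorem shows in the fact that the second division can be written down at once. A minimal sketch follows; the function name regroup and the sample sets are assumptions made here for illustration, and the elements are assumed to be sortable so that an order can be fixed inside each class.

```python
def regroup(classes):
    """Turn n disjoint classes of m elements each into m classes of n elements each."""
    m = len(classes[0])
    if any(len(c) != m for c in classes):
        raise ValueError("every class must contain exactly m elements")
    ordered = [sorted(c) for c in classes]   # fix an order inside each class
    # The j-th new class takes the j-th element of every old class.
    return [{row[j] for row in ordered} for j in range(m)]

old = [{1, 2, 3}, {4, 5, 6}]   # n = 2 classes with m = 3 elements each
new = regroup(old)
print(new)                     # m = 3 classes with n = 2 elements each
```

The construction is entirely finite and explicit, which is what distinguishes this statement from the existence statements discussed next.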

The situation is quite different with respect to the statement, widely accepted nowadays, that
‘there exists a well-ordering relation on the set of real numbers’, and generally for existence statements 
in which nothing is said about the method of construction of the object in question. 

If U is a certain domain of objects (universe of discourse), and L a formalized language over U, 
then it is known to be possible (see Chapter 15.) within the frame of a metatheory over L and U 
to make precise the concept of validity or truth of a statement of L in U. The first question is
whether there is a codifiable axiom system A such that the set of statements derivable from A by
the rules of formalized reasoning coincides with the set of true statements over U. In some cases
this is in fact possible, for example, when U is a finite universe of discourse or when the language
L is so lacking in expressive power that it does not even permit the formulation of complicated
properties of U.

The following refers to a domain U that contains the natural numbers and to a language L in 
which the arithmetic of the natural numbers is expressible. Under these assumptions the first
incompleteness theorem of Gödel holds, according to which any axiom system A formulated in L that
consists of finitely many or, more generally, of a recursive set of axioms is incomplete in the sense
that not all true statements in U can be derived from A. A further fundamental result is the theorem
of A. TARSKI (1901-1983) that under the given assumptions no predicate W(x) is definable in L
such that for an object a of U the statement W(a) is true in U if and only if a is the code number 
of a true statement in U. 

For the proofs of both theorems a codification, also called an arithmetization or Gödelization, of
the language L by natural numbers is carried out. This is done in such a way that firstly natural
numbers are assigned to the fundamental symbols so that sequences of symbols correspond to
certain finite sequences of natural numbers. In the second stage the finite sequences of natural
numbers are put in one-to-one correspondence with the natural numbers. The natural number that
corresponds in this way to an expression H is called the code or Gödel number of H and denoted
by H*.
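
The two stages of such a coding can be sketched as follows. The alphabet SYMBOLS is purely hypothetical, and the second stage uses the familiar coding by prime powers, which is injective but does not exhaust the natural numbers; it is one common choice and not necessarily the coding intended above.

```python
SYMBOLS = ["0", "S", "+", "*", "=", "(", ")", "x", "~"]   # hypothetical alphabet
CODE = {s: i + 1 for i, s in enumerate(SYMBOLS)}          # stage 1: symbol -> number >= 1

def primes():
    """Yield 2, 3, 5, ... by trial division."""
    found = []
    k = 2
    while True:
        if all(k % p for p in found):
            found.append(k)
            yield k
        k += 1

def godel_number(expression):
    """Stage 2: code the sequence c_1, ..., c_r as 2**c_1 * 3**c_2 * ... ."""
    number = 1
    for p, symbol in zip(primes(), expression):
        number *= p ** CODE[symbol]
    return number

def decode(number):
    """Recover the expression from a valid code by reading off prime exponents."""
    symbols = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        symbols.append(SYMBOLS[exponent - 1])
    return "".join(symbols)

print(godel_number("0=0"))
print(decode(godel_number("S(0)+x=x")))
```

Since coding and decoding are both effective, statements about expressions of L can be replaced by arithmetical statements about their code numbers.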

Let L, U and their semantic correlation be included in a new universe of discourse Ū, and let L̄ be
an adequate language for Ū. Thus L is called the object language of U, and L̄ the meta-language
of the system <L, U> (Fig.).

The coding of L makes it possible to project certain predicates of Ū, which in the first instance are
only metalinguistically expressible, into the object language L.

An example of a predicate in the metaobject domain is the one-place predicate ‘the statement H
is provable (from A)’. To this there corresponds a certain arithmetical predicate B(n), which is
true for a natural number n if and only if n is the code number of a provable statement. Under
the assumptions made on U an expression Nb(v) can now be constructed so that on substitution
of a natural number n for the variable v the statement Nb(n) says: ‘the statement with the code
number n is unprovable (from A)’. By means of a further device, a so-called diagonal argument,
one can now find a natural number m so that m = Nb(m)*. The statement Nb(m) can then be
regarded as a self-referring statement with the meaning ‘I am unprovable’.
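
The diagonal device can be imitated with character strings in place of code numbers. This is only an analogy under stated assumptions: the function diag, the placeholder '#' and the phrase 'is unprovable' are illustrative, and no actual provability predicate is involved.

```python
import ast

def diag(template):
    """Substitute the quotation of the template for its own placeholder '#'."""
    return template.replace("#", repr(template))

template = "diag(#) is unprovable"
sentence = diag(template)
print(sentence)

# The sentence has the form "<term> is unprovable"; isolate the term and evaluate it.
term = sentence[: -len(" is unprovable")]
quoted_argument = term[len("diag("):-1]
denoted = diag(ast.literal_eval(quoted_argument))
print(denoted == sentence)
```

The term occurring in the sentence denotes the sentence itself, just as the number m above is the code number of the statement Nb(m) in which it occurs.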

Nb(m) is valid in U, otherwise its negation ‘I am provable’ would be valid in U; since each statement 
provable from A is naturally also valid in U, the statement Nb(m) with the code number m in U
would be both valid and invalid, which would be a contradiction. However, in accordance with the 
meaning of Nb(m), the validity of Nb(m) also indicates its unprovability. Hence the axiom system A 
turns out to be incomplete. 

The result of TARSKI is obtained in a similar way. One assumes that an expression W(v) exists 
with the meaning ‘v is true’; the negation Nw(v) of this expression then represents the predicate
‘v is untrue’. A self-referring statement Nw(m) (Nw(m)* = m), as constructed above, would then
mean ‘I am an invalid statement’. This statement would be true if it is false and false if it is true. 
One can escape from this contradiction only by dropping the assumption that the predicate ‘v is 
true in U’ is expressible in L. 


[Figure: meta-language and object domain U]




This kind of argument amounts to a deep analysis of a paradox already known in antiquity, which
can be put in the following form: ‘The theorem printed in red type on page n of this book is false.’ 
This statement too is false if it is true and true if it is false, and therefore infringes the principle of 
excluded contradiction. 


4. Relative consistency and the independence of the continuum hypothesis. A theory T is called
relatively consistent with respect to a theory T′ if the fundamental concepts of T can be defined in the
language of T′ in such a way that the axioms of T correspond to certain statements valid in T′; the
theory T is then said to be interpreted in the theory T′.

The definitions just given are of a metatheoretical nature. Frequently the proof of this
interpretability can be conducted completely inside the language of T′, although model-theoretical
arguments about T and T′ succeed more quickly.

Of particular interest is the special case when T′ is an extension of T within the same language L.
In particular, a statement A of L is called consistent with respect to T if the theory T ∪ {A} is relatively
consistent with respect to T; for example, it can be shown that Euclidean and non-Euclidean
geometry are relatively consistent with respect to absolute geometry, that is, that the parallel axiom
and also its negation are consistent with respect to the other geometrical axioms, or briefly that 
the parallel axiom is independent of the other axioms. The fact that the proof of this can be carried 
out entirely within absolute geometry is by no means trivial; however, by modern standards, the 
proof of independence is almost a banality if it is carried out by model-theoretic means, that is, in 
this case by means of analytic geometry. 

GÖDEL showed in 1938 that the continuum hypothesis and the axiom of choice are relatively
consistent with respect to the other axioms of set theory. Twenty-five years later P. COHEN showed 
that the negation of the continuum hypothesis is also consistent with respect to the other axioms. 

Although these results have a formal analogy with geometry, the situation is quite different, 
since it is possible to set up the different kinds of geometry from a unified standpoint, namely that 
of general set theory. However, there is no unified principle for founding the different, mutually 
exclusive, systems of set theory. According to the present state of affairs, such principles of a mathema- 
tical nature do not even seem to exist, because a higher mathematical abstraction than that of set 
theory is absolutely inconceivable. 

GÖDEL himself has expressed the view that the development of set theory will lead to new axioms,
which will allow the continuum hypothesis to be disproved. The axioms so far taken into discussion 
for the extension of the usual bounds of set theory, for example, the axiom of TARSKI on the existence 
of inaccessible cardinal numbers, are not likely to suffice for this. 

Tarski’s axiom is an example of an axiom that ensures the existence of further sets, beyond the 
domain produced by the principles of set formation and choice. The acceptance of such axioms 
could be described as an unlimited extension of mathematics. It must be remembered, however, 
that the growth of new axioms of unlimited character is not a cogent demand and would cause 
new serious problems of consistency. There are certain limited ways of attaining the above mentioned 
possibility of extension, among them all constructivistically oriented statements. One kind of
semi-constructivistic limitation on the boundless formation of sets would be the acceptance of Gödel’s
constructability axiom, which would imply the validity of the continuum hypothesis.

It can be said in conclusion that the result of research on the foundations of mathematics has 
made essential contributions to the clarification of the range and the bounds of the classical statements 
on the foundation of mathematics, and has, in addition, provided numerous practical applications, 
for example, in the theory of algorithms and the theory of formal systems. 




43. Game theory 


Conflict situations
Matrix games
Solutions to matrix games
Simplex algorithm
Linear programming with several objective functions
Other games


As early as the 17th century attempts were made to analyze games of chance and parlour games. 
A multitude of these games continue to be with us today, and in some (such as roulette) the outcome 
is purely accidental, in others (such as bridge) it depends on chance and the players’ behaviour, 
while a third group (such as chess) is completely controlled by skill. 

In 1943 J. VON NEUMANN and O. MORGENSTERN were the first to provide a general description of 
the links between economic problems (competitive situations) and games, thus establishing the 
theory of games as we know it today. Nowadays it is seen as a discipline in the wider field of mathe- 
matical operations research. 


Conflict situations 


Nature and society are replete with cases in which the parties involved have conflicting interests 
and pursue them in different ways. Such a competitive situation is easily recognizable in a parlour 
game, military confrontation or economic competition. However, in some problems occurring in 
game theory it is necessary to construe a conflict situation. 

Such a situation is mathematically modelled as a game where the players may be the natural 
persons attending a party or, in a more general sense, companies, armies, ships, nature etc. These 
players can choose to follow a specific course of action as embodied, for example, in the rules of a 
parlour game. They will then do their best to use this leeway skillfully to achieve their goals. One of 
these goals may be to win a game of chess. 

The theory of games is concerned with describing and modelling these relations in mathematical 
terms and finding the best possible strategy for a player. 


Matrix games 


The most basic type of game is one involving only two people who act on the understanding that 
the gains made by one player are equal to the other’s losses. Gains and losses may be measured in 
sums of money, and the total amount paid out in this case will be zero. Hence the name — two-person 
zero-sum games, also known as matrix games since they can be fully described by a matrix A. 

A matrix game means that the m courses of action $H_i$ open to player $P_1$ are assigned to the m
rows of a matrix A. Similarly, the n columns of the same matrix A are assigned to the n courses of
action open to player $P_2$. This will make him the column player, while $P_1$ is known as the row
player.

For actual play $P_1$ selects a row i and $P_2$ a column k of the matrix $A = (a_{ik})$, both players
proceeding independently of each other and without informing the other. A pay-off is then made by
$P_2$ to $P_1$ at the rate of $a_{ik}$. $P_1$ gains something if $a_{ik} > 0$, there is no pay-off if $a_{ik} = 0$, and he loses
if $a_{ik} < 0$. The matrix A is known as the pay-off matrix for $P_1$, and in many examples and
applications the pay-off is merely symbolic.


As play proceeds, the following conflict situation arises: $P_1$ aims to maximize his gains by selecting
suitable rows, whereas $P_2$ seeks to minimize his losses by choosing the right columns. In mathematical
terms, these goals may be described by introducing the concept of strategy. A strategy indicates
the probability $x_i$ with which the course of action $H_i$ (the row i) is selected in a particular game.
A strategy for $P_1$ is thus a vector x where $x^T = (x_1, x_2, \ldots, x_m)$, $0 \le x_i \le 1$, $x_1 + x_2 + \cdots + x_m = 1$.

If a component $x_i$ equals 1 (which means all other components are zero), then only the row i will
be selected in any game. This strategy x is known as a pure strategy; all other strategies are mixed
strategies. By analogy, any vector y where $y^T = (y_1, y_2, \ldots, y_n)$, $0 \le y_k \le 1$, $y_1 + y_2 + \cdots + y_n = 1$
is a strategy for $P_2$. Pure and mixed strategies are defined on the analogy of $P_1$.

In terms of probability theory the average pay-off P_2 makes to P_1 is x^T A y. When a fixed strategy
x is chosen, then P_1 is certain to gain min_y x^T A y, the minimum being taken over all strategies y
of P_2. This means that P_1 must select a strategy x = x_0 to maximize this gain.




Evidently, max_x min_y x^T A y ≤ min_y max_x x^T A y. J. VON NEUMANN has proved the following essential
result.


Main theorem for a matrix game: max_x min_y x^T A y = min_y max_x x^T A y = v.


If P_1 abandons an optimal strategy x_0 his gains may be less than the value of the game v; if P_2
strays from an optimal strategy y_0 his losses may exceed v. The optimal strategies x_0 and y_0 therefore
describe the best possible approach the two players can take. The average pay-off made by P_2 to
P_1 is then v, and a game where v = 0 is a fair game.
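
The quantities used above can be made concrete with a short sketch. It evaluates the average pay-off x^T A y and the amounts each player can guarantee with a fixed strategy. The function names and the 2x2 matrix (a coin-matching game with an assumed stake of one unit) are illustrative choices, not taken from the text.

```python
from fractions import Fraction as F

def expected_payoff(x, A, y):
    """Average pay-off x^T A y made by P2 to P1 for mixed strategies x and y."""
    return sum(x[i] * A[i][k] * y[k]
               for i in range(len(A)) for k in range(len(A[0])))

def guaranteed_gain(x, A):
    """min over y of x^T A y; the minimum is attained at a pure strategy of P2."""
    return min(sum(x[i] * A[i][k] for i in range(len(A)))
               for k in range(len(A[0])))

def guaranteed_loss_bound(y, A):
    """max over x of x^T A y; the maximum is attained at a pure strategy of P1."""
    return max(sum(A[i][k] * y[k] for k in range(len(A[0])))
               for i in range(len(A)))

# Assumed example: P1 wins one unit if both coins match, otherwise loses one.
A = [[1, -1], [-1, 1]]
half = [F(1, 2), F(1, 2)]

print(expected_payoff(half, A, half))    # 0
print(guaranteed_gain(half, A))          # 0
print(guaranteed_loss_bound(half, A))    # 0
print(guaranteed_gain([1, 0], A))        # -1: a pure strategy guarantees less
```

In this assumed game the mixed strategy (1/2, 1/2) guarantees 0 for either player, whereas a pure strategy only guarantees −1, which illustrates why max min and min max are taken over mixed strategies.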


Pure optimal strategies. A special case among the matrix games is the saddle-point game with 
pure optimal strategies xo and yo, as illustrated in the following example. 


Each of two players P_1 and P_2 has the four aces from a pack of cards. The idea of the game
is for both to put an ace face upwards on the table at the same time, followed by a pay-off under
a previously agreed rule. That is the end of the game, and the amounts P_2 pays to P_1 can be seen
from the following pay-off matrix A.


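[Pay-off matrix A: rows (P_1) and columns (P_2) are labelled Clubs, Spades, Hearts, Diamonds, with an
additional column of row minima (−1, −1, 0, −2) and an additional row of column maxima (3, 0, 2, 3).]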

The figures in the pay-off matrix A represent the following: If P_1 lays down the ace of clubs
and P_2 the ace of hearts, then P_2 has to pay two units to P_1. If both players present the ace of
spades, P_1 will pay one unit to P_2. Here the course of action open to a player is to choose from
his aces. If this game were to go on for some time, P_1 would appear to be favoured because there
are more positive figures in the pay-off matrix than negative ones, but this is a fallacy.

Let us now determine the optimal strategies for both players, assuming that neither wants
to take a risk. If P_1 selects the first row (ace of clubs), he will be worst off if P_2 chooses the fourth
column (ace of diamonds) as he loses one unit. P_1 will also lose if he opts for the second or fourth
row, but is certain to lose nothing by consistently playing the third row (ace of hearts). In that
case he may even win if P_2 makes a mistake. This means that P_1 can at least be sure of gaining
the amount that is the greatest row minimum, i.e. w_1 = max (−1, −1, 0, −2) = 0. This applies
by analogy to P_2. If he plays the first column (ace of clubs), the worst response he can get is the
first row (ace of clubs), which will lose him three units. He also loses if he opts for the third or
fourth column, but can be certain to lose nothing by playing the second column (ace of spades)
consistently. He may in turn win if his opponent plays badly, and he can limit his losses to the
least column maximum, i.e. w_2 = min (3, 0, 2, 3) = 0.

As w_1 = w_2 = 0, neither side has an advantage, which makes this a fair game. The optimal
strategy is for P_1 to always show the ace of hearts and for P_2 to stick to the ace of spades. In
the pay-off matrix this takes the form of the saddle point shown in red. The solution to the game
is x_0^T = (0, 0, 1, 0), y_0^T = (0, 1, 0, 0), v = 0.


There is no saddle point if w_1 ≠ w_2. A saddle-point game becomes uninteresting once the optimal
strategies are known, as in our case.
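
The test w_1 = w_2 is easy to mechanize. The sketch below computes the greatest row minimum and the least column maximum and reports a saddle point if they coincide. The 4x4 matrix is hypothetical: its entries are chosen only so that they agree with the row minima, column maxima and individual pay-offs quoted in the text, and the remaining figures are assumptions. The 2x2 matrix is the weekend game treated under mixed optimal strategies.

```python
def saddle_point(A):
    """Return (w1, w2, position); position is None if there is no saddle point."""
    row_min = [min(row) for row in A]
    col_max = [max(col) for col in zip(*A)]
    w1 = max(row_min)          # greatest row minimum
    w2 = min(col_max)          # least column maximum
    if w1 != w2:
        return w1, w2, None
    # zero-based indices of a row and a column meeting in the saddle point
    return w1, w2, (row_min.index(w1), col_max.index(w2))

# Hypothetical aces matrix (rows/columns: clubs, spades, hearts, diamonds).
aces = [[3, -1, 2, -1],
        [2, -1, 1,  0],
        [1,  0, 2,  3],
        [1, -2, 0,  2]]
print(saddle_point(aces))      # (0, 0, (2, 1)): hearts against spades

# Weekend game: rows Saturday, Sunday; columns morning, afternoon.
weekend = [[2, -0.5], [0, 1]]
print(saddle_point(weekend))   # (0, 1, None): no saddle point
```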




Examples of saddle-point games include chess, go, checkers and mill. In each, there would 
only be three possibilities if the players followed optimal strategies: a) white always wins, b) black 
always wins, c) there is always a draw. In the case of chess, go and checkers it is still unknown 
which of the three applies. The reason is that we cannot define the optimal behaviour of a player be- 
cause of the many courses of action open to him under the rules of the game (but this is not the case 
with mill). This also explains why a chess computer will lose on occasion. By contrast, the options 
in an end game are a simple kind of chess problem and identical with the moves to be made. 


Mixed optimal strategies. These occur in matrix games without a saddle point, and the optimal 
strategy of a player can be determined in a diagram if only two courses of action are open to him. 


A case in point is the gentleman (P_1) who wishes to encounter a lady (P_2) as often as possible
over a weekend, whereas she wants to see as little of him as possible. The options for him are to
go out either on Saturday or Sunday, while she may do so in the morning or afternoon. He then
draws up a table illustrating his chances of meeting her.

              Morning   Afternoon
Saturday         2        −0.5
Sunday           0          1


She is of the same opinion as regards the chances of an encounter. The 2 in the table means he
is very hopeful of meeting her while doing his shopping on Saturday morning, while the −0.5
tells us she is rather certain of not running into him on the afternoon of the same day because of
his football craze.

This is not a saddle-point game because the greatest row minimum w_1 = max (−0.5, 0) = 0
differs from the least column maximum w_2 = min (2, 1) = 1; w_1 is less than w_2. The game
continues through several weekends a total of M times, with P_1 choosing Saturday m_1 times and
Sunday m_2 times so that m_1 + m_2 = M or x_1 + x_2 = 1 if x_1 = m_1/M and x_2 = m_2/M. P_2
accordingly chooses the morning n_1 times, the afternoon n_2 times, and y_1 + y_2 = 1 if y_1 = n_1/M
and y_2 = n_2/M. The vectors x^T = (x_1, x_2) and y^T = (y_1, y_2) are strategies for P_1 and P_2.


43-1 Graphic solution to a matrix game

The optimal strategies x_0 and y_0 can now be graphically determined. The average chances for
P_1 per weekend, if P_2 always chooses the morning, are: C = 2x_1 + 0·x_2 = 2x_1. If P_2 invariably
opts for the afternoon, P_1's chances are C = −0.5x_1 + 1·x_2 = −1.5x_1 + 1 because x_1 + x_2 = 1.
The two equations correspond to straight lines in a rectangular Cartesian coordinate system, where
the chance C is the ordinate and x_1 the abscissa. It must be borne in mind in this connection that
x_1 is a relative frequency with a value that can only range from 0 to 1 (Fig. 43-1). The coordinates
for the point of intersection of the two straight lines are x_1 = 2/7, C = 4/7. On the C axis the
chances of P_1 can be read off as follows:


(1) If P_1 opts for Saturday with a relative frequency of 0 ≤ x_1 < 2/7, his chances are less than
4/7 in the worst possible case (P_2 goes out in the morning).
(2) If, on the other hand, P_1 chooses Saturday with a relative frequency of 2/7 < x_1 ≤ 1, then
the chances are again less than 4/7 in the worst possible case (P_2 goes out in the afternoon).


The optimal strategy for P_1 is therefore x_0^T = (2/7, 5/7), and the value of the game C = 4/7.
This means that our gentleman has to adhere to a 2:5 ratio of Saturday to Sunday without any
regularity and the game is unfair as far as P_2 is concerned.

Now P_2 tries to find a strategy limiting her mean loss to 4/7. Her average chances per weekend,
if P_1 consistently opts for Saturday, are C = 2y_1 − 0.5y_2 = 2y_1 − 0.5 (1 − y_1) = 2.5y_1 − 0.5
≤ 4/7. If P_1 always goes out on a Sunday her chances are C = 0·y_1 + y_2 = 1 − y_1 ≤ 4/7.
From the first inequality it follows that y_1 ≤ 3/7 and from the second, y_1 ≥ 3/7. The optimal
strategy for P_2 is then y_0^T = (3/7, 4/7) so that P_2 selects a morning to afternoon ratio of 3:4 with
no regularity. So much for a complete discussion of this game. If one of the players abandons
his optimal strategy, the other is favoured.
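
The graphic solution can be cross-checked by solving the two intersection conditions in closed form. The sketch below is an assumption-laden shortcut rather than the diagram method itself: it presupposes a 2x2 game without a saddle point, so that both players make the opponent's two pure strategies equally good. The function name is illustrative.

```python
from fractions import Fraction as F

def solve_2x2(A):
    """Mixed optimal strategies and value of a 2x2 game without saddle point."""
    a, b, c, d = (F(v) for row in A for v in row)
    denom = a - b - c + d
    # P1 equalizes the two columns: a*x1 + c*x2 = b*x1 + d*x2
    x1 = (d - c) / denom
    # P2 equalizes the two rows:    a*y1 + b*y2 = c*y1 + d*y2
    y1 = (d - b) / denom
    value = (a * d - b * c) / denom
    return (x1, 1 - x1), (y1, 1 - y1), value

# Weekend game: rows Saturday, Sunday; columns morning, afternoon.
x0, y0, v = solve_2x2([[2, -0.5], [0, 1]])
print(x0, y0, v)
```

For the weekend table this returns x_0 = (2/7, 5/7), y_0 = (3/7, 4/7) and the value 4/7, in agreement with the figures read off the diagram.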


Two other well-known examples may be given for matrix games without a saddle point.


Example 1 is the penny game where each of the two players reveals the side of a coin at the
same time. If the sides are the same, P_1 wins, otherwise P_2 is the winner. The pay-off matrix
A for P_1 then looks as follows:

[Pay-off matrix A of the penny game; rows (P_1) and columns (P_2): Heads, Tails.]

As in the above example, the optimal strategies for P_1 and P_2 can be graphically determined,
which results in x_0^T = (1/2, 1/2) and y_0^T = (1/2, 1/2). The solution tells us that both players have
to alternate between heads and tails without regularity.

Example 2 is the counting-out game played by children known as stone-paper-scissors, with the
following pay-off matrix.

[Pay-off matrix of stone-paper-scissors; rows (P_1) and columns (P_2): Stone, Paper, Scissors.]

Due to the symmetry the optimal strategies in this game are obvious: x_0^T = (1/3, 1/3, 1/3) and
y_0^T = (1/3, 1/3, 1/3). The fact that the components in the optimal strategies are of equal magnitude
means that both players must vary their options from one game to the next in an unsystematic
manner. The value of the game is v = 0, which makes it a fair game.


Solutions to matrix games 


In a saddle-point game it is the saddle point that determines the optimal strategies x_0 and y_0,
and the value of the game, v. If a player has only two options, his optimal strategy and the value of
the game can be shown in a diagram as before.

In other cases, approximate methods or the simplex algorithm can be used to determine the
solution x_0, y_0 and v (see Chapter 30.1.).

Approximation methods. By modelling the learning process for two players we obtain a very
basic (if slowly converging) approximation method. This can be illustrated with the following
example.


A farm has three implements for the same operation which differ in their suitability depending 
on soil conditions. This suitability (which may be expressed in operating days per week for the 
previous year) is shown as a function of soil conditions in the following table, matrix A. 


               dry   normal   wet
Implement 1     1      0       2
Implement 2     3      0       0
Implement 3     0      2       1

Player P_1 is the farm, player P_2 the weather, and the game itself belongs in the category of
games against nature. The solution tells us how often we can expect to use each of the
three implements and what "action" nature may take.

The approximation procedure consists of the following steps: 


(1) P_1 arbitrarily selects a row from matrix A, for example the first row, and writes it under the
matrix.

(2) P_2 then chooses that column of matrix A which contains the lowest figure in that row (column 2)
and writes it next to the matrix.

(3) P_1 selects the row that has the highest figure in that column (the third row) and writes the sum
of the last and the newly chosen row under the matrix.

(4) P_2 again chooses the column which contains the lowest figure in that row and writes the sum
of the last and the new column next to the matrix.


The procedure continues, and if the smallest figure in a row or the highest figure in a column occurs
more than once, an arbitrary choice is made. In addition, the figures selected each time are marked
in red when searching for the next row or column.

Approximations for the optimal strategies are obtained when this algorithm is followed: count
the red figures in each of the three rows (next to A) formed by the columns and in each of the columns
(under A) formed by the rows and divide by the number of steps. For example, three figures are
marked in the third row (2, 4, 5) and the number of steps is five, so that x_3 ≈ 3/5.

The value of the game, v, is contained between the last marked figure under A (4 in our
example) and the last marked figure next to A (6 in the example), divided by the number of steps
in each case. The total result obtained after five steps for the optimal strategies x_0 and y_0, and for
the value of the game, v, is: x_0^T ≈ (0, 2/5, 3/5), y_0^T ≈ (2/5, 2/5, 1/5), 4/5 < v < 6/5.

This approximate solution is still far too vague, but it becomes more practical after ten steps:
x_0^T ≈ (1/10, 3/10, 6/10), y_0^T ≈ (3/10, 4/10, 3/10), 9/10 < v < 11/10.

The last approximate solution tells us that the implements will be used at the ratio of about
1:3:6 and that the "options" of nature (dry, normal, wet) are about 3:4:3.

The more steps that are carried out in the procedure, the closer we will get to the solution of the
matrix game. To compare, let us look at the accurate solution, obtained with the aid of the simplex
algorithm (see below): x_0^T = (1/4, 1/4, 1/2), y_0^T = (1/3, 1/3, 1/3), v = 1.

The conclusion is that implement no. 3 should be ready for use at all times if possible. The value 
of the game is only symbolic. 
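
The learning procedure described above can be sketched in a few lines. The function name and the tie-breaking rule are assumptions: the text allows an arbitrary choice when figures tie, and the sketch always takes the last candidate, which happens to reproduce the five-step result quoted above. The matrix is the farm table with rows Implement 1 to 3 and columns dry, normal, wet.

```python
from fractions import Fraction as F

def learning_procedure(A, steps, first_row=0):
    """Approximate x0, y0 and bounds for v by the step-by-step procedure."""
    m, n = len(A), len(A[0])
    row_sum = list(A[first_row])   # figures written under the matrix
    col_sum = [0] * m              # figures written next to the matrix
    row_marks = [0] * m            # marked figures per row (choices of P1)
    col_marks = [0] * n            # marked figures per column (choices of P2)
    low = high = 0
    for _ in range(steps):
        # P2 picks the column with the lowest figure (tie: last one, assumed)
        low = min(row_sum)
        k = max(j for j in range(n) if row_sum[j] == low)
        col_marks[k] += 1
        col_sum = [col_sum[i] + A[i][k] for i in range(m)]
        # P1 picks the row with the highest figure (tie: last one, assumed)
        high = max(col_sum)
        i = max(r for r in range(m) if col_sum[r] == high)
        row_marks[i] += 1
        row_sum = [row_sum[j] + A[i][j] for j in range(n)]
    x = [F(c, steps) for c in row_marks]
    y = [F(c, steps) for c in col_marks]
    return x, y, F(low, steps), F(high, steps)

farm = [[1, 0, 2], [3, 0, 0], [0, 2, 1]]
x, y, lower, upper = learning_procedure(farm, 5)
print(x, y, lower, upper)   # (0, 2/5, 3/5), (2/5, 2/5, 1/5), 4/5, 6/5
```

With a different tie-breaking rule the intermediate approximations differ, but the lower and upper figures always enclose the value of the game.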


Simplex algorithm 


Each matrix game can be transformed into a problem of linear programming and solved with 
the aid of the simplex algorithm (see Chapter 30.1.). 




For this purpose, the matrix game with the pay-off matrix A is assigned the two dual linear
optimization problems (see Chapter 30.) max {e^T v | Av ≤ e, v ≥ 0} and min {e^T w | A^T w ≥ e, w ≥ 0}.
In the vectors e all components are equal to 1. The vectors v and w contain the variables of the
optimization problems.

If we let the optimal solutions v_0 and w_0 and the common optimal value c of the objective
functions of the two optimization problems be determined with the simplex algorithm, then the
optimal strategies x_0 and y_0 and the value of the game v are

v = 1/c,   x_0 = (1/c) w_0,   y_0 = (1/c) v_0.
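
The relation between the two dual problems and the game can be checked without running a simplex solver. The sketch below takes candidate solutions v_0 and w_0 as given, verifies that they are feasible and have equal objective values (which by duality makes both optimal), and then converts them with v = 1/c, x_0 = w_0/c, y_0 = v_0/c. The function name is an assumption, and the candidate vectors are simply the exact solution quoted for the farm example, for which c = 1.

```python
from fractions import Fraction as F

def strategies_from_lp(A, v0, w0):
    """Check an optimality certificate (v0, w0) and return x0, y0, value."""
    m, n = len(A), len(A[0])
    # v0 must be feasible for  max {e^T v | A v <= e, v >= 0}
    assert all(t >= 0 for t in v0)
    assert all(sum(A[i][k] * v0[k] for k in range(n)) <= 1 for i in range(m))
    # w0 must be feasible for  min {e^T w | A^T w >= e, w >= 0}
    assert all(t >= 0 for t in w0)
    assert all(sum(A[i][k] * w0[i] for i in range(m)) >= 1 for k in range(n))
    c = sum(v0)
    assert c == sum(w0)        # equal objective values: both are optimal
    x0 = [t / c for t in w0]   # strategy of the row player P1
    y0 = [t / c for t in v0]   # strategy of the column player P2
    return x0, y0, 1 / c

farm = [[1, 0, 2], [3, 0, 0], [0, 2, 1]]
v0 = [F(1, 3), F(1, 3), F(1, 3)]
w0 = [F(1, 4), F(1, 4), F(1, 2)]
x0, y0, value = strategies_from_lp(farm, v0, w0)
print(x0, y0, value)           # (1/4, 1/4, 1/2), (1/3, 1/3, 1/3), 1
```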


The following example illustrates how a game can be solved using the simplex algorithm. The
model is one of a military conflict in the Middle Ages in which General BLOTTO is defending a
town with two gates. His forces consist of three troops, while the enemy is attacking with two;
all troops are of equal strength. The town is considered to have fallen to the attackers (value of
the game = 1) as soon as the troops attacking at either gate outnumber the defending troops.
A ceasefire results if equal numbers of troops attack and defend (value of the game = 0), and the
town has been held if the defenders outnumber the attackers (value of the game = −1). The
courses of action open to the two opponents are the ways of assigning their troops to the
gates. For the attacker this results in the pay-off matrix A as follows.


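[Pay-off matrix A for the attackers: the rows are the attackers' three ways of dividing their troops
between Gate 1 and Gate 2, the columns are BLOTTO's four ways of dividing his troops between the gates.]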

The attacks continue unremittingly. What are the optimal strategies for both sides? Who has
the advantage?
According to Chapter 30.1. the following simplex tables can be drawn up. 


[Simplex tables (1) and (2) for the variables v_1, ..., v_4 and w_1, w_2, w_3.]


This gives the optimal solutions v_0^T = (0, 2, 1, 0), w_0^T = (1, 1, 1) and c = 3, and the optimal
strategies x_0^T = (1/3, 1/3, 1/3) and y_0^T = (0, 2/3, 1/3, 0). Hence the attacker must make equal use
of the three options open to him, whereas General BLOTTO has to deploy his second and third
defense options at the rate of 2:1. The value of the game, v = 1/3, tells us that the attacker has
the advantage.


Linear programming with several objective functions 


Many economic applications involve solving a linear optimization problem with a number of
competing objective functions in the presence of identical constraints. In cases where no assessment
of the various objective functions is possible, a compromise solution can be put together from the
optimal solutions to these functions, using an arrangement derived from game theory. It follows
a line of thought whereby s objective functions need to be maximized as z_1 = c_1^T x, z_2 = c_2^T x, ...,
z_s = c_s^T x in the presence of identical constraints Bx ≤ b, x ≥ 0.

Let each of the s linear programming problems be solvable and let the optimal solutions be
determined, for example, with the simplex algorithm (see Chapter 30.1.) and specified as x_1, x_2, ..., x_s.
Let all the objective function maxima c_1^T x_1, c_2^T x_2, ..., c_s^T x_s be positive. An assessment of the optimal
solution x_i is possible with regard to all s objective functions, where a_ik = c_k^T x_i / c_k^T x_k. The following
then applies: 0 ≤ a_ik ≤ 1. The percentage fulfillment of the k-th objective function by the optimal
solution x_i is embodied in a_ik · 100%. This percentage fulfillment may vary considerably, and the
minimum can be low. The projected compromise solution will not have this disadvantage. Now the
game is considered according to the pay-off matrix A = (a_ik), i, k = 1, 2, ..., s, and solved. The
optimal strategy of the player P_1 is x_0, and the value of the game is v.

If x_0^T = (x_01, x_02, ..., x_0s), the following applies.


The compromise solution x̂ = x_01 x_1 + x_02 x_2 + ... + x_0s x_s ensures that the minimum
fulfillment of all objective functions is at a maximum with v · 100%.


For example, if there are s = 3 objective functions z_1 = 10x_1 + x_2, z_2 = 5x_1 + 5x_2, z_3 = −5x_1
+ x_2 + 30 and these are to be maximized with the constraints 5x_1 + x_2 ≤ 40, x_1 + x_2 ≤ 16,
x_2 ≤ 15 and x_1 ≥ 0, x_2 ≥ 0, then the optimal solutions (see Chapter 30.1.) after the first/second/
third objective function are x_1^T = (6, 10), x_2^T = (5, 15) and x_3^T = (1, 15).

In the following table the left side gives the values of c_k^T x_i, the centre is the game matrix A = (a_ik),
and the right side indicates the values of a_ik · 100%.


        1st, 2nd, 3rd objective       Game matrix A = (a_ik)        a_ik · 100%
        function
        z_1    z_2    z_3             z_i/z_1  z_i/z_2  z_i/z_3

x_1      70     80     10             1        80/100   10/40       100%     80%     25%
x_2      65    100     15             65/70    1        15/40       92.86%   100%    37.5%
x_3      25     80     40             25/70    80/100   1           35.71%   80%     100%


The 10 marked in red (left side) is obtained when the solution x_1 is substituted into the objective
function z_3: z_3(x_1) = −5·6 + 10 + 30 = 10. The maximum value of the objective function
z_3 is achieved with the optimal solution x_3: z_3(x_3) = −5·1 + 15 + 30 = 40. The figure a_13
= c_3^T x_1 / c_3^T x_3 = z_3(x_1)/z_3(x_3) = 10/40 provides an assessment of the solution x_1 with regard to
the optimal solution x_3 for the third objective function (centre of the table). The figure a_13 · 100%
= 25% gives a percentage assessment and is at the same time the lowest figure in the right-hand
table. This means that the third objective function is least fulfilled by the solution x_1, the
fulfillment being only 25%. Now that we have defined the game with the pay-off matrix A we can solve
it using the simplex algorithm. The following is obtained for the value of the game, v, and the
optimal strategy x_0 of the row player (y_0 is irrelevant):


v = 623/938 ≈ 0.6642,   x_0^T = (0, 504/938, 434/938) = (0, 252/469, 217/469).


This leads to the compromise solution

x̂ = (252/469) (5, 15)^T + (217/469) (1, 15)^T = (1477/469, 15)^T ≈ (3.15, 15)^T

in which the minimum fulfillment for all objective functions is at least v · 100% = 66.42%;
this minimum fulfillment is a maximum. When x̂ is substituted into the objective functions it can be
seen that the fulfillment of z_1 is exactly 66.42%, while it is 90.75% for z_2 and 73.13% for z_3.
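
The last step of the example, forming the compromise solution and substituting it into the objective functions, can be reproduced with exact fractions. The sketch takes the optimal strategy x_0 = (0, 252/469, 217/469) from the text as given and does not solve the game itself; the variable names are illustrative.

```python
from fractions import Fraction as F

# Objective functions z_1, z_2, z_3 of the example.
objectives = [
    lambda x1, x2: 10 * x1 + x2,
    lambda x1, x2: 5 * x1 + 5 * x2,
    lambda x1, x2: -5 * x1 + x2 + 30,
]
solutions = [(6, 10), (5, 15), (1, 15)]       # optimal for z_1, z_2, z_3
x0 = [F(0), F(252, 469), F(217, 469)]         # strategy quoted in the text

# Compromise solution: weighted combination of the optimal solutions.
compromise = tuple(sum(w * s[j] for w, s in zip(x0, solutions))
                   for j in range(2))

# Maximum of each objective function, attained at its own optimal solution.
maxima = [f(*s) for f, s in zip(objectives, solutions)]

# Fulfillment of each objective function by the compromise solution.
fulfillment = [f(*compromise) / mx for f, mx in zip(objectives, maxima)]

print(compromise)                                      # (1477/469, 15)
print([round(float(q) * 100, 2) for q in fulfillment])  # [66.42, 90.75, 73.13]
```

The smallest of the three percentages equals v · 100% with v = 623/938, as stated for the compromise solution.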


Other games 


It is characteristic of matrix games that they involve only two players, each of whom has a finite 
number of options, and that the gains of one side are equalled by the losses of the other. Any of these 
three conditions may not apply, leading to n-person games, infinite games and non-zero-sum games. 

Non-zero-sum games. Here the sum of all pay-offs made during a game is not zero, and the 
players may or may not enter into coalitions. Accordingly, a game may be called cooperative or 
non-cooperative. Parlour games are normally of the latter type, while economic systems are mostly 
cooperative (because arrangements will benefit all sides). 




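For example, a two-person non-zero-sum game can be described by a matrix in which the first figures
are the gains of P_1 and the second figures the gains of P_2. Both players have two options each.

[Pay-off table: rows are Option 1 and Option 2 of P_1, columns are Option 1 and Option 2 of P_2; each
cell gives the gain of P_1 followed by the gain of P_2. The cell for Option 2 of P_1 against Option 1
of P_2 reads −1; −1.]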


In the non-cooperative version P_1 makes no allowance for the interests of P_2 (the figures in second
place in the matrix) and sees this as a zero-sum game even though it is not. A graphic solution,
for example, will show the value of the game for P_1 (i.e. his average gains per game) to be 1/5,
and his optimal strategy y_0^T = (3/5, 2/5).

In a cooperative game, on the other hand, the players may agree only to take actions bearing
equal numbers. Then the average gains for both will be 3/2 and their optimal strategies
x_0^T = (1/2, 1/2) and y_0^T = (1/2, 1/2). This indicates that the cooperative version is the one from
which both can expect greater benefits.


n-person games. Among n players, each has a finite number of options but coalitions are prohibited.
For each player a separate pay-off matrix A_i has been defined which depends on n arguments (courses
of action). One game consists of each player selecting a course of action, followed by payment of
the amount A_i to P_i.

These games are normally represented by game trees, and the value of the matrix game gives way
to the more general concept of the equilibrium point. The strategies associated with an equilibrium
point are optimal, and if one player abandons the optimal strategy while the others keep to it he
may not normally expect larger gains but rather a reduction. The equilibrium point and related
strategies therefore illustrate rational behaviour on the part of the players. The following applies:
there is at least one equilibrium point in each finite non-cooperative n-person game.

This existence theorem, which is analogous to the minimax theorem, does not lend itself to general
calculation. Such calculation has so far only been possible in the case of specific three-person and
four-person games.


Our example is a specific take-away game with the following rules: from a total of five matches,
P_1 must first draw two or three. P_1, P_2 and P_3 always take turns, one after another. The game is
over when no more matches can be drawn, and the last person to take a match will pay the
amount 1 to the previous player.

This game can be solved immediately by drawing up a game tree (Fig. 43-2). The figures given in 
the tree are the numbers of matches left at any one time. The line in red is the optimal strategy, 
and a player may expect to gain less if he departs from it. For example, the far left branch indicates 
that P, is the last to draw, and loses. The game is unfair to P,, who will always lose and have to 
pay to P, if his opponents chooses an optimal strategy. 


43-2 Game tree of a three-person game


44. Perturbation theory


The perturbation calculation is an approximation method which is widely used in science and 
technology. 


In most cases the solution of an equation cannot be indicated explicitly. When the equation is
slightly modified so that the solution can be indicated, this solution generally differs from the solution
of the equation originally presented, and corrections have to be calculated.


The entirety of the methods for calculating these corrections, together with the substantiation of
these calculations, is called perturbation theory.


Example: Consider the equation x = a + εx² for a given real a and ε, |ε| small. For ε = 0,
x = a is the solution of the equation. Try to find the solution x in the form of a power series
x = a + εx_1 + ε²x_2 + ... with still indeterminate coefficients x_1, x_2, ...

This power series is substituted into the equation:


a + εx_1 + ε²x_2 + ... = a + ε(a + εx_1 + ε²x_2 + ...)².


When equating the coefficients at the left side to those coefficients at the right side which are
at the same powers of ε, x_1 = a², x_2 = 2ax_1 = 2a³, ... follows.
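
The comparison of coefficients can be continued mechanically. The sketch below computes further coefficients from the recursion implied by the substitution (the coefficient of ε^n on the left equals the coefficient of ε^(n−1) in the squared series) and compares the truncated series with the root of εx² − x + a = 0 that tends to a as ε tends to 0. The closed-form root and the function names are additions for checking purposes, not part of the text.

```python
import math

def series_coefficients(a, order):
    """Coefficients x_0 = a, x_1, ..., x_order of the solution of x = a + eps*x^2."""
    coeffs = [a]
    for n in range(1, order + 1):
        # x_n is the coefficient of eps^(n-1) in (x_0 + x_1*eps + ...)^2
        coeffs.append(sum(coeffs[j] * coeffs[n - 1 - j] for j in range(n)))
    return coeffs

def series_value(a, eps, order):
    """Value of the truncated power series at eps."""
    return sum(c * eps ** k for k, c in enumerate(series_coefficients(a, order)))

def exact_root(a, eps):
    """Root of eps*x^2 - x + a = 0 that reduces to x = a for eps -> 0."""
    return (1 - math.sqrt(1 - 4 * a * eps)) / (2 * eps)

print(series_coefficients(2, 2))                   # [2, 4, 16], i.e. a, a^2, 2a^3
print(series_value(1, 0.01, 6), exact_root(1, 0.01))
```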


Application to integral equations. Consider for a small |ε| the integral equation

y(s) = y_0(s) + ε ∫_0^1 K(s, t) y(t) dt

in the interval (0, 1) with given y_0(s) and given function K(s, t) of two variables s and t; then the
solution is y(s) = y_0(s) for ε = 0. The following power series can be considered

y(s) = y_0(s) + εy_1(s) + ε²y_2(s) + ...

with the functions y_1, y_2, ... to be determined.


Substitution into the integral equation and comparison of coefficients result in

y_1(s) = ∫_0^1 K(s, t) y_0(t) dt,   y_2(s) = ∫_0^1 K(s, t) y_1(t) dt, ...


Example: y_0(s) = 1, K(s, t) = st, y_1(s) = s ∫_0^1 t dt = (1/2) s, y_2(s) = s ∫_0^1 t · (1/2) t dt = (1/6) s.
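
For the kernel K(s, t) = st of the example the successive terms can be generated exactly. The sketch represents each y_n as a list of polynomial coefficients and applies y_{n+1}(s) = s ∫_0^1 t y_n(t) dt. It is specific to this kernel, and the function name is an illustrative assumption.

```python
from fractions import Fraction as F

def next_term(poly):
    """Given y_n(t) = sum c_k t^k, return y_{n+1}(s) = s * int_0^1 t*y_n(t) dt.

    Polynomials are coefficient lists [c_0, c_1, ...]; valid for K(s, t) = s*t only.
    """
    # int_0^1 t * t^k dt = 1/(k + 2)
    integral = sum(F(c) / (k + 2) for k, c in enumerate(poly))
    return [F(0), integral]          # the polynomial integral * s

y0 = [F(1)]                          # y_0(s) = 1
y1 = next_term(y0)                   # (1/2) s
y2 = next_term(y1)                   # (1/6) s
y3 = next_term(y2)                   # (1/18) s
print(y1, y2, y3)
```

Each further term is one third of the previous one, so for this kernel the series is geometric and converges for |ε| < 3.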

The perturbation theory is applied to eigenvalue problems. 


Eigenvalue problems for matrices. Let A_0 and A_1 be two real symmetric matrices with n
rows and n columns. For a real ε, |ε| small, consider A(ε) = A_0 + εA_1.


When λ_0 is an eigenvalue of A(0) = A_0, the following is true: If the system of equations
(A_0 − λ_0 I) x = 0 has exactly m linearly independent solutions, then there are exactly m eigenvectors
x^(j)(ε) of A(ε) and m eigenvalues λ_j(ε), j = 1, 2, ..., m, with λ_j(0) = λ_0 and

x^(j)(ε) · x^(k)(ε) = 1 for j = k,   x^(j)(ε) · x^(k)(ε) = 0 for j ≠ k.

Furthermore it is true that x^(j)(ε) and λ_j(ε) are convergent power series in ε for |ε| < ε_0, ε_0
sufficiently small.


Example:

A_0 = (  3  −1 ),   A_1 = ( 2  3 ),   λ_0 = 2.
      ( −1   3 )          ( 3  2 )

You can start with the power series λ(ε) = λ_0 + λ_1 ε + λ_2 ε² + ...
and x(ε) = x^(0) + εx^(1) + ... with x^(0) = (1/√2) (1, 1)^T.


Substitution into A(ε) x(ε) = λ(ε) x(ε) and comparison of coefficients yield
(A_0 − λ_0 I) x^(0) = 0, (A_0 − λ_0 I) x^(1) = −A_1 x^(0) + λ_1 x^(0), ...




λ_1, i.e. the first correction for λ, can be calculated from the second equation. Since λ_0 is an
eigenvalue the following must be true: (−A_1 x^(0) + λ_1 x^(0)) · x^(0) = 0.


From this results λ_1 = A_1 x^(0) · x^(0) = (1/2) A_1 (1, 1)^T · (1, 1)^T = 5.


Then x^(1), λ_2, x^(2), λ_3, ... can be calculated successively.
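
The first-order correction can be compared with the exact eigenvalue of A(ε) for a small ε. The matrices below, A_0 with rows (3, −1), (−1, 3) and A_1 with rows (2, 3), (3, 2), are the reading of the example assumed here; the printed matrices are only partly legible, so the entries of A_1 in particular should be treated as an assumption. The helper names are illustrative.

```python
import math

A0 = [[3, -1], [-1, 3]]
A1 = [[2, 3], [3, 2]]                       # assumed reading of the example
x0 = [1 / math.sqrt(2), 1 / math.sqrt(2)]   # normalized eigenvector of A0
lam0 = 2                                    # A0 x0 = 2 x0

def quad_form(A, x):
    """Scalar product (A x) . x."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

def eigenvalues_sym2(M):
    """Both eigenvalues of a symmetric 2x2 matrix, smaller one first."""
    (a, b), (_, d) = M
    mean, radius = (a + d) / 2, math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

lam1 = quad_form(A1, x0)                    # first correction, about 5

eps = 0.01
A_eps = [[A0[i][j] + eps * A1[i][j] for j in range(2)] for i in range(2)]
exact_small, _ = eigenvalues_sym2(A_eps)
print(lam1, lam0 + eps * lam1, exact_small)
```

With these assumed matrices x^(0) is also an eigenvector of A_1, so the first-order expression λ_0 + ελ_1 already coincides with the exact eigenvalue; for a general A_1 the two would differ by terms of order ε².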


Eigenvalue problems for ordinary differential equations. Oscillations result in eigenvalue problems
in ordinary and partial differential equations. Frequently, the eigenvalues cannot be indicated
explicitly. Perturbation theory often provides useful approximations.

Consider, for example, the eigenvalue problem U''(x) + εq(x) U(x) + λU(x) = 0 in the interval
(0, π) with the boundary conditions U(0) = U(π) = 0 and with the given function q(x). The
eigenvalues for ε = 0 are given by λ_n = n², n = 1, 2, ... The (normed) eigenfunctions belonging to them are
U_n(x) = √(2/π) sin nx.

For real ε, |ε| small, the set-ups λ_n(ε) = λ_n + λ_n^(1) ε + λ_n^(2) ε² + ... and U_n(x, ε) = U_n(x) + U_n^(1)(x) ε
+ U_n^(2)(x) ε² + ... with ∫_0^π U_n²(x, ε) dx = 1 can be made.


For the first correction of the eigenvalue

λ_n^(1) = −∫_0^π q(x) U_n²(x) dx = −(2/π) ∫_0^π q(x) sin² nx dx

is true.


Example: Let q(x) = x, then λ_n^(1) = −(2/π) ∫_0^π x sin² nx dx = −π/2.
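
The integral for the first correction can be checked numerically. The sketch below evaluates −(2/π) ∫_0^π q(x) sin² nx dx with the composite Simpson rule and compares the result for q(x) = x with −π/2. The quadrature rule, the number of panels and the function name are choices made here for illustration.

```python
import math

def first_correction(q, n, panels=2000):
    """First eigenvalue correction -(2/pi) * int_0^pi q(x) sin^2(n x) dx.

    Uses the composite Simpson rule; panels must be even.
    """
    h = math.pi / panels

    def integrand(x):
        return q(x) * math.sin(n * x) ** 2

    total = integrand(0.0) + integrand(math.pi)
    for j in range(1, panels):
        total += (4 if j % 2 else 2) * integrand(j * h)
    return -(2 / math.pi) * total * h / 3

for n in (1, 2, 3):
    print(n, first_correction(lambda x: x, n), -math.pi / 2)
```

For q(x) = x the correction is the same for every n, because ∫_0^π x sin² nx dx = π²/4 independently of n.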
0 


45. The pocket calculator 


History. The forerunners of the pocket calculator were on the one hand the mechanical desk 
calculators, which were later enhanced with electromechanical functions, and, on the other, the first 
electronic computers. The use of transistors made it possible in 1962 to manufacture electronic 
calculators that were no bigger than conventional electromechanical types but which could perform 
the same operations. In some cases, the number of digits available on the new machines was smaller, 
but the effect of this was negligible in practice. Soon it was becoming clear that the electronic desk 
calculators were superior: they are quiet, much faster and soon capable of functions that were outside 
the range of electromechanical devices. The first such calculators were, however, quite expensive 
and prone to defects because they consisted of a multitude of components and soldering points. 

Around 1970 it became possible to accommodate the inner workings of a calculator on a small 
and reliable silicon chip which could be mass-produced at a low cost. The result was a pocket-sized 
calculator whose price was steadily reduced in the years that followed, making it an article for mass 
consumption. Programmable calculators appeared around 1975, and some of the more recent 
models have functions similar to those of minicomputers. The pocket calculator has now completely 
superseded mechanical and electromechanical calculators, and the slide rule. Production costs are 
low, it is extremely handy, fast and quiet, and much more accurate than the slide rule, thus elimi- 
nating the need for rough calculation and logarithmic scales. 

Types of calculators. There are calculators with simple and extended functions, others for economic,
economic/scientific and scientific/technical calculations, programmable calculators, and special-
purpose calculators. Simple versions can only perform the four fundamental operations, and this
is enough for many users. These functions can be extended by a memory, a very appropriate
addition, and basic operations such as percentage calculation ([%] key), squares and roots ([x²] and
[√x] keys), and inversion ([1/x] key). Economic calculators have additional facilities for computing
interest; scientific calculators can do statistical evaluations (mean values [x̄], standard deviation
[σ], trend and regression to some extent); and scientific/technical versions incorporate functions for
exponential and trigonometric calculations and the inversions thereof: [e^x], [ln x], [lg x], ...




From the viewpoint of computing logic there are calculators with algebraic and arithmetic logic,
with and without brackets, with and without hierarchy and with reverse Polish notation. Characteristic
of calculators with arithmetic logic is the key [+=] (or [−=]), while the key [=] denotes models
with algebraic logic. Instead of the [=] key, versions with reverse Polish notation have one with the
symbol [↑] or the word [ENTER] (see below). Bracket keys are marked [(] and [)]. Whether a
calculator uses a hierarchy or not is not obvious from the keyboard but can be determined with a
test calculation, based e.g. upon the fact that 14 = 2 + (3 · 4) but 20 = (2 + 3) · 4.


Example: Testing a calculator for hierarchy behaviour

Input                      Output for calculators    Output for calculators
                           without hierarchy         with hierarchy

[2] [+] [3] [×] [4] [=]    20                        14
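
The difference between the two kinds of calculator in this test can be imitated with a small evaluator. It is only a model of the behaviour described, restricted to the keys + and ×, and not a description of how any particular calculator works internally; the function name and the token format are assumptions.

```python
def evaluate(tokens, hierarchy):
    """Evaluate a key sequence such as [2, '+', 3, '*', 4].

    Without hierarchy the operations are carried out strictly from left to
    right; with hierarchy multiplication is carried out before addition.
    Only '+' and '*' are supported in this sketch.
    """
    pairs = list(zip(tokens[1::2], tokens[2::2]))
    if not hierarchy:
        result = tokens[0]
        for op, value in pairs:
            result = result + value if op == '+' else result * value
        return result
    terms = [tokens[0]]            # summands collected so far
    for op, value in pairs:
        if op == '*':
            terms[-1] *= value     # multiply into the current summand
        else:
            terms.append(value)    # start a new summand
    return sum(terms)

keys = [2, '+', 3, '*', 4]
print(evaluate(keys, hierarchy=False))   # 20
print(evaluate(keys, hierarchy=True))    # 14
```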


Characteristic keys on programmable calculators include, among others, [LOAD], [Lp]
and [PROGR].


Outer appearance. The display has a maximum of 6 to 12 digits and a number of special characters,
particularly signs (minus only) and a decimal point. On the keyboard, the only features that have
been standardized are the relative positions of the figure keys [0] to [9] and the key for the decimal
point. The keys for entry, operations, functions and cancel, where demarcation is not always
consistent, are normally given different colours. Some keys have double/multiple lettering/symbols for
more than one function, and the user has to press a prefix key (function key, often [F]) to select the
particular function which, in the case of programmable calculators, also depends on whether the
calculator is in the loading or execution mode. The only parts accessible to the user inside are the
batteries or small accumulators.

Entering figures. These and the decimal point are entered from left to right, the negative sign being
added afterwards by pressing the [+/-] key. For example, −3.1 is entered as follows:
[3] [.] [1] [+/-].

Example: Entering the number −3.1

Input     Display
[3]       3.
[.]       3.
[1]       3.1
[+/-]     −3.1

Further input (of figures) is blocked when all the free digits in the display have been taken up. Before entry starts the display may show, instead of 0., the result of an earlier calculation, which is normally cleared as soon as the first figure is entered. However, there may be figures left in invisible registers (memory cells), and these could lead to unpleasant surprises in the further course of the calculation. In some cases, all the registers and the display will be in an undefined state when the calculator is switched on. It is therefore advisable to activate the cancel key once, or on some models a clear key twice, to clear all registers. Pressing a clear key once will clear the display and sometimes only the last digit entered. To cancel a wrong entry, press [CE]; to cancel a wrong negative sign, activate [+/-] again.

More sophisticated calculators feature floating-point representation. Here, 58 · 10^27 will be entered as [5][8][EEX][2][7], the display showing 58. 27, but the same figure will be displayed in the standard form 5.8 28 after activation of an operation key (or of [=]).
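The conversion to standard form can be sketched as follows. This is only an illustration of the idea, not the algorithm of any particular calculator; the function name `standard_form` and the use of Python's `decimal` module are assumptions made for the example.

```python
from decimal import Decimal

def standard_form(mantissa: str, exponent: int):
    """Shift the decimal point until exactly one non-zero digit precedes it.

    Hypothetical helper: returns the normalized mantissa and the adjusted exponent.
    """
    m = Decimal(mantissa)
    if m == 0:
        return m, 0
    while abs(m) >= 10:      # too many digits before the point
        m /= 10
        exponent += 1
    while abs(m) < 1:        # leading zero before the point
        m *= 10
        exponent -= 1
    return m, exponent

print(standard_form("58", 27))          # mantissa 5.8, exponent 28
print(standard_form("0.6666667", -4))   # exponent becomes -5
```

Decimal arithmetic is used so that the shifted mantissa is exact, much as a calculator working digit by digit would keep it.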


Example: Entering the number 58 · 10^27




For the number −0.0027 the following or similar input sequences are possible:
[0][.][0][0][2][7][+/-] or [.][0][0][2][7][+/-] or [2][7][+/-][EEX][+/-][4] or [2][.][7][+/-][EEX][+/-][0][3].

The number entered will then be stored in the "X register", and the display continues to be linked to this throughout the rest of the calculation. However, the figure 0.6666667 × 10^-4 can be entered as [.][6][6][6][6][6][6][7][EEX][+/-][4] or, at least, as [.][6][6][6][6][EEX][+/-][4]. The readout will then only be 0.6666 −04 or (standardized) 6.6666 −05. In the latter case, the X register contains 0.6666667 −04 or 6.666667 −05, which can only be displayed in part. Registers often contain other digits which cannot be directly displayed; see also the section on "hidden digits".

Simple operations. To calculate 19 + 85 one has to enter, on most calculators, [1][9][+][8][5][=], for which the readout on the display is 104. To arrive at this result, the calculator goes through the following steps:

The number 19 is first entered into the X register and then copied into a second register, the Y register, when the operation key [+] is pressed. This is where the calculator also remembers the addition operation. The X register is now free for the second operand 85 to be entered, and the addition proper is performed when the result key [=] is pressed. The sum, 104, is then contained in the X register and appears on the display. We can now explain what happens if we enter [1][9][+][=]: the result key causes the values in the X and Y registers to be added together and the readout is 38. (Some calculators, however, show the result 19 for [1][9][+][=], indicating that here the first operand is not copied into the Y register until the other has been entered.) No uniform results can be expected for the entry [1][9][+][8][5][=][2][0][=], where some calculators only show the 20 entered last because the Y register has been cleared after the addition. Other calculators give the result 39 (= 19 + 20), and in this case the content 19+ of the Y register has been retained after the addition and the second [=] begins the addition all over again (calculators with automatic constant for the first operand). Still other types (with more functions) keep +85 in the Y register after the first addition, so that [1][9][+][8][5][=][2][0][=] gives 105. This automatic constant for the second operand has an advantage when it comes to division. Computing sequences such as 3:57, 7:57, 11:57, ... which occur frequently in practice can then be entered in the form [3][÷][5][7][=][7][=][1][1][=] ... (with the [÷] key for division and the [×] key for multiplication). In some calculators, however, the automatic constant for the various fundamental operations is not consistent and may sometimes be switched on and off. It is convenient for raising whole numbers to a power: 3^4 is then entered as [3][×][=][=][=], and the result is 81.


Example: Different kinds of automatic constant

Input | X register (I / II / III) | Y register (I / II / III) | Display (I / II / III)
[1]   | 1                         |                           | 1.
[9]   | 19                        |                           | 19.
[+]   | 19                        | 19+                       | 19.
[8]   | 8                         | 19+                       | 8.
[5]   | 85                        | 19+                       | 85.
[=]   | 104                       | empty / 19+ / +85         | 104.
[2]   | 2                         | empty / 19+ / +85         | 2.
[0]   | 20                        | empty / 19+ / +85         | 20.
[=]   | 20 / 39 / 105             | empty / 19+ / +85         | 20. / 39. / 105.

I: calculators without automatic constant,
II: calculators with automatic constant for the 1st operand,
III: calculators with automatic constant for the 2nd operand, with algebraic logic in each case
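The three behaviours can be imitated with a small simulation. It is a sketch under simplifying assumptions (addition only, whole numbers, a hypothetical function `run` with a `constant` switch), not a model of any real calculator's circuitry.

```python
def run(keys, constant=None):
    """Simulate an adding calculator with algebraic logic.

    constant: None (type I), 'first' (type II) or 'second' (type III).
    """
    x = 0              # X register, which is also what the display shows
    y = None           # Y register: first operand while '+' is pending
    kept = None        # operand retained by the automatic constant
    entering = False   # True while the digits of one number are keyed in
    for key in keys:
        if key.isdigit():
            x = x * 10 + int(key) if entering else int(key)
            entering = True
        elif key == '+':
            y, entering = x, False          # copy X into Y, remember '+'
        elif key == '=':
            if y is not None:               # a pending addition is carried out
                first, second = y, x
                x = first + second
                kept = {'first': first, 'second': second}.get(constant)
                y = None
            elif kept is not None:          # repeat with the stored constant
                x = x + kept
            entering = False
    return x

for kind in (None, 'first', 'second'):
    print(kind, run("19+85=20=", kind))
```

The entry [1][9][+][=] corresponds to `run("19+=")`, where X and Y both hold 19 when the result key is pressed.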




Calculators with arithmetic logic perform multiplication and division in the same way as models with algebraic logic, but without automatic constant, whereas the addition key [+] is coupled with the result key [=] as a single key [+=]. To add 19 + 85 one has to enter [1][9][+=][8][5][+=]. Each time the [+=] key is activated the contents of the X and Y registers will be added into the X register, and the result then copied into the Y register. For the difference, the key [−=] has the same effect as the combination [+/-][+=], so that entering [1][9][+=][8][5][−=] gives the result −66. When [1][9][−=][8][5][+=] is entered the readout on the display will be 66 (−19 + 85 = 66). This makes it clear that we are basically dealing with operations that follow the operands.

For a calculator with arithmetic logic, the Y register must be cleared (press the clear key) before the input process starts. For multiplication and division, operations must be stored in the Y register, but not for addition and subtraction. [+=] and [−=] trigger several operations in quick succession.

Example: Doing 19 − 85 on a calculator with arithmetic logic

Input | Display
[1]   | 1.
[9]   | 19.
[+=]  | 19.
[8]   | 8.
[5]   | 85.
[−=]  | −66.

Calculators with reverse Polish notation consistently use the principle of subsequent operation, and pressing the keys [1][9][↑][8][5][+] gives the result 104. Entering the separation sign [↑] (or [ENTER]) copies the entry 19 into the Y register, no operation need be stored, and [+] starts the operation.

Example: Doing 19 + 85 on a calculator with reverse Polish notation

Input | X register | Y register | Display
[1]   | 1          |            | 1.
[9]   | 19         |            | 19.
[↑]   | 19         | 19         | 19.
[8]   | 8          | 19         | 8.
[5]   | 85         | 19         | 85.
[+]   | 104        |            | 104.


Hidden digits, overflow. Press [9][÷][7][=]. The accurate result is 1.285714... An eight-digit calculator will show this as 1.2857142 or 1.2857143. In the first case the other digits were cut off and in the second rounding up took place. Now enter [−][1][.][2][8][5][7][1][4][2][=]: either the result will be zero, if the calculator has no hidden digits, or else one or two more digits will be shown, e.g. 9. −08, in which case the calculator uses 9 digits internally instead of the 8 shown, the 9th being correctly rounded up. In other cases where the display might be 8. −08, for example, the 10th digit has been cut. But even if the 9th digit is correctly rounded up in the division 9:7, the calculation 1 + 9 · 10^-9 − 1 (entry [1][+][9][EEX][9][+/-][−][1][=]) sometimes does not give the rounded up result of zero. In other models, a hidden 10th digit can be coaxed out with display 9. −09, while the intermediate pressing of the result key (entry [1][+][9][EEX][9][+/-][=][−][1][=]) gives the rounded up readout 1. −08.

When the largest possible number is entered, e.g. 99999999 or 9.9999 · 10^99, and multiplied by 2, the calculator will signal an error (E, ERROR, several decimal points, flashing display etc.) or show the first digits of the result (19999999) with or without signalling an error (e.g. missing decimal point). For a downward overflow, e.g. [1][÷](largest number)[=][÷](largest number)[=], the calculator will also show an error or, correctly rounded, zero. The second alternative tells us that, just as in the case of an upward overflow, one cannot have blind faith in pocket calculators, and a check is in order from time to time.
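The hidden-digit test can be reproduced with decimal arithmetic of limited precision. The sketch below assumes that a calculator either truncates or rounds to a fixed number of significant digits; the helper name `quotient` is hypothetical, and real machines may differ in detail.

```python
from decimal import Decimal, Context, ROUND_DOWN, ROUND_HALF_UP

def quotient(a, b, digits, rounding):
    """Divide a by b keeping only `digits` significant digits."""
    return Context(prec=digits, rounding=rounding).divide(Decimal(a), Decimal(b))

DISPLAYED = Decimal("1.2857142")   # what an 8-digit display shows after cutting

# (internal digits, treatment of the last digit)
for digits, rounding in [(8, ROUND_DOWN), (9, ROUND_HALF_UP), (9, ROUND_DOWN)]:
    internal = quotient(9, 7, digits, rounding)
    # the remainder left after subtracting the displayed figure
    print(digits, rounding, internal, internal - DISPLAYED)
```

With 8 internal digits the remainder vanishes; with a rounded 9th digit it is 9 · 10^-8, and with a cut 9th digit it is 8 · 10^-8, matching the readouts described above.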




Percent key. The principal function here is that pressing the [%] key causes the amount in the X register to be divided by 100, and one may proceed as follows:
[6][5][0][×][3][%][=], where the readout is 19.5, which is 3% of 650, or
[6][5][0][÷][3][%][=], where the readout is 21666.667, which is 100% if 650 is 3%, or
[3][÷][6][5][0][%][=], where the readout is 4.6153 −01, which explains how much 3 is in relation to 650.

Example: How much is 3 of 650 in percent?

Input | X register  | Y register | Display
[3]   | 3           |            | 3.
[÷]   | 3           | 3÷         | 3.
[6]   | 6           | 3÷         | 6.
[5]   | 65          | 3÷         | 65.
[0]   | 650         | 3÷         | 650.
[%]   | 6.5         | 3÷         | 6.5
[=]   | 0.461538462 | 3÷         | 4.6153 −01

Some calculators also proceed as follows: [6][5][0][+][3][%][=] with the readout 669.5, which is 103% of 650, or [6][5][0][−][3][%][=] with the result 630.5, which is 97% of 650.

Example: A way to calculate 97% of 650

Input | Display
[6]   | 6.
[5]   | 65.
[0]   | 650.
[−]   | 650.
[3]   | 3.
[%]   | 19.5
[=]   | 630.5
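The percent calculations above amount to a few one-line formulas. The function names below are hypothetical and chosen only for this illustration.

```python
def percent_of(base, p):
    """[base][×][p][%][=]: p percent of base."""
    return base * p / 100

def base_for_share(share, p):
    """[share][÷][p][%][=]: the 100 % value if `share` is p percent."""
    return share / (p / 100)

def share_in_percent(part, base):
    """[part][÷][base][%][=]: how many percent `part` is of `base`."""
    return part / (base / 100)

def add_percent(base, p):
    """[base][+][p][%][=] on calculators that support it."""
    return base + percent_of(base, p)

def subtract_percent(base, p):
    """[base][−][p][%][=] on calculators that support it."""
    return base - percent_of(base, p)

print(percent_of(650, 3), add_percent(650, 3), subtract_percent(650, 3))
print(round(base_for_share(650, 3), 3), round(share_in_percent(3, 650), 4))
```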
Multiple operations. Let us enter 2 · 3 + 4. It will strike the non-expert as quite unusual to do this in reverse Polish notation, but it is indeed very logical to press [2][↑][3][×][4][+]. The [4] need not be preceded by a [↑]. By entering [4] after the operation, the intermediate result 6 is transferred to the Y register at the same time. After [2] the [↑] is needed as a separation sign to avoid entering 23.

Example: Calculating 2 · 3 + 4 on a calculator with reverse Polish notation

Input | X register | Y register | Display
[2]   | 2          |            | 2.
[↑]   | 2          | 2          | 2.
[3]   | 3          | 2          | 3.
[×]   | 6          |            | 6.
[4]   | 4          | 6          | 4.
[+]   | 10         |            | 10.


Almost all calculators with algebraic logic give a correct result when [2][×][3][=][+][4][=] is entered, and the [=] after [3] can be left out in most cases. Entering [+] leads both to calculation of the intermediate result 2 · 3 = 6, and storage of 6+ in the Y register. This is known as the short-cut technique.

Example: Calculating 2 · 3 + 4 on a calculator with algebraic logic

Input | X register | Display
[2]   | 2          | 2.
[×]   | 2          | 2.
[3]   | 3          | 3.
[+]   | 6          | 6.
[4]   | 4          | 4.
[=]   | 10         | 10.


The picture is less clear when we enter [2][+][3][×][4][=]. In this case, most algebraic calculators with short-cut facilities will show the intermediate result 5 after [×] is pressed, and the final result 20. This means that the calculator is performing the operation (2 + 3) · 4. This can be avoided by entering [3][×][4][+][2][=]; the correct result is then 14. Some calculators use brackets, and one can enter [2][+][(][3][×][4][)][=]. By pressing the bracket key [(] the contents of the Y register are transferred to a third, "Z" register. The X and Y registers are then free for transfer operations (display shows 0. or the X register is cleared by entering 3). After entering [3][×][4] there is 3× in the Y register and 4 in the X register. When the bracket key [)] is pressed the intermediate result 3 · 4 is calculated and displayed in the X register, then the 2+ stored in the Z register is transferred back to the Y register. Entering [=] begins the addition 2 + 12. The number of bracket nestings is not standardized, and there may often be 2, 7 or 15.

Example: Calculation of 2 + 3 · 4 using a calculator with algebraic logic and brackets

Input | X register | Y register | Z register | Display
[2]   | 2          |            |            | 2.
[+]   | 2          | 2+         |            | 2.
[(]   | 2 or 0     |            | 2+         | 2. or 0.
[3]   | 3          |            | 2+         | 3.
[×]   | 3          | 3×         | 2+         | 3.
[4]   | 4          | 3×         | 2+         | 4.
[)]   | 12         | 2+         |            | 12.
[=]   | 14         |            |            | 14.


Similar things happen in hierarchic calculators when [2][+][3][×][4][=] is entered. As soon as [×] is pressed the calculator gives it a higher rank than the [+] that is stored in the Y register. This causes 2+ to be transferred to the Z register while 3× goes into the Y register. The result key first connects the Y and X registers, followed by the transfer of the Z register contents to the Y register and then by connection of the X and Y registers again. In some cases the Z register is only used for addition and subtraction and the Y register just for multiplication and division.
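The difference between calculators with and without hierarchy can be sketched in a few lines. The two evaluators below are illustrative assumptions (hypothetical names `left_to_right` and `hierarchic`, with the pending operations held in Python lists in place of the Y and Z registers), not a description of actual calculator hardware.

```python
import operator

# operator symbol -> (rank, function); '*' and '/' outrank '+' and '-'
OPS = {'+': (1, operator.add), '-': (1, operator.sub),
       '*': (2, operator.mul), '/': (2, operator.truediv)}

def left_to_right(tokens):
    """Short-cut technique without hierarchy: each operator key
    completes the pending operation immediately."""
    acc = tokens[0]
    for op, operand in zip(tokens[1::2], tokens[2::2]):
        acc = OPS[op][1](acc, operand)
    return acc

def hierarchic(tokens):
    """Pending lower-rank operations are set aside (as in the Z register)
    until the higher-rank operation has been carried out."""
    values, pending = [tokens[0]], []

    def reduce_top():
        b, a = values.pop(), values.pop()
        values.append(OPS[pending.pop()][1](a, b))

    for op, operand in zip(tokens[1::2], tokens[2::2]):
        while pending and OPS[pending[-1]][0] >= OPS[op][0]:
            reduce_top()
        pending.append(op)
        values.append(operand)
    while pending:               # the result key works off what is left
        reduce_top()
    return values[0]

entry = [2, '+', 3, '*', 4]
print(left_to_right(entry), hierarchic(entry))
```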




Example: Calculating 2 + 3 · 4 using a hierarchic calculator

Input | X register | Y register | Z register | Display
[2]   | 2          |            |            | 2.
[+]   | 2          | 2+         |            | 2.
[3]   | 3          | 2+         |            | 3.
[×]   | 3          | 3×         | 2+         | 3.
[4]   | 4          | 3×         | 2+         | 4.
[=]   | 12         | 2+         |            |
      | 14         |            |            | 14.


For reverse Polish notation the entry is [2][↑][3][↑][4][×][+]. Each [↑] not only copies the X register into the Y register but also causes the contents that have accumulated in the Y register to be transferred to the Z register, and the contents of any following registers to be moved to the next one, so that whatever was in the last register is now lost. When [×] is entered, the product is written into the X register, and at the same time the contents of the Z register are transferred back into the Y register. The same happens with any following registers, the contents of the last one being retained. The procedure is similar for [+]:

Example: Calculating 2 + 3 · 4 using a calculator with reverse Polish notation. Where a fourth register is not available, y' = z' = 2 applies; for a calculator with exactly four registers, y' = y = z' applies, and for one with at least five registers, y' = y and z' = z.

Input | X register | Y register | Z register | Display
[2]   | 2          | y          | z          | 2.
[↑]   | 2          | 2          | y          | 2.
[3]   | 3          | 2          | y          | 3.
[↑]   | 3          | 3          | 2          | 3.
[4]   | 4          | 3          | 2          | 4.
[×]   | 12         | 2          | y'         | 12.
[+]   | 14         | y'         | z'         | 14.
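The register movements of reverse Polish notation can be followed with a short simulation. It is a sketch under the rules stated above (lift on the separation key, drop with the last register retained); the function `rpn` and the key name 'ENTER' are assumptions for the illustration.

```python
import operator

OPS = {'+': operator.add, '-': operator.sub,
       '*': operator.mul, '/': operator.truediv}

def rpn(keys, stack):
    """stack[0] is X, stack[1] is Y, and so on.
    The length of the list is the number of registers."""
    stack = list(stack)
    for key in keys:
        if key == 'ENTER':
            # lift: X is copied to Y, everything moves up, last register is lost
            stack = [stack[0]] + stack[:-1]
        elif key in OPS:
            # drop: Y (op) X goes to X, everything moves down, last one is retained
            result = OPS[key](stack[1], stack[0])
            stack = [result] + stack[2:] + [stack[-1]]
        else:
            stack[0] = key          # a number overwrites the X register
    return stack

keys = [2, 'ENTER', 3, 'ENTER', 4, '*', '+']
print(rpn(keys, [0, 'y', 'z']))              # three registers
print(rpn(keys, [0, 'y', 'z', 't']))         # four registers
print(rpn(keys, [0, 'y', 'z', 't', 'u']))    # five registers
```

The letters stand for whatever the upper registers held before the calculation, so the final lists show directly which of the old contents survive.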


An algebraic calculator of the hierarchic type or with brackets will easily solve problems of the kind 2 · 3 + 4 · 5, the entry in the former case being [2][×][3][+][4][×][5][=] and, in the latter, [2][×][3][+][(][4][×][5][)][=]; for a calculator with reverse Polish notation the entry would be [2][↑][3][×][4][↑][5][×][+], while other calculators would need a memory.

Problems of the kind (2 + 3) - (4 + 5) are insoluble even for hierarchic calculators unless they 
have a memory or additional brackets, but present no difficulties for calculators with brackets or 
reverse Polish notation. 

It may therefore be said that the least amount of rethinking occurs in hierarchic calculators with 
all other functions equal (i.e., entry is as required by the problem, the minus for negative numbers 
follows after); whereas other types have greater capacities, and models with reverse Polish notation 
involve the smallest number of keys. 

Memory. [X→M] or simply [M] will write the contents of the X register into the memory, while [RM] (read M) writes the contents of the memory back into the X register without clearing the memory. The contents of the memory are often indicated in the display if they do not equal zero. Some calculators have a key [X↔M] which causes the contents of the X register and the memory to be exchanged. Pressing [M+] or [M−] either adds the contents of the X register to the memory or subtracts them from it. [CM] clears the memory. Where this key is absent an alternative is [0][M], but this will also clear the X register. The problem 2 · 3 + 4 · 5 + 6 · 7 can be solved by pressing

[CM][2][×][3][=][M+][4][×][5][=][M+][6][×][7][=][M+][RM].

In the absence of M+ one can proceed as follows:

[2][×][3][=][M][4][×][5][+][RM][=][M][6][×][7][+][RM][=].




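Example: Calculating 2 · 3 + 4 · 5 + 6 · 7 using the M+ key

Input | X register | Y register | Memory | Display
[CM]  |            |            | 0      |
[2]   | 2          |            | 0      | 2.
[×]   | 2          | 2×         | 0      | 2.
[3]   | 3          | 2×         | 0      | 3.
[=]   | 6          |            | 0      | 6.
[M+]  | 6          |            | 6      | 6.M
[4]   | 4          |            | 6      | 4.M
[×]   | 4          | 4×         | 6      | 4.M
[5]   | 5          | 4×         | 6      | 5.M
[=]   | 20         |            | 6      | 20.M
[M+]  | 20         |            | 26     | 20.M
[6]   | 6          |            | 26     | 6.M
[×]   | 6          | 6×         | 26     | 6.M
[7]   | 7          | 6×         | 26     | 7.M
[=]   | 42         |            | 26     | 42.M
[M+]  | 42         |            | 68     | 42.M
[RM]  | 68         |            | 68     | 68.M

Example: Calculating 2 · 3 + 4 · 5 + 6 · 7 without the M+ key

Input | X register | Y register | Memory | Display
[2]   | 2          |            |        | 2.
[×]   | 2          | 2×         |        | 2.
[3]   | 3          | 2×         |        | 3.
[=]   | 6          |            |        | 6.
[M]   | 6          |            | 6      | 6.M
[4]   | 4          |            | 6      | 4.M
[×]   | 4          | 4×         | 6      | 4.M
[5]   | 5          | 4×         | 6      | 5.M
[+]   | 20         | 20+        | 6      | 20.M
[RM]  | 6          | 20+        | 6      | 6.M
[=]   | 26         |            | 6      | 26.M
[M]   | 26         |            | 26     | 26.M
[6]   | 6          |            | 26     | 6.M
[×]   | 6          | 6×         | 26     | 6.M
[7]   | 7          | 6×         | 26     | 7.M
[+]   | 42         | 42+        | 26     | 42.M
[RM]  | 26         | 42+        | 26     | 26.M
[=]   | 68         |            | 26     | 68.M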




Simple functions. When function keys such as [+/-] or [x²] are pressed (possibly preceded by actuating the prefix key F), then normally only the contents of the X register will change; the operands in the Y register are retained, thus eliminating the need for a memory. This takes precedence over the small saving that would occur in the number of key operations. Pressing [1/x] repeatedly is a way of checking the repetitive accuracy of the calculator, for the numbers must not drift away. The [1/x] key is useful for calculations of the type 2 + 1/(3 + 1/4), which can be performed by pressing

[4][1/x][+][3][=][1/x][+][2][=].

Please note that the arguments must appear in the display, either as a result or as an entry, before the function keys can be actuated. If the entry is [2][+][1/x][3][=], the result will be 5 instead of the 2.3333333 you may have expected.

Example: Calculator with algebraic logic, wrong entry

Entry | X register | Y register | Display
[2]   | 2          |            | 2.
[+]   | 2          | 2+         | 2.
[1/x] | 0.5        | 2+         | 0.5
[3]   | 3          | 2+         | 3.
[=]   | 5          |            | 5.

This tells us that all calculators use reverse Polish notation as far as functions are concerned. A key which triggers off a nullary function, in the mathematical sense, is [π], which brings π into the X register. If there is a key [arc tan x], then π can also be calculated by pressing [4][×][1][arc tan x][=]. The subtraction of the "two" π tells us something about the accuracy of the calculator.
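The comparison of the "two" π can be tried on any machine with an arctangent function. The sketch below uses Python's `math` module in place of the calculator keys, with the variable names chosen only for this example; the argument of the arctangent is taken in radian measure.

```python
import math

pi_key = math.pi                # the value supplied by the [π] key
pi_arctan = 4 * math.atan(1)    # [4][×][1][arc tan x][=]

# The difference hints at the accuracy of the arctangent routine.
print(pi_key - pi_arctan)
```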

Other functions. Scientific calculators are also equipped with the root function, trigonometric functions, exponential functions, the inverse functions thereof and, possibly, hyperbolic functions, conversion functions for polar and spherical coordinates, and statistical functions. When these function keys are pressed, fixed stored programmes are operated for the approximate calculation of the function values, and one has to wait one or two seconds for the result. For example, if the entry is [π][√x][x²][−][π][=], there will be a remainder which to some extent indicates the accuracy of the root programme stored in the calculator.

Example: Test for the accuracy of a calculator's root program

Input | X register  | Y register  | Display
[π]   | 3.14159265  |             | 3.1415927
[√x]  | 1.77245384  |             | 1.7724538
[x²]  | 3.14159261  |             | 3.1415926
[−]   | 3.14159261  | 3.14159261− | 3.1415926
[π]   | 3.14159265  | 3.14159261− | 3.1415927
[=]   | −0.00000004 |             | −4. −08


This example also shows that the last digits indicated are not always reliable. Nevertheless, a test of two calculators solving the problem e^(2π) ([2][×][π][=][e^x]) gave the results of 535.49164 and 535.49165 respectively, which is surprisingly accurate when one considers that the value found in a table is 535.491656. The error is greater, however, when a comparison is made between sin 0.0005 and cos (π/2 − 0.0005), which is the same from the mathematical point of view but which gives the results 4.9999999 · 10^-4 and 5.003681 · 10^-4 on the one calculator, and 4.9999998 · 10^-4 and 5.000019 · 10^-4 on the other. But this would be a rather extreme case of course.
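The same checks can be repeated in double-precision floating-point arithmetic. The sketch below is not a model of the calculators tested; it merely shows, with Python's `math` module and variable names chosen for the example, how such a comparison is set up.

```python
import math

e_2pi = math.exp(2 * math.pi)               # [2][×][π][=][e^x]
sin_value = math.sin(0.0005)
cos_value = math.cos(math.pi / 2 - 0.0005)  # mathematically the same number

print(e_2pi)
print(sin_value, cos_value, abs(sin_value - cos_value))
```

The cosine form is the more delicate one because π/2 − 0.0005 must first be formed, so that any error in the stored value of π enters the argument directly.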

When normal levels of accuracy are required, the calculator completely surpasses the conventional 
tables, and interpolation is no longer necessary. 




A particular feature is the [y^x] key, which is used to calculate e^(x · ln y), so that an error is indicated when y is negative. Some calculators have an [x^y] key instead (which could simply be the same, or else which requires either an exchange in the argument entry or actuation of the register exchange key [x↔y]). In some (but not all) hierarchic calculators, raising to a power takes precedence over multiplication and division, and these models have (at least) four registers.
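Why a negative base leads to an error can be seen from a direct transcription of the formula. The function name `power_key` is hypothetical; the sketch only assumes that the power is formed as e^(x · ln y), as stated above.

```python
import math

def power_key(y, x):
    """Compute y^x as e^(x · ln y), the way the [y^x] key is described."""
    return math.exp(x * math.log(y))   # math.log raises ValueError for y <= 0

print(power_key(2, 10))     # close to 1024, but formed via logarithms
try:
    power_key(-8, 2)        # (-8)^2 = 64 exists, yet ln(-8) does not
except ValueError as error:
    print("error:", error)
```

The result for whole-number exponents is therefore only approximate, which is one reason why repeated multiplication with the automatic constant can be preferable for small powers.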


Example: Calculation of 5 + 4 · 3² using a calculator with double hierarchy

Entry | X register | Y register | Z register | T register | Display
[5]   | 5          |            |            |            | 5.
[+]   | 5          | 5+         |            |            | 5.
[4]   | 4          | 5+         |            |            | 4.
[×]   | 4          | 4×         | 5+         |            | 4.
[3]   | 3          | 4×         | 5+         |            | 3.
[y^x] | 3          | 3 y^x      | 4×         | 5+         | 3.
[2]   | 2          | 3 y^x      | 4×         | 5+         | 2.
[=]   | 9          | 4×         | 5+         |            |
      | 36         | 5+         |            |            |
      | 41         |            |            |            | 41.


The unit of measurement is important for trigonometric functions and their inverses, and most calculators have a slide control for DEG (degree)/RAD (radian measure) and, possibly, GRD (grade). To convert these into one another, one can use a simple trick. If, for example, 40° is to be converted into radian measure, one can enter, with the slide control on DEG, [4][0][sin x], then set the slide to RAD and enter [arc sin x] (also known as [sin⁻¹]), and the result will be 0.698. An alternative is, of course, [4][0][×][π][÷][1][8][0][=].
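The conversion trick can be checked as follows. The function name `deg_to_rad_trick` is an assumption for this illustration; `math.radians` stands in for the DEG position of the slide control.

```python
import math

def deg_to_rad_trick(degrees):
    """Sine taken in degree mode, arc sine taken in radian mode."""
    s = math.sin(math.radians(degrees))   # [sin x] with the slide on DEG
    return math.asin(s)                   # [arc sin x] with the slide on RAD

print(round(deg_to_rad_trick(40), 3))     # 40° in radian measure
print(round(40 * math.pi / 180, 3))       # the direct conversion for comparison
```

Since the arc sine only returns values between −π/2 and π/2, the trick works as intended only for angles between −90° and 90°; for 120°, for example, it yields the radian measure of 60°.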

All functions make it necessary to enter the argument first, possibly followed by pressing a prefix key and finally the function key. As a rule, the [=] key need not be actuated in these calculations.

Statistical calculations are possible with some calculators, using the memory and the Z and T registers. When a sequence of numbers is entered, separated by [Σ+], one of the three memories stores the number of entries (which can be recalled with [n]), the second stores the sum (this can be recalled with [Σx]), and the third contains the quadratic sum ([Σx²]). One can then recall the average x̄ = Σx/n with [x̄], and the standard deviations √(Σ(x − x̄)²/(n − 1)) and √(Σ(x − x̄)²/n) with [σn-1] and [σn], and sometimes the square of one of those numbers too.
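That three running totals suffice for the average and both standard deviations follows from Σ(x − x̄)² = Σx² − (Σx)²/n. The class below is a minimal sketch of this bookkeeping; the names `StatRegisters`, `add`, `mean` and `std` are hypothetical.

```python
import math

class StatRegisters:
    """Keeps only n, the sum and the quadratic sum, like the three memories."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_x2 = 0.0

    def add(self, x):
        """Corresponds to keying in one number of the sequence."""
        self.n += 1
        self.sum_x += x
        self.sum_x2 += x * x

    def mean(self):
        return self.sum_x / self.n

    def std(self, sample=True):
        """Standard deviation with divisor n − 1 (sample) or n."""
        squares = self.sum_x2 - self.sum_x ** 2 / self.n
        return math.sqrt(squares / (self.n - 1 if sample else self.n))

registers = StatRegisters()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    registers.add(value)
print(registers.n, registers.mean(), registers.std(sample=False), registers.std())
```

For data with a large mean and a small scatter, the subtraction Σx² − (Σx)²/n can lose many digits, which is another case where hidden digits matter.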


Several addressable memories. Where several memories are present the contents of the X register, for example, will be stored in memory 4 by pressing [STO][4]; with [RCL][4] (recall) the number can be returned to the X register. Several stores are useful in such applications as statistical evaluation. When a sequence of numbers such as a_1, a_2, ..., a_i, ... occurs one can store the partial sums Σ a_i (i = 1, ..., n) in memory 1, the related square sums Σ a_i² (i = 1, ..., n) in memory 2, and possibly count the summands in memory 3 (n); all three memories are required to calculate the scatter for the sequence of numbers. Sometimes a computing plan is useful, as is shown here for solving the quadratic equation x² + px + q = 0.
In this way one can establish small programme libraries which will be particularly useful for programmable calculators.
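A computing plan for the quadratic equation can be written down as a short function. The sketch assumes the usual solution formula x = −p/2 ± √((p/2)² − q); the intermediate variables play the part of the memories, and the name `solve_quadratic` is hypothetical.

```python
import math

def solve_quadratic(p, q):
    """Real solutions of x² + px + q = 0, computed step by step."""
    half = -p / 2                      # first intermediate value: −p/2
    discriminant = half * half - q     # (p/2)² − q
    if discriminant < 0:
        raise ValueError("no real solutions")
    root = math.sqrt(discriminant)
    return half + root, half - root    # first and second solution

print(solve_quadratic(-5, 6))          # x² − 5x + 6 = 0
```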




Example: Plan for solving a quadratic equation x² + px + q = 0

(Key-sequence table using [STO] and [RCL]; the individual key entries are not recoverable. Legible annotations, in order: −p/2; first solution; second solution. Note in the plan: calculators with several memories have root functions.)

Programmable calculators. These have an additional programme memory with a capacity ranging from about 30 to more than 1,000 steps. For example, the abovementioned programme for calculating the roots of quadratic equations can be as shown in the following. It is assumed that the coefficients p and q are in the memories 1 and 2.


Example: A program for solving a quadratic equation

Key    | Comment
[LOAD] | Transition to programming mode
       | Set instruction counter to zero
       | The individual instructions are accompanied in the display by a check computation or a code number for the key or combination of keys that has just been depressed, and by a serial number (shown left here)
       | To display the value for x₁
       | The value for x₂ is indicated

(The key column of the remaining program steps is not recoverable.)


If you want to run the programme, make sure that the values for p and q are in memories 1 and 2. (Alternatively, one can use suitable instructions to schedule a programme so that the calculator stops for the entry of certain values and then proceeds to store these automatically after a certain period or when a key is actuated.) If this is the case, reset the instruction counter from 22 to 00 and then press [RUN]. The calculator will then run as if the user had pressed the keys numbered 00 to 14. It will then stop so that x₁ can be read off. When [RUN] is pressed again, the programme continues through steps 16 to 20. At that stage the calculator comes to a final stop, and the value for x₂ can be read off. In our example, the instruction counter is reset to 00 after [STOP] but not after [HALT], so that pressing [RUN] again would repeat the calculation. Prior to that, one can change p and q in memories 1 and 2.

It will now be clear that a small programme such as this consisting of 22 steps utilizes a considerable 
part of the programme store in a small programmable calculator. It is therefore desirable for the 
programme store to have a minimum of 100 locations. Our programme will be slightly shorter if 
we use reverse Polish notation. Not all calculators will indicate situations where the capacity of the 
programme store has been exceeded before overwriting it from the beginning. 

It is important to check the accuracy of the programme entered and to test it thoroughly before use. For this purpose one can press the [SST] key (single step) repeatedly to run the programme sections individually, which will be particularly necessary in cases where a faulty programme would run endlessly. One can then interrupt the programme with [STOP] and find the defect with [SST].


In the course of time, the user will establish a small programme library, and the fact that switching the calculator off erases all programme and data stores is then all the more regrettable. It means that the programmes and data in question must once again be entered and tested the next time. In the case of longer programmes, this is laborious, and errors cannot always be avoided. Some calculators therefore have hold circuits, which preserve the contents of programme and data stores for a certain length of time (possibly until the batteries need changing). Other models can record the programme and data on small permanent magnetic cards, from which they can be read in again at a later date without any errors or loss of time. In some cases cassette recorders can be connected as permanent memories.

In the above programme, that which is stored is merely a sequence of key operations, the only true programme instructions being [HALT] and [STOP]. In most programmable calculators one needs a greater diversity of contents and a larger capacity, for example commands such as "if x > 0, go to nn, otherwise carry out the next instruction".

Let us consider an example where a certain amount of money (debts) yields 4.5% interest and is 
repaid at a rate of 1.5% per year plus the saving in interest payments. The question could then be, 
how many years are needed to repay the debt. The result (32 years) appears in the X register at the 
end of the operation and is shown in the display. The amount of money initially owed is irrelevant 
and is therefore assumed to be 1. 
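The repayment problem can be stated as a short loop with a conditional jump. The sketch assumes that the annual payment (annuity) stays fixed at 4.5% + 1.5% = 6% of the initial debt and that interest is added once a year; the function name `years_to_repay` is hypothetical.

```python
def years_to_repay(interest=0.045, repayment=0.015):
    """Count the years until a debt of 1 is paid off with a fixed annuity."""
    debt = 1.0                                  # the initial sum is irrelevant
    annuity = (interest + repayment) * debt     # 6 % of the initial debt
    years = 0
    while debt > 0:                             # test: is there still debt left?
        debt = debt * (1 + interest) - annuity  # add interest, subtract annuity
        years += 1                              # annual counter
    return years

print(years_to_repay())
```

The `while` condition corresponds to the conditional jump of the calculator programme, and the end of the loop body to the unconditional jump back to the test.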

Often jumps to subprogrammes are possible. These are jumps to particular programme sections 
together with jumps back to the next instruction after “jump to subprogramme”’. The calculator 
must therefore remember the number of this follow-up instruction: the return address. 

From the viewpoint of computer science, the programming of programmable calculators is 
carried out in a code that is very close to machine language. There are also calculators with facilities 
for programming in a high-level language (BASIC), but these are closer to microcomputers under 
functional aspects and are therefore not dealt with here separately. 




Example: A program to calculate the time of repayment for a debt

(The step numbers and keys of most program lines are not recoverable; the comments are given in program order.)

Step | Comment
     | Initial sum
     | The money owed at a particular time is in memory 1
     | 6% of the initial debt = annual repayment (annuity)
     | Annuity in memory 2
     | Annual counter
11   | Debts to X register
     | Test to see if contents of the X register (remaining debt) are still positive
12   | If not, i.e. x ≤ 0, continue with step 30 (result output), otherwise proceed to the next step (13). The destination address 30 cannot be inserted until the rest of the program is complete, or a large enough number is selected to be on the safe side (this is suitable also for modifications).
     | Debt multiplied by 1.045 = debt plus interest
     | Debt plus interest minus annuity = new debt in memory 1
     | Annual counter set one year forward
28   | [GOTO]: unconditional (return) jump
29   | to step 11 to test and select program cycle repeat or result output
30   | Result output
31   |

Program running, final display: 32

In some calculators the steps 23 to 27 can be summarized.


46. Microcomputers 


Historical background. The microcomputer, on the one hand, is a logical step in the development 
of microelectronics which has already given us pocket calculators, particularly of the programmable 
type and, on the other hand, embodies a variety of concepts and system architectures as realized 
especially in third generation computers. 

The first integrated circuit (one transistor, three resistors, one capacitor on one germanium chip) 
was put together in a laboratory in 1958 and commercial sales started in 1962 (with eight transistors 
on one silicon chip). In 1971, one chip carrying 2,300 transistors represented the first central processing
unit of a functional if not very versatile computer. The 4-bit microprocessor was born, in
which a group of 4 bits (one tetrad) is the smallest information unit that is processed at a time. Hence
the frequently used classification as n-bit computers (with n = 8, 16, 32). As the integration of
circuits increased (reaching 8,000 to 10,000 transistors per chip after 1974), both the capacity and 
functions of computers grew at a rapid pace. 

All the concepts developed up to the third generation were again applied to these much smaller 
systems, and new technical features were incorporated. 

In 1975 the first set of components became available for a small home-built computer, followed 
in 1977 by the first complete microcomputer. The decade thereafter saw an unprecedented massive 
spread of these systems amounting to a genuine scientific and technical revolution. No longer need 
one carry a problem (after preparation) to a computer (that is not always accessible in a computer 
centre where it requires special attendance), but the computer (with all its accessories) can be ‘‘close 
to the problem’’ because 


— systems are small (500 g to 5 kg for desk top models), 

— have a low power consumption (< 25 W), 

— work at ever-increasing speeds, 

— continue to go down in price, and 

— are extremely versatile (in conjunction with a variety of electronic assemblies). 


Computer systems have penetrated all spheres of daily life, on and off the job. The microcomputers 
of our time defy an exact classification as to design or performance, leaving only the broad criteria 
of use as home computers and personal computers. 

Construction. This is a useful way of differentiating between systems.

The pocket computer is a direct extension of the programmable pocket calculator but no longer
key-programmed. It uses a higher-level programming language instead (which is often BASIC),
is portable and receives its power from different types of batteries.

The video computer works in conjunction with other home appliances (such as TV sets and cassette 
recorders) and (in the simplest configuration) consists of the computer proper and a keyboard 
(similar to that of a typewriter). It is mains-connected and programmable in at least one programming 
language. 

The personal computer (workplace/office computer) is an autonomous system and has its own 
screen, auxiliary stores (often with floppy-disk connection), printer etc. for a growing range of 
professional uses. 




Video and personal computers are both expandable and configurable for use with many other 
devices such as memory extension assemblies, joysticks, light pens, bar code readers, interfaces for 
process signals, voice input and output devices, telephone transmission facilities etc. 

Setup and functions. Let us consider a minimum configuration consisting of a computer, keyboard, 
cassette recorder and TV set with the following functions: 


— through the keyboard the user can communicate with the computer and load a program/data 
(for the first time) or enter control statements; 

— programs (stored/to be processed) can be shown on the TV screen, input data can be checked and 
results (output data) displayed; in dialogue-oriented programs (mostly games) the screen is a 
work area; most systems also have an acoustic output (a beep in the simple versions but also 
sound patterns from built-in synthesizers heard over the TV loudspeaker); 

— the cassette recorder makes it possible to input programs which are stored on (commercial) cassettes and
to output programs and data from the memory to cassettes; these programs may be vital for the operation
of the system (software), as in an operating system. They may also serve as an interpreter for a
higher-level programming language or as application programs assisting the user in learning,
playing, etc.;

— the computer as such controls the system and all information processing. 


These components are physically there and represent the hardware of the system. 

Once a decision has been made for a model to be purchased one must (despite all standardization) 
make sure that the parts of the system are compatible especially if existing devices (TV set, recorder) 
are to be included in the configuration. This saves time and money. It is equally important to be 
aware of the demands to be made on the system that is being set up (purchased), and whether it 
is sufficiently flexible (and reasonably priced) for extension. Basic components are often quite cheap 
but adding on may then become much more expensive than a higher initial investment. It is advisable
to get as much advice as possible before making a start. When all components for the
system have been acquired, read the instructions for mounting and installation carefully. Many 
disappointments can be avoided with an accurate and planned approach (minimum distance between 
the TV set and cassette recorder, switching-on sequence, etc.). Plug-and-socket connections can be 
marked; free-hanging cables should be avoided (at home). Give the equipment the care and
maintenance it needs (clean contacts, keyboards etc.).

Whether you use “‘only”’ prepared software (programs) which is fully sufficient for many purposes, 
or whether you write your own programs (which, of course, can be quite fascinating), observe the 
logic of interaction between your key components. 

Even in the minimum configuration, the hardware components fall into four functional
categories:


a) the central processor with ROM and RAM, 
b) input devices, 

c) output devices, 

d) auxiliary memories. 


The interaction of your hardware, all types of input and output and the processing of data (char- 
acters) in the computer are controlled by software using a variety of programs for different jobs. 

The operating system is a key element of your software and absolutely necessary for operating 
the system regardless of the user’s intentions. At the heart of it is a ROM (Read Only Memory) 
which controls the system after start-up. 

Using a RESET function (a key in most cases), the user can always return to this point and start 
anew in a defined manner. At the worst, if all keys and functions fail and the computer has ‘“‘ crashed’’, 
switch it off and on again. These crashes may be caused by faulty machine programs. 

It is absolutely necessary to know the functions of the operating system. In many cases (in the 
simpler configurations) the whole system is accommodated in the ROM. More comfortable setups
(especially where floppy-disk systems are connected) have facilities for loading other parts of the 
operating system into the RAM (Random-Access Memory). 

All the information in the ROM is preserved when the computer is switched off and remains 
available. It cannot, however, be modified by the user. The RAM, on the other hand, is a read-write
memory to be filled with programs and data (sometimes automatically) when the system has
been started. With instructions at system level the user can expand the capacity of his system. In the 
simplest case, when using application software, the respective programs can be loaded into the main 
data memory (LOAD), started (RUN) and filed in external memories after completion (or possible 
modifications) (SAVE). The system core often has programs to check the correct function of the 
system (AUTOTEST). The more efficient and comfortable the selected computer system, the more 
comprehensive and sophisticated its operating system. The recommendation can only be repeated 
here that a close study should be made of the available functions. One can then use prepared software 
such as programs for games, writing systems for daily correspondence etc. 


46. Microcomputers 747 


The efficiency of a system increases with a printer (available in many types) and the connection 
of floppy-disk systems. Floppy disks are flexible plastic disks with a magnetic coating permanently 
encased in a plastic envelope which give random access to large amounts of information. 

By connecting printers and floppy-disk systems the transition is made to office and personal 
computers. Even though the user need not concern himself with the internal functions of a mini- 
computer system and the way the hardware operates, it is worthwhile picturing the arrangement of 
the RAM. 

It consists of a systematic sequence of memory cells numbered from 0 to N-1, mostly with a width 
of 8 bits and then called bytes. 


0. byte             1. byte                   (N-2). byte        (N-1). byte
|7|6|5|4|3|2|1|0|   |7|6|5|4|3|2|1|0|   ...   |7| ... |0|        |7| ... |0|


Often the bits within a byte are numbered from 0 to 7 from right to left. Since each bit can only
be 0 or 1, exactly $2^8 = 256$ different binary combinations are possible for the contents of a byte.

If the value $2^i$ is assigned to bit $i$ of a byte, then its contents can be directly interpreted as a binary
number.
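The weighting of the bits can be illustrated by a minimal sketch in Python (a language not treated in this chapter; the name `byte_value` is chosen here purely for illustration). It interprets a string of eight binary digits, written with bit 7 on the left, as a number by adding the weight $2^i$ for every bit $i$ that is set.

```python
from itertools import product


def byte_value(bits):
    """Interpret 8 binary digits (bit 7 first, bit 0 last) as a number."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly eight characters 0 or 1")
    # Bit i, counted from the right starting at 0, carries the weight 2**i.
    return sum(int(b) * 2 ** i for i, b in enumerate(reversed(bits)))


# All possible contents of one byte: 2**8 combinations of 0 and 1.
all_bytes = ["".join(p) for p in product("01", repeat=8)]

print(len(all_bytes))            # 256
print(byte_value("00001111"))    # 15
print(byte_value("11110000"))    # 240
```

The smallest content, 0000 0000, has the value 0 and the largest, 1111 1111, the value 255.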

Quite often the content of a byte is interpreted as two consecutive hexadecimal digits by combining
4 bits into a tetrad and assigning the value $16^0$ to the right tetrad and the value $16^1$ to the left tetrad.


Example: Representation as decimal number of the byte contents read as binary number:

Binary number    Decimal number
0000 0000        0
0000 0001        1
0000 0010        2
0000 0011        3
0000 0100        4
...
0000 1111        15
...
1111 0000        240
1111 0001        241
...
1111 1111        255

Binary number    Hexadecimal representation    Conversion                         Decimal number
0000 0000        0 0                           $0 \cdot 16^1 + 0 \cdot 16^0$      0
0101 0001        5 1                           $5 \cdot 16^1 + 1 \cdot 16^0$      81
1001 1001        9 9                           $9 \cdot 16^1 + 9 \cdot 16^0$      153


Since the numbers 10, 11, ..., 15 can also be represented by 4 bits, six additional digits are required
for the hexadecimal representation, which are assigned in the following way:

A = 10, B = 11, C = 12, D = 13, E = 14, F = 15.




Example: The hexadecimal number
$\mathrm{EB} = 14 \cdot 16^1 + 11 \cdot 16^0 = 1 \cdot 2^7 + 1 \cdot 2^6 + 1 \cdot 2^5 + 0 \cdot 2^4 + 1 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2^1 + 1 \cdot 2^0 = 235$
corresponds to the binary number 1110 1011.


From these considerations it can be seen that it does not matter at all whether the binary number
is converted directly into the corresponding decimal equivalent as a sum of powers of two or split
up into two tetrads and the conversion performed hexadecimally. Quite often the hexadecimal
representation is only a short and easy description of the binary contents of a byte without any
particular numerical meaning.


Example: The representations 38, AB, 7C describe the byte contents

0011 1000    1010 1011    0111 1100
3    8       A    B       7    C


This representation is frequently used. 
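The equivalence of the two ways of conversion can be checked with a short Python sketch. The function names `from_hex`, `from_binary` and `to_tetrads` are illustrative only; the digit values A = 10 to F = 15 are the usual hexadecimal convention.

```python
HEX_DIGITS = "0123456789ABCDEF"   # the position of a digit is its value


def from_hex(pair):
    """Value of a two-digit hexadecimal number such as 'EB'."""
    left, right = (HEX_DIGITS.index(c) for c in pair)
    return left * 16 ** 1 + right * 16 ** 0


def from_binary(bits):
    """Value of a binary number such as '1110 1011' (blanks are ignored)."""
    digits = bits.replace(" ", "")
    return sum(int(b) * 2 ** i for i, b in enumerate(reversed(digits)))


def to_tetrads(pair):
    """Byte contents described by two hexadecimal digits, as two tetrads."""
    return " ".join(format(HEX_DIGITS.index(c), "04b") for c in pair)


print(from_hex("EB"), from_binary("1110 1011"))    # 235 235
for pair in ("38", "AB", "7C"):
    print(pair, to_tetrads(pair))
```

Both conversions of EB give 235, and the loop prints the three byte contents of the last example.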

The interpretation of the contents of a byte as a character is of fundamental importance. This
assignment is called a code, and in most cases ASCII (American Standard Code for Information
Interchange) is used in microcomputers, see Table.


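Coding to ASCII

Hexadecimal  Character    Hexadecimal  Character    Hexadecimal  Character
code                      code                      code

20           Space        36           6            4C           L
21           !            37           7            4D           M
22           "            38           8            4E           N
23           #            39           9            4F           O
24           $            3A           :            50           P
25           %            3B           ;            51           Q
26           &            3C           <            52           R
27           '            3D           =            53           S
28           (            3E           >            54           T
29           )            3F           ?            55           U
2A           *            40           @            56           V
2B           +            41           A            57           W
2C           ,            42           B            58           X
2D           -            43           C            59           Y
2E           .            44           D            5A           Z
2F           /            45           E            5B           [
30           0            46           F            5C           \
31           1            47           G            5D           ]
32           2            48           H            5E           ^
33           3            49           I            5F           _
34           4            4A           J
35           5            4B           K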

Unprintable control characters (such as cursor motions on the screen) which are not uniform 
on microcomputers and which often vary from the ASCII Standard correspond to the hexadecimal 
values 00 to 1F. The values 7F to FF are frequently used for coding graphic symbols and other 
control characters (e.g. for the colours). 




A code of this kind provides the basis for text processing, which may seem surprising at
first. The input, handling, storage, and output of texts is a handling of bytes and byte
contents inside the computer. At the same time, it can be seen that the representation and processing
of numbers is a special case of this general processing of characters.
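As an illustration, the following Python sketch shows a short text as the sequence of byte contents that stands for it under ASCII, written in hexadecimal, and recovers the text again. The sample text is arbitrary.

```python
text = "BASIC 1"

# Each character is stored as one byte; show the byte contents in hexadecimal.
codes = [format(byte, "02X") for byte in text.encode("ascii")]
print(codes)        # ['42', '41', '53', '49', '43', '20', '31']

# The reverse direction: from byte contents back to characters.
restored = bytes(int(code, 16) for code in codes).decode("ascii")
print(restored)     # BASIC 1

# A digit character is itself only a code; its numerical value is obtained
# from the distance to the code of the character "0" (hexadecimal 30).
digit_value = ord("7") - ord("0")
print(digit_value)  # 7
```

The last lines show in what sense numbers are a special case: the character 7 is stored as the code 37 and becomes a number only by interpretation.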

These considerations also show how efficient the assemblies must be which are arranged between 
keyboard and memory or between memory and printer. Operation of a certain key must cause the 
transfer of the corresponding byte contents, the availability of byte contents, the drive of a certain 
print character etc. 

Memory sizes are indicated in KBytes (i.e. Kilo-Bytes): 

1 KByte = 4 pages = 1024 bytes,
1 page = 256 bytes,
1 byte = 8 bits,
1 word = 2 bytes = 16 bits.


When 2 bytes (16 bits) are used for numbering (addressing) a memory, 64 KBytes can be directly 
addressed. For example, the capacity of a floppy-disk is about 640 KBytes. 
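The figure of 64 KBytes follows directly from the units listed above, as the following Python sketch shows. The split of an address into page number and position within the page is added here as an illustration; the sample address 50000 is arbitrary.

```python
BITS_PER_BYTE = 8
BYTES_PER_PAGE = 256
BYTES_PER_KBYTE = 1024

address_bits = 2 * BITS_PER_BYTE            # two bytes are used for the address
addressable_bytes = 2 ** address_bits       # number of different addresses

print(addressable_bytes)                    # 65536
print(addressable_bytes // BYTES_PER_KBYTE) # 64 (KBytes)
print(addressable_bytes // BYTES_PER_PAGE)  # 256 (pages)


def split_address(address):
    """Split a 16-bit address into (page number, position within the page)."""
    return divmod(address, BYTES_PER_PAGE)


print(split_address(50000))                 # (195, 80)
```

With this split the left byte of an address selects one of 256 pages and the right byte one of the 256 bytes within that page.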

Programming of microcomputers. The final form of representing a program to be read by the 
computer is the machine language. Instruction formats, which are exactly determined and typical 
of the respective computer, show which instruction has to be executed, viz., which operations are 
applied to which operands. The set of all available instructions is the instruction list. 


Each instruction contains an operation code and an operand part.


Both the operation code and the operand part are sequences of bits and occupy together a certain 
number of bytes. 

The first possibility of programming is to immediately enter the bit combinations required into 
the corresponding memory cells. This type of programming is very laborious and error-prone and is
hardly used at present (not at all by beginners).

The so-called assembly languages are of greater use. The operations are coded by mnemonic
operation codes, addresses can be represented by symbols; moreover, auxiliary means for program 
organization (use of constants, fixing of start addresses, definition of memory areas) are specified. 


Example: A part of the program in a possible assembly language:

      LDA OP1
      MOV B,A
      LDA OP2
      ADD B
      JZ ZERO
      ...
ZERO: ...

An operand OP1 is loaded into an arithmetic register (accumulator A), and then moved into a
register B (MOVE!). Another operand OP2 is put into the accumulator and a summation of register B
and accumulator is made (ADD!). When the result is zero, execution is continued at the address
ZERO (JUMP ZERO) etc.
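What these instructions do can be traced with a toy simulation in Python. It is only a sketch under several assumptions that are not taken from any real processor: registers are 8 bits wide so that sums wrap around at 256, ADD sets a zero flag that JZ tests, and the instructions OUT and HLT are invented here so that the effect of the jump becomes visible.

```python
def run(program, memory):
    """Execute a list of (label, mnemonic, operand) triples."""
    labels = {label: i for i, (label, _, _) in enumerate(program) if label}
    reg = {"A": 0, "B": 0}      # accumulator A and register B
    zero = False                # zero flag, set by ADD (assumption)
    output = []
    pc = 0                      # index of the next instruction
    while pc < len(program):
        _, op, arg = program[pc]
        pc += 1
        if op == "LDA":         # load an operand into the accumulator
            reg["A"] = memory[arg]
        elif op == "MOV":       # MOV B,A copies A into B
            dst, src = arg.split(",")
            reg[dst] = reg[src]
        elif op == "ADD":       # add a register to the accumulator (8 bits)
            reg["A"] = (reg["A"] + reg[arg]) % 256
            zero = reg["A"] == 0
        elif op == "JZ":        # continue at the label if the result was zero
            if zero:
                pc = labels[arg]
        elif op == "OUT":       # hypothetical: record a message
            output.append(arg)
        elif op == "HLT":       # hypothetical: stop the program
            break
        else:
            raise ValueError(f"unknown instruction {op}")
    return reg["A"], output


PROGRAM = [
    (None, "LDA", "OP1"),
    (None, "MOV", "B,A"),
    (None, "LDA", "OP2"),
    (None, "ADD", "B"),
    (None, "JZ", "ZERO"),
    (None, "OUT", "result is not zero"),
    (None, "HLT", None),
    ("ZERO", "OUT", "result is zero"),
]

print(run(PROGRAM, {"OP1": 3, "OP2": 4}))      # (7, ['result is not zero'])
print(run(PROGRAM, {"OP1": 200, "OP2": 56}))   # (0, ['result is zero'])
```

In the second call the sum 256 does not fit into 8 bits, so under the stated assumption the accumulator holds 0 and the jump to ZERO is taken.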


This language level also requires extensive knowledge of the function and design of the hardware.
This type of programming is (still) applied when time-critical sequences must be executed
as efficiently as possible. Higher-level programming languages frequently offer the possibility to
include parts which have been written in assembly language.

It is important for the user that a program written in assembly language is converted into
machine language automatically by the system. For this purpose, a translation
program (a compiler, in this case called an assembler) which transforms the symbolic representation into
machine language should be available in the operating system or at least be loadable. The translation
is done before the actual running of the program. If a program is required frequently, it is
advisable to store both the assembly text (for possible corrections) and the program in machine
language.

The use of higher-level programming languages is the “‘most agreeable” form of programming 
for the user. 

Here the range is extremely wide and is continuously extended. At present, mainly BASIC and 
PASCAL are of importance for microcomputers. Other interesting languages, mainly those with 
methodical and didactic intentions, must be reserved for more special representations. 

The difference from the level of the assembly language is obvious when the instruction sequence
given in assembly language in the last example is expressed in PASCAL and BASIC.




Example: Addition of two numbers and checking the result

PASCAL:

A := OP1 + OP2;
if A = 0 then goto 10;
...
10: ...;

BASIC:

10 LET A = OP1 + OP2
20 if A = 0 then goto 100
...
100 ...


The names occurring here (A, OP1, OP2) stand for the contents of certain memory
locations; the operational symbol + designates the addition; the symbol := or LET ... = designates
the destination of the result. The meaning of the second instruction follows directly from
the translation of the English keywords


if A = 0, then goto ... 


A program text in such a form can be handled much more easily by the user than machine and 
assembly languages and should always be the starting point for one’s own software efforts. The 
transformation of higher-level programming languages to the machine level is possible by means 
of compilers and interpreters. 

An interpreter processes the original text instruction by instruction.
It specifies an equivalent sequence of machine instructions for each instruction, executes them and
processes the next instruction. A compiler processes the complete program text and establishes a
complete machine program which can be used completely independently of the source text.

[Figure: Compiling operating mode. Problem program (Instruction 1, Instruction 2, Instruction 3, ...)
-> Machine program for the whole problem program -> Execution of the complete machine program]

[Figure: Interpreting operating mode. Problem program (Instruction 1, Instruction 2, Instruction 3, ...)
-> Machine program for one instruction -> Execution of the machine program for one instruction]


Generally, compilers are used for PASCAL. Both compilers and interpreters are available for
BASIC; on microcomputers, however, preference is given to interpreters.
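The difference between the two operating modes can be sketched in Python for a deliberately tiny language. Everything here is an assumption made for illustration: the language knows only `LET name = expression` and `PRINT name`, and a Python function stands in for the "machine program" of one instruction.

```python
def translate(line):
    """Translate one source line into an executable step (a Python function)."""
    words = line.split()
    if len(words) >= 4 and words[0] == "LET" and words[2] == "=":
        name = words[1]
        code = compile(" ".join(words[3:]), "<line>", "eval")

        def step(env, out):
            # Evaluate the right-hand side using the variables known so far.
            env[name] = eval(code, {"__builtins__": {}}, env)

        return step
    if len(words) == 2 and words[0] == "PRINT":
        name = words[1]

        def step(env, out):
            out.append(env[name])

        return step
    raise SyntaxError(f"cannot translate: {line!r}")


def interpret(source):
    """Interpreting mode: translate one line, execute it, go to the next."""
    env, out = {}, []
    for line in source:
        translate(line)(env, out)
    return out


def compile_program(source):
    """Compiling mode: translate all lines first, return a runnable program."""
    steps = [translate(line) for line in source]

    def machine_program():
        env, out = {}, []
        for step in steps:
            step(env, out)
        return out

    return machine_program


source = ["LET A = 3", "LET B = A + 4", "PRINT B"]
print(interpret(source))             # [7]

program = compile_program(source)
del source                           # the source text is no longer needed
print(program())                     # [7]
```

In the interpreting mode the source text must be present at every run, whereas the compiled program can be executed repeatedly without it. A further practical difference in this sketch is that the compiling mode reports an untranslatable line before anything has been executed.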

Thus another important system component is given which can be considered a part of the operating
system. For example, the BASIC interpreter can be available as software (loadable from cassette or
floppy disk) or on an (additional) plug-in ROM. When working with a certain program, you proceed
from the system level to the language level.

Whatever programming language is chosen, it becomes apparent after some more
or less successful attempts that it is absolutely necessary to proceed systematically, carefully, and in
a disciplined and deliberate manner in the program design in order to succeed as quickly and safely
as possible. For this purpose the whole process of problem solution, the design of the program text,
is subdivided into single operational steps which should be executed completely and correctly. Any
error, omission or negligence within a step frequently has serious negative consequences for the
subsequent stages.

The following should be performed in the correct sequence: 


1. Problem. The task must be clearly defined without any misinterpretation; both the aims to 
be achieved and all conditions and prerequisites should be established. 

2. Modelling. The problem is studied thoroughly enough to yield starting points for investigating
it and for solving it. Very often these will be formulae for solutions (for
mathematical problems), but search methods, specifications for text and image configuration
etc. are also suitable. Moreover, data types and structures which are adapted to the problem must be
selected. They play a major role in the effective solution of a problem. Of course, the availability of
the respective structures in the case in question must already be taken into account (knowledge of
the corresponding language version!).

3. Description of the input and output data. The following attributes for all input and output data 
must be defined: 


Identifier: Name for the data object used in the program
Meaning: Description of the object which is represented by the data object
Unit of measurement: Unit of measurement of the object
Type: Type of the data object
Structure: Structure of the data object


Example: Data object for performance

Identifier    Meaning        Unit of measurement    Type           Structure
P             performance    kWh                    real number    single variable


4. Organization of input and output. Detailed statements on the sequence of input and the design 
of the output must be made: 


Input:
a) Fixing of the objects for input,
b) Representation of the values,
c) Sequence of the data objects.

Output:
a) Fixing of the objects for output,
b) Configuration of the (printed or screen) image,
c) Sequence of the data objects.


5. Program and data design. After steps 1 to 4 have been worked out, the actual program design can be
carried out using a top-down strategy. By this is meant the progression from a verbal description of
the solution in outline through various refinement steps to the finished program text. The basic
idea is to decompose complex steps (operations, tasks) into a number of "smaller" partial
steps (operations, tasks) whose execution as a whole is functionally equivalent to the subdivided
problem.

It is important to select the subdivisions (refinements) in such a way that the functional equivalence
is not lost, while at the same time approaching step by step the terms of the
target language. An idea of the available elementary operations and functions of the target language
is necessary; the more exact and complete the knowledge is, the more effectively programming can
be performed. If not all possibilities of the language are known, opportunities will certainly be missed,
and if the language is known only insufficiently, the task may not be solved at all. The
resemblance to the degree of mastery of a foreign language is not accidental and illustrates these
problems very impressively.

Elements of BASIC programming. In this section, important elements of the programming language
BASIC (Beginner's All-purpose Symbolic Instruction Code) are shown. It has
become a dominating language on microcomputers and has been, since the sixties, one of the most
popular languages. Its main advantage is that the most important terms can be learned easily, which
enables beginners in particular to write their own small programs "quickly". Furthermore, it can be
extended easily; even operations which are oriented to the colours of the screen, the generation of
sounds, the hardware and especially to backing memories can be easily included in the language.

At the same time this advantage is a great disadvantage since it has resulted in a great number of
"dialects" which are incompatible. As in other programming languages, increased standardization
efforts are being made for the BASIC language.

A main problem when using BASIC for larger problems is that structured programming
is neither assisted nor enforced by the language. A careful study of the methodology of programming
and its consistent, voluntary application is required.

Data and data type. The basis for understanding programming languages and their possibilities 
is the use of names (designations) which can be chosen by the user and allow access to memory 
locations. A memory location is reserved for each name chosen without the necessity to care for 
the specific computer-internal realization. The content of the memory location is accessible exactly 
via this name. 

All names occurring in a program can be considered as being collected to form a storage block.

A distinction is made between constants which remain permanently unchanged during the execution 
of this program and variables which can (perpetually) be changed. 


Example: A storage block of names

Sum    Pi      N      I    St
       3.14    100

Five names are used; Pi and N remain constant, while Sum, I and St are variables.




A certain data type is connected with each of these variables and, with it, the values admissible
for the respective variable. Important numerical data in BASIC are integer or real data. Integer
data are whole numbers which must lie within a certain range of values and can be used in the
normal way. Real data are used for real numbers with a finite number of places (i.e. strictly only for
rational numbers) and are also subject to certain specifications (they must be represented in a certain
way, be within an interval, ...).

A certain number of admissible operations (operators) belongs to each data type. They are partly
self-evident and must partly be tried out by the user when studying the respective BASIC manual.
Important operations for numerical data are addition, subtraction, division, multiplication, raising
to a power and operations for the comparison of quantities. Operators always act on data and
link them to a certain result. To make the use of the operators unambiguous and to reduce the need
for brackets in the formulation of operation sequences, a certain order for the use of the operators is
specified. The following specification is applicable for arithmetic operators:

1. Execution of raising to a power, 
2. Execution of multiplication or division, 
3. Execution of addition or subtraction. 

Operations of the same kind and of the same stage are processed from left to right in their order 
of occurrence. 

Operands and operators are combined into well-formed terms and can be used accordingly.

Storage locations correspond to the variables occurring in a term; at the time of execution
of the program line they must in any case (!) contain a value. Many program errors can be attributed
to the missing initialization of variables.


Example: Assume that the following BASIC line is given:

X1 = (A*A - 2*A*B + B*B)/(A*A + B*B)

First, the term right of the equality symbol is considered, and evaluated in the following steps:

1. A*A
2. 2*A*B
3. B*B
4. A*A - 2*A*B
5. (A*A - 2*A*B + B*B)
6. A*A
7. B*B
8. (A*A + B*B)
9. (A*A - 2*A*B + B*B)/(A*A + B*B)
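The order of evaluation can be retraced in Python, whose rules for +, -, * and / agree with those stated above. The values chosen for A and B are arbitrary. One difference should be noted: Python evaluates repeated powers from right to left, so the left-to-right rule for operations of the same stage must be written with brackets there.

```python
A, B = 3.0, 1.0      # both variables must contain a value beforehand

step1 = A * A
step2 = 2 * A * B
step3 = B * B
step4 = step1 - step2
step5 = step4 + step3            # numerator
step6 = A * A
step7 = B * B
step8 = step6 + step7            # denominator
step9 = step5 / step8

X1 = (A * A - 2 * A * B + B * B) / (A * A + B * B)
print(step9, X1)                 # 0.4 0.4

# Operations of the same stage are processed from left to right:
print(20 / 5 * 2)                # 8.0, i.e. (20 / 5) * 2
print(10 - 4 - 3)                # 3, i.e. (10 - 4) - 3

# Powers: left to right as in the rule above requires brackets in Python.
print((2 ** 3) ** 2)             # 64
print(2 ** 3 ** 2)               # 512, Python's own right-to-left order
```

The step-by-step value and the value of the complete term agree, as they must.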


The function of the equality symbol is noteworthy for the variables to the left of it. An assignment 
statement is characterized by the equality symbol: the value of the term to the right of it becomes 
the value of the left variables. In the last example, the value which has been calculated by means 
of steps 1. to 9. is transported to place X1 and filed there (under this name). 


SUM = 0
PI = 3.14
N = 100

are also simple assignment statements in this sense. The evaluation of the constants directly
results in the indicated value and this value is stored. In this way, the initialization of variables
can be carried out.


The combination of data into vectors, matrices, ..., which are jointly called fields, is very easy
to use and handy. A special instruction DIM is used for the assignment of storage locations.


Example: The definitions of fields 

DIM A(3, 3) 

DIM B(10) 

DIM C(2, 5) 
give rise to the assignment of a matrix A with 
3 rows and 3 columns, of a vector B with 
10 components and of a matrix C with 2 rows 
and 5 columns. 
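For comparison, a possible counterpart in Python is sketched below. The helper `dim` is invented for this illustration; Python counts components from 0, whereas the first index of a BASIC field depends on the dialect.

```python
def dim(*sizes, fill=0.0):
    """Create a field with the given number of components per dimension."""
    if len(sizes) == 1:
        return [fill] * sizes[0]
    # Build every row separately so that the rows do not share storage.
    return [dim(*sizes[1:], fill=fill) for _ in range(sizes[0])]


A = dim(3, 3)     # matrix with 3 rows and 3 columns
B = dim(10)       # vector with 10 components
C = dim(2, 5)     # matrix with 2 rows and 5 columns

A[0][2] = 1.5     # first row, third column
print(len(A), len(A[0]), len(B), len(C), len(C[0]))   # 3 3 10 2 5
print(A[1][2])    # 0.0, the other rows are unaffected
```

As with DIM, storage for all components is reserved at once and every component has a defined initial value.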




The use of strings as data types is very typical of BASIC. By this are meant sequences of characters
enclosed in quotation marks. Variables for strings are specially marked by appending the
character $ to their names.

Example:

A$ = "BASIC"
B$ = "DESCRIPTION"

gives rise to the assignment of the indicated sequence of characters to the corresponding variables.
A typical operation for strings is their chaining (concatenation):

C$ = A$ + B$

results in the value "BASICDESCRIPTION" for C$.


The operations available should be carefully studied and used in their entirety. 

The use of predefined functions and the definition of one's own functions are very convenient and
expressive. While the latter requires some practice and knowledge (and can be implemented by
the DEF FN instruction), predefined functions can simply be used with their name and the
corresponding arguments.


Example:

X1 = -P/2 + SQR (P*P/4 - Q)
X2 = -P/2 - SQR (P*P/4 - Q)

These two instructions calculate the roots of a quadratic equation $x^2 + px + q = 0$ in case
$p^2/4 - q \ge 0$. SQR designates the square root function; the argument follows in brackets and is an
arithmetic term. The programmer himself must take care that $p^2/4 - q$ is actually not negative.
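The check that is left to the programmer can be made explicit. The following Python sketch assumes the equation in the form $x^2 + px + q = 0$; the function name `roots` and the decision to raise an error for a negative argument are choices made for this illustration.

```python
import math


def roots(p, q):
    """Real roots of x**2 + p*x + q = 0, provided p*p/4 - q is not negative."""
    d = p * p / 4 - q
    if d < 0:
        raise ValueError("p*p/4 - q is negative: no real roots")
    r = math.sqrt(d)
    return -p / 2 + r, -p / 2 - r


x1, x2 = roots(-3, 2)        # x**2 - 3x + 2 = (x - 1)(x - 2)
print(x1, x2)                # 2.0 1.0
```

Without such a check the square root function would be called with an inadmissible argument and the program would stop with an error message.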


The BASIC system can also be used as an (expensive, comfortable) pocket calculator. Strictly
speaking, this means that in the command mode (direct mode) each instruction is executed
immediately after it has been terminated (by a special key which is frequently designated ENTER).

PRINT SQR (5) ENTER
PRINT SIN (PI/2) ENTER
PRINT 3*5*7*9 ENTER


results in the corresponding value being calculated and shown on the screen (PRINT) after each 
ENTER. This possibility will, of course, be used only seldom. The common application is the 
program mode. 

These are the specifications to be observed: 


1. The text of a BASIC program consists of a sequence of lines. 

2. Each line starts with a line number, followed by the actual instruction; no number can be used
twice.

3. The lines are executed in ascending order of the line numbers chosen.


Numbering the lines in steps of ten is useful for programming. If lines are added later on,
there is a reserve of 9 line numbers in each gap, which is frequently sufficient for this purpose.
Afterwards the required order (in steps of ten) can be restored automatically by a special
instruction (RENUMBER).
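The role of the line numbers can be sketched in Python by keeping the program text in a table from line number to instruction. The functions `listing` and `renumber` are illustrative only, and the renumbering is simplified: line numbers that occur inside an instruction (for example after GOTO) are not adjusted here.

```python
def listing(program):
    """Program lines in ascending order of their line numbers."""
    return [f"{number} {text}" for number, text in sorted(program.items())]


def renumber(program, start=10, step=10):
    """Assign the numbers start, start + step, ... while keeping the order."""
    return {start + k * step: program[old]
            for k, old in enumerate(sorted(program))}


program = {10: "LET A = 1", 20: "PRINT A", 30: "END"}
program[15] = "LET A = A + 1"     # a line added later fits into the gap

print(listing(program))
# ['10 LET A = 1', '15 LET A = A + 1', '20 PRINT A', '30 END']

print(listing(renumber(program)))
# ['10 LET A = 1', '20 LET A = A + 1', '30 PRINT A', '40 END']
```

The position of a line in the program is determined by its number alone, not by the order in which the lines were typed in.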

For the program text, imagine a storage block divided into squares in which each square contains 
a BASIC instruction. The interpreting mode already described opens a square (in accordance with 
the numbering) and realizes the specification contained in it (by access to the memory of variables, 
to the peripheral devices, ...). 

The user must consider the basic principles shown and study the BASIC description thoroughly
to understand the mode of action of all operations available. It is then possible to understand written
programs, to work through them manually (to interpret them) and to move on to writing one's
own programs. The following example demonstrates the steps required.


Example: The computer shall calculate the mean value and the spread of a series of measured values:

$m = \frac{1}{n} \sum_{i=1}^{n} a_i$, $st = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (m - a_i)^2}$, $n \le 100$.

Rough design of the algorithm               Data
(I) 1. Input request and input of n         n integer
    2. Field input                          field A for 100 real numbers
    3. Computation m                        m real number
    4. Computation st                       st real number
    5. Output m, st




Refined design of the algorithm             Data
(II) 1.1. Print: 'Input: ...'
     1.2. Input: n
     1.3. Initial values S1, S2             S1, S2 real numbers
     2. Repeat for i from 1 to n            i integer control variable
     2.1. Input: a_i
     2.2. Compute: S1 = S1 + a_i
     3. Compute: m = S1/n
     4.1. Repeat for i from 1 to n
          S2 = S2 + (m - a_i)^2
     4.2. Compute: st = SQR(1/(n - 1) * S2)
     5. Print:
     5.1. 'The mean value is: ', m
     5.2. 'The spread is: ', st

BASIC text

10 WINDOW: CLS
20 PRINT "Program mean value and spread"
30 PRINT "============================="
40 DIM A(100): ! Specify a field of 100 elements
50 INPUT "How many values will you input?"; n
60 PRINT "Input the measured values!"
70 Su = 0
80 FOR i = 1 TO n
90 INPUT A(i)
100 Su = Su + A(i)
110 NEXT i
120 m = Su/n
130 Su = 0
140 FOR i = 1 TO n
150 Su = Su + (m - A(i))^2
160 NEXT i
170 st = SQR(Su/(n - 1))
180 PRINT "The mean value is"; m; "."
190 PRINT "The spread was"; st; "."
200 END

Generally, the study of the available language elements can be carried out according to the
following instruction groups:

1. Mathematical operations
2. Assignment operations
3. Logical operations
4. Comparison operations
5. Program loops
6. Mathematical functions
7. Character string functions
8. Input of data
9. Output of data
10. Data file operation (operation with external data carriers)
11. Graphics and screen control
12. Access to hardware
13. Subroutine technology
14. Sundries
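For readers who wish to check results of the example by other means, the same computation is sketched here in Python. It follows the refined design step by step; the function name `mean_and_spread` and the refusal of fewer than two values (where the divisor n - 1 would be zero) are additions made for this illustration.

```python
import math


def mean_and_spread(values):
    """Mean value m and spread st of a series of measured values."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two measured values are required")
    su = 0.0
    for a in values:             # steps 2 and 2.2: sum of the values
        su = su + a
    m = su / n                   # step 3
    su = 0.0
    for a in values:             # step 4.1: sum of squared deviations
        su = su + (m - a) ** 2
    st = math.sqrt(su / (n - 1)) # step 4.2
    return m, st


m, st = mean_and_spread([1, 2, 3, 4, 5])
print("The mean value is", m)    # 3.0
print("The spread is", st)       # the square root of 2.5
```

As in the BASIC text, one auxiliary variable is used for both sums, and the second loop needs the stored values again, which is why they are kept in a field.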


The available language elements of groups 1 to 9 are absolutely necessary. The following table
must be considered the core of a BASIC Standard. It was defined in 1978 by the American National
Standards Institute.


BASIC Standard

Instruction or character    Meaning

"                           Identification of the beginning or end of a character string
$                           Identification of a variable as character string variable
.                           Decimal point in REAL numbers
E                           Identification of the exponent in REAL numbers (e.g. 3.55 E-6 for the number $3.55 \cdot 10^{-6} = 0.00000355$)
:                           Separation of instructions in a line
END                         Identification of the physical end of the program
REM                         Start of comment line; the text up to the end of the line has only an explanatory meaning and is omitted in the program execution
DATA                        Definition of a sequence of constants which can be assigned to certain variables by the READ instruction
INPUT                       Identification of a data input (from the console)
READ                        Reading of data from a DATA instruction, assignment of the values to certain variables
RESTORE                     Setting of a pointer to the first element of the first DATA list; READ instructions always read the value to which the pointer points; after each reading operation the pointer is advanced
?                           Abbreviation of PRINT
PRINT                       Display of data on the screen; in this connection format controls are possible (PRINT A; B; C or PRINT A, B, C, PRINT AT, PRINT USING, PRINT #, PRINT # USING)
,                           Separator in data lists; in PRINT instructions the output is advanced by one tabulator step
;                           The next output is immediately connected to the previous output if there is a semicolon between the variables to be output
TAB                         The output is performed at the point indicated (TAB (10) effects the output at position 10)
*                           Character for multiplication
+                           Character for addition
-                           Character for subtraction
/                           Character for division
^                           Character for raising to a power
<                           Character for "less than"; (A < B) checks whether the relation "less than" is applicable (the occurrence of A < B is answered with "Yes" or "No" at a certain point)
<=                          Character for "less than or equal to"
<>                          Character for "not equal to"
>                           Character for "greater than"
>=                          Character for "greater than or equal to"
=                           Character for assignment statement
LET                         Identification of an assignment (can be omitted in most cases; LET A = B corresponds to A = B)
IF ... LET ...              Conditional assignment; the assignment behind LET is realized if the condition behind IF is met
ABS (x)                     $|x|$
ATN (x)                     $\arctan x$
COS (x)                     $\cos x$
EXP (x)                     $e^x$
INT (x)                     $[x]$, the greatest integer $\le x$
LOG (x)                     $\log x$
RANDOMIZE                   Fixing of a start value for the generation of a random number sequence with the RND instruction
RND                         Generation of a (pseudo) random number in the range between 0 and 1
SIN (x)                     $\sin x$
SQR (x)                     $\sqrt{x}$
TAN (x)                     $\tan x$
DEF FN ...                  Definition of a user-owned function which can be called later on with FN ...
FEND or FNEND               Identification of the end of the functional definition
LEN                         Determination of the number of characters in a character string
GOTO ...                    Jump to the indicated line number
IF ... THEN ... ELSE ...    Branching in the program; if the condition after IF has been met, the instruction behind THEN is executed, otherwise the instruction behind ELSE. ELSE and the subsequent instruction can be missing; then a jump to the next line is made
IF ... GOTO ...             Conditional jump
BASE                        Defines whether counting starts at 0 or 1 in indices of fields
DIM                         Definition of fields
OPTION BASE                 See BASE
CALL ...                    Branching to a machine program starting at a certain address
GOSUB                       Branching to a subroutine; execution of this program until a RETURN instruction occurs; this subroutine is then terminated and a jump to the line following GOSUB is made
RETURN                      Leaving a subroutine, return to the (calling) main routine
FOR ... TO ... STEP ...     Running instruction; the assignment statement which is written behind FOR assigns an initial value to
NEXT ...                    a running variable; the final value is written behind TO; all instructions up to NEXT are repeated for each value of the running variable between the initial and the final value; the running variable is always increased by 1 if STEP has been omitted; if another step size is required, it can be indicated behind STEP.
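The pointer mechanism described for DATA, READ and RESTORE can be made concrete by a small Python sketch. The class `DataList` and its method names are invented for this illustration and only model the behaviour stated in the table.

```python
class DataList:
    """Constants of the DATA instructions together with a read pointer."""

    def __init__(self, *constants):
        self._constants = list(constants)
        self._pointer = 0            # points to the next value to be read

    def read(self):
        """READ: deliver the value at the pointer, then advance the pointer."""
        value = self._constants[self._pointer]
        self._pointer += 1
        return value

    def restore(self):
        """RESTORE: set the pointer back to the first element."""
        self._pointer = 0


data = DataList(3.14, 100, "BASIC")
pi = data.read()
n = data.read()
data.restore()
first_again = data.read()
print(pi, n, first_again)            # 3.14 100 3.14
```

Reading beyond the last constant is an error in this sketch (an IndexError is raised); the corresponding behaviour of a particular BASIC system has to be taken from its manual.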




Other instructions require more profound knowledge and should be carefully studied. In particular,
hardware and graphics-oriented instructions can only be used efficiently if you know their purpose,
and then they are a great attraction. Finally, it should be noted that every good BASIC system
includes not only a compiler or interpreter but represents a complete programming system. While
the instructions described are means of expression which effect certain actions within a program,
further commands operate at a higher level: they effectively support the process of programming
and the handling of the program in interactive communication with the system.

The following functions which can be controlled by commands are available: 


— Input of programs (CLOAD, ...), 

— Correction of programs (BREAK, EDIT, ...), 

— Display of programs (LIST, ...), 

— Execution of programs (RUN, ...), 

— Operation with secondary data memories (CLOAD, CSAVE, ...), 
— Support of program testing (TRACE, ...), 

— Indication and alteration of memory mapping (TOP, ...), 

— Erasure of the main memory (ERASE, NEW, ...). 


Finally, there is an instruction (BYE) by which you can leave the BASIC system and return to 
the operating system. 


Subject index 


A 


a, are (unit of area) 164 
abbreviated calculation 36 
Abel, theorem on series 483 
Abelian group 342, 344f. 
abscissa 283 

absolute convergence 394 
— error 608 

— geometry 713 

— polarity 716 

— rational number 71 f., 74 
— term 86, 92, 97 

— value 73, 78 

absorption rule in lattice 678 
abstract group 346 

— space 705 

abstraction 11


accumulation point of sequence 


387 f. 

accuracy problem in measure- 
ments 631 f. 

— test by congruences 26 

activity 691 

— -oriented network 691 

actual infinite 719 

— value 139 

acute angle 148f. 

— triangle 155 

Adam’s method 637 

addition, abbreviated 36 

— of algebraic sums 41f. 

— — convergent series 394 

— — fractions 31f. 

— — integers 21f., 70 

— law in probability theory 
579f. 

— in linear space 706 

— of matrices 373 

— method 89 

— of order types 330 

— — rational numbers 31 

— system 19 

— theorem for binomial 
coefficients 484 

— theorems of trigonometric 
functions 233f. 

— of vectors 363 

additive number theory 673 

adherent to set of points 684 

adjacent angles 150 

— side in trigonometry 222 

adjoint transformation 372 

adjunction of root 355 

adjustment of data 614-624 

— — relations 621 ff. 

aeq function 333 ff. 

affine connection of manifold 
574 

— differential geometry 572 


affine distortion 544f. 

— parallel coordinates 551

— transformation 534f., 572 

affinity, perspective 208 f. 

Aitken’s interpolation 635 

algebra, fundamental theorem 
of 101, 528 

—, Hilbert 680 

—, linear 356ff. 

—, multilinear 380 

—, vector 362 ff. 

—s 679f. 

algebraic complement 361 

— curve 676 

— equation 80ff., 351 ff. 

— —, product representation 
101, 72/7, 351 

— extension 351 f. 

— geometry 675f. 

— inequality 103f. 

— number 673f. 

— structure 679f. 

— sum 28f., 40 ff. 

— surface 676 

— variety 675f. 

algebraically closed domain 80

algorithm 14, 340, 719 

—, Euclid’s 25, 46 

—, linear equation 85 

—, non-numerical 342 

— of Gauss 358f. 

alidade 253 

almost all 384 

alternate angles 151 f. 

alternating cross sum 26 

— current 499 

— group 344 

— plane 207 

— sequence 381 

— series 393 

alternative of Fredholm 704 

— hypothesis 601

altitude (astronomy) 277 

— theorem 167, 170 

— of triangle 158f., 271 f. 

always convergent 482 

amplitude, complex number 78 

—, polar coordinate 284f. 

— of sine 236 

— type of quadrature formula 
636 

anaglyph method 220 

analysis, combinatorial 575f. 

—, complex 80, 517-529 

—, functional 705 ff. 

—, numerical 630ff. 

—, vector 475f. 

analytic continuation 527 

— geometry 15, 282-319, 
530-547 


analytic number theory 672f. 

angle 148 ff., 159f., 186, 366 

—, direction angle of section 
288 f. 

—, exterior 151, 155f., 159f. 

—, interior 151, 155f., 159f. 

—, right 148, 242 

—, trisection of 154, 356 

— between lines 538f. 

— — planes 541f. 

— — vectors 366f. 

— bisector 159, 177f., 295 

— of intersection 293f., 302, 
310f. 

— — parallax 254 

— -preserving mapping 524 

— -— projection 255

— in Rⁿ 369

— on solids 186 

angles, Euler 535 

angular frequency 236f. 

annuity 143f. 

annulus, area of 174f. 

anomaly, eccentric 317 

anticommutativity 366 

antilogarithm 65 

antisymmetric relation 323 

apex 155 

Apollonius, theorem of 297 

applied mathematics 14 

approximate value 608 

approximation, binomial 493 f. 

—, parabolic 628 

—, variation 702 

— of continued fraction 76 

— by least squares 626 

— method of variation 702 

— by polynomial 626 

— of real number 74f. 

— theory 624ff. 

— —, functional analysis 711 

arbelos, area of 174f. 

arc (unit) 149 

—, element of in surface 567

— of circle 171 

— function 133 

— length 468f., 482, 563 

Arccos, arc function 133f., 230 

Arccot, arc function 134, 230 

Archimedean axiom 71, 73 

— order 72 

— solid 197 

— spiral 443 

architect’s arrangement 217 

Arcsin, arc function 133f., 230 

Arctan, arc function 134, 230 

are, a (unit of area) 164 

area (unit of area) 164 

—, element of 567




area, measurement of 164f., 
174f., 295f. 

—, transformation of 167f. 

—, units of 164 

— of circle 174f. 

— — ellipse 181, 452 

— — parallelogram 165, 366 

— — polygon 166, 296 

— problem 444 

— of quadrilateral 250 

— — sector 174

— — segment 174

— — spherical triangle 263

— — trapezium 452

— — triangle 165, 249, 295f.

argument, complex number 78 

—, polar coordinate 284 

— of function 108 

— type of quadrature formula 

636 

Aries, first point of 278 

arithmetic mean 106, 382 

— sequence 382 

— series 389f. 

arithmetical function 325 

— operation 50f., 56, 71

— — on intervals 633 

arithmetization of language 721 

arity 334 

Ars iudicandi 341 

— inveniendi 341 

ASCII 748 

assignment statement 752 

associative law of addition in
C 78

— — — — in N 20, 70, 78

— — — — in Q 40

— — — — in R 78

— — — — in vector space

— — — — in Z 28

associative law of multipli-
cation in C 78

— — — — in N 22, 70

— — — — in Q 40, 72

— — — — in R 78

— — — — in vector space

— — — — in Z 29

— — for set operations 322 

associativity in lattice 678 

asymmetric relation 323 

asymptote of hyperbola 181f., 
307 

— — rational function 127 

asymptotic cone 547

— expansion 625 

— point 443 

— representation 625f. 

atomic expression 334 

automorphism 347 

axial moment of inertia 474 

— symmetry 152, 226 

axiom 12 


axiom, Archimedean 71, 73 

—, Hausdorff’s 686 

— of choice 321 

— — field 349 

— — group 344 

— — parallel 12, 712 

—s of betweenness 713 

—s — congruence 713 

—s — continuity 713 

—s — incidence 713 

—s for metric 706f. 

—s of motion 714 

—s — order 713 

—s — Peano 13, 70, 335 

axiom system 13 

— —, probability theory 582 

axiomatic characterization of 
Euclidean geometry 712f. 

— definition of probability 
581 f. 

— system of set theory 321 

— theories, incompleteness of 
720f. 

axiomatizable theory 342 

axis of affinity 208f. 

— — conic 179ff., 303 ff. 

— — quadric 543 ff. 

— — symmetry 152 

axonometry 214 ff. 

azimuth 256, 277 


B 


backward section 259

b-adic system 19 

Banach fixed point theorem 711 

barometric height formula 59, 
506 

base of number system 19 

BASIC 751 ff. 

— standard 754f. 

basic points for approximation 
626 

— solution, feasible 655 

basis of induction 70 

— — logarithmic system 57f. 

— — power 48 

— — vector space 349, 367 

bearing 273 

Bellman’s principle of opti- 
mality 665 

bell-shaped curve 592 

belt theorem, Crofton’s 575 

bench mark system 257f. 

bending of surface 566ff. 

Bernoulli and L’Hospital, rule 
of 400f. 

— differential equation 507f. 

— numbers 487 

—’s inequality 106, 387 

Bernstein’s theorem 327 

Bessel function 513 

—’s differential equation 513 


betweenness, axioms of 713 

bilinear form 380 

— function 369 

— operation 365 

billion 19 

binary system 19 

binomial 40 

— approximation 493f. 

— coefficients 43 

— —, addition theorem 484 

— congruence 671 

— distribution 586f. 

— formulae 42 

— integral 462 

— series 490f., 493f. 

— theorem 43 

binormal to curve 563f. 

biometry 607 

biquadratic equation 101 

bird’s-eye view 215 

bisection 153 

bisector of angle 177f. 

—, perpendicular 153, 157, 177

biunique relation 323 

body, convex 574 

—, planar 186f. 

Bolzano-Weierstrass theorem 
388 

Bolzano’s theorem 405 

Borel field of events 581 

bound for error 608 

boundary condition 515 

— point of figure 684 

— value, differential equation 
515 

— — problem 697 

bounded function 112 

— linear operator 709 

— sequence 382 

bounding value, method of 610 

brachistochrone problem 699

brackets, dissolution of 41 

Braille 577 

Brianchon hexagon 561 

Brianchon’s theorem 560f. 

bridge of graph 690 

Brouwer’s fixed point theorem 
683, 685 

buffer time 692 

Buffon’s needle problem 574 

bundle of lines 185, 539 

— — planes 186 

Bunyakovskii inequality 368 

byte 747 


C 


C, complex number 13, 77f., 
517ff. 

calculable function 340 

calculating disc 69 

calculator, pocket 732 ff. 

—, programmable 742f. 


calculation with limits 385, 398

— of resources 692 

calculus, differential 406-443 

—, integral 443-479 

—, propositional 333 

— of differences 633f. 

— — errors 607-614 

— — reflections 714 

— — variations 698-702 

cancellation 30f., 45f. 

canonical homomorphism 347 

Cantor universe 718 

cap, spherical 198 

capacity 187 

— of supply 658 

Cardano’s formula 98f. 

cardinal number 17, 327f. 

—s, well-ordering of 331 f.

cardinality 13 

cardioid 440 

carrier of pencil 147 

Cartesian coordinates 283f., 
530 

— normal form of equation of 
line 289 

— product 325, 328 

Cassinian ovals 438 

casus irreducibilis 98 

catenary 436 

Cauchy-Hadamard formula 483 

— principal value 522 

— -Riemann differential equa- 
tions 523 

— -Schwarz inequality 106, 
368 f. 

— sequence 75, 707 

— test for convergence 387 

—’s form of remainder 488 

—’s inequality 708

—’s integral formula 520 

—’s — theorem 520 

cavalier perspective 215 

Cavalieri’s principle 191f., 
467f. 

cavity (topology) 681 

Cayley’s theorem 348 

celestial equator 277f. 

— meridian 276f. 

— pole 277f. 

centimetre 147 

central collineation 555f. 

— divided differences 634f. 

— equation of ellipse 305f.

— — — hyperbola 307 

— image 203f. 

— limit theorem 593 

— perspective 216 ff. 

— projection 203f., 216f., 
547f., 556 

— surface 544 ff. 

— symmetry 152, 226 

centre of curvature 436 

— — gravity 158, 297 

— — mass, coordinates of 473f. 


centre of perspective 
217f., 302f. 


— — projection 203f., 217f.,
302f., 548
— — symmetry 152 
certain event 578f. 
Ceva, theorem of 298 
chain, Markov 594 
—, Sturm 123 
—, triangular 257 


— rule in differentiation 414f. 


— — of logarithms 58 


change of coordinates 284f., 


377f., 532f. 
channel (topology) 681 


characteristic equation 510, 647 


— function of subset 328 
— of logarithm 58, 61 


— — perspective affinity 208 


— polynomial 379 


— rules of vector space 362 
— system of partial differential 


equations 693 


— value of linear transforma- 


tion 378 
chart, control 606 
— of intercept 651 


Chebyshev’s inequality 585f. 
— law of large numbers 586 


χ²-distribution 602f., 605
(table) 

choice, axiom of 321 

chord of circle 171 ff., 242 

— — sphere 198 

— theorem 175f. 

Church’s hypothesis 341 

circle 157, 159, 171 ff., 242, 
248, 299f. 

—, equation of 299f. 

—, involute of 437

—, normal to 300 

—, spherical 262, 269

—, squaring of 154, 356 

—, tangent to 171f., 301 

—, topological 689 

— of curvature 435 

— — latitude 571 

circuit, Hamilton 690 

circular cone 193 

— cylinder 190 

— function 133, 230ff. 

— —, derivative of 419


circulation of vector field 478 


circumcentre 157 
circumference of circle 171 
— — ellipse 482 
circumpolar stars 276 


circumscribed circle of ellipse 


181 
— — — n-gon 162 
— — — triangle 157 
cissoid 441
class, ideal 674 
— of combination 577 




class of structures 339 

—es — partition 324 

—es — ordinals 330f. 

closed path of graph 689 

— set of points 684 

cm, centimetre 147 

CNW (combination network) 
692 

Codazzi-Mainardi formulae 569 

codification of data 342 

— — language 721 

coefficient 350 

—, binomial 43 

—, undetermined 130, 460, 
486 

— of correlation 600 

— — regression 599, 623 

— — superposition 633 

cofactor 361, 375 

cofunction 227 

coincidence plane 206 

collinear points 147 

collinearity 713 

collineation 555f. 

—, perspective 218 f. 

— nomogram 650 

column of matrix 360, 373

combination 577f. 

— of mappings 326 

— network (CNW) 692 

combinatorial analysis 575f. 

— structure 689 

commensurability 169 

common cycloid 439f. 

— epicycloid 439 

— fraction 12, 31f., 34 

— logarithm 58, 60ff. 

commutative group 344 

— law of addition in C 78

— — — — in N 20, 70, 78

— — — — in Q 40

— — — — in R 78

— — — — in vector space

— — — — in Z 28

— law of multiplication in
C 78

— — — — in N 22, 70

— — — — in Q 40, 72

— — — — in R 78

— — — set operation 322 

commutativity in lattice 678 

comparison of mean values 
603f. 

— test for sequences 392 

— of variances 604 

compiler 750 

complementary angle 150 

— event 579 

complete graph 689 

— metric space 707 

— quadrangle 554 

— quadrilateral 554 

— system of events 579 




completeness of Euclidean 
geometry 713 

complex analysis 80, 517-529 

— curvilinear integral 518 

— number 13, 77f., 517ff. 

— partial differentiation 519f. 

— -valued function 517 ff. 

component of force 246, 694 

— — segment in space 535 

— — vector 364f. 

componentwise differentiation 
475 

composite function 111 

composition of mappings 326 

— — operators 709 

compound interest 141f. 

computational symbol 28 

computer 745 

concave polygon 162f. 

— quadrilateral 160 

concepts, definition of 719f. 

conchoid 442 

conclusion 334, 337 

condition of integrability, 
differential equation 508f. 

conditional convergence 395 

— observation 619f. 

— probability 580 

cone 193 

—, asymptotic 547 

—, frustum of 195 

—, generating line of 302 

—, intersection with line 
(descriptive geometry) 211 

—, — — plane 302ff.

conformal geometry 572 

— mapping 524f., 527 

— projection 255 

congruence, axioms of 713 

—, binomial 671 

— in hyperbolic geometry 716 

— of ideals 670 

— — integers 26 

— — triangles 156f. 

conic 178 ff., 302 ff., 556ff. 

—, construction of 559, 561 

—, enveloped by lines 183, 412,
560

—, intersection with line 
309 ff. 

—, normal to 311f. 

—, parameter of 314

—, polar equation 316f. 

—, singular 302, 557 

—, tangent to 309f. 

—, vertex equation 314f. 

— equation, discussion of 319 

— section 178 ff., 302 ff. 

conjugate diameters 180, 209 

— field 352 

— potential functions 698 

— of root 674 

connected graph 689 

— point set 683 


connected relation 323 

connectivity (topology) 680f. 

conoid 202 

conservative vector field 476 

consistent equation 82 

— estimate 600 

— inequality 104f. 

constant, estimation of 622 

constraint 654 

constructability, effective 
(logic) 719 

construction, basic 153f. 

— by ruler and compass 151, 
154, 355f.

content, Peano-Jordan 687 

continued fraction 76f. 

continuity, axioms of 713 

continuous decision process 668 

— figure (topology) 681 

— map 681, 684 

— proportion 38 

— random variable 582 

continuum 328f., 332 

— hypothesis 329, 332, 722 

contour integration 522f. 

— plan 214 

contracted cycloid 439 

— epicycloid 439 

— hypocycloid 440 

— ordinate values 118, 133 

— parabola 118 

contradiction, principle of 334 

contragredient matrix 374 

— transformation 377 

contraposition 334 

control chart 606 

convergence 384 ff., 389ff., 
480ff., 521 

—, disc of 521 

—, tests for 391f. 

— of meridian (astronomy) 
255f. 

— — sequence 384ff. 

— — series 389ff., 480ff. 

convergent iteration 639 

conversion of number systems 
632 

convex body 574 

— function 661 

— linear combination 655 

— optimization 661 

— polygon 162f. 

— polyhedron 574, 655 

— quadrilateral 160 

coordinate system 283, 285f., 
530ff., 549 

coordinates 256, 283f., 364f., 
377, 530ff., 548f., 556f. 

—, transformation of 284f., 
377, 533f. 

— in geometric axiomatics 
715 

— of vector 364f. 


coordination problem (optimi- 
zation) 660 

coprime 25, 46 

correlation 600 

— function 594 

correspondence 107f., 110 

corresponding addition and 
subtraction 39 

— angles 151, 159f. 

cos, cosine 221 

cosec, cosecant 222 

coset of subgroup 346f. 

cosh, hyperbolic cosine 134 

cosh⁻¹, inverse hyperbolic
cosine 135 

cosine, direction 533 ff. 

— curve, half waves of 499 

— integral 491 

— rule 245, 264f. 

cost function 640 

cot, cotangent 221 f. 

coth, hyperbolic cotangent 134 

coth⁻¹, inverse hyperbolic
cotangent 135 

countable set 327f. 

coupling distance 692 

CPM (critical path method) 
692 

Cramer’s rule 361 

critical activity (network) 691 f. 

— path method (CPM) 692 

Crofton’s belt theorem 575 

cross elevation 212 

— product 365f.

— -ratio 297, 550, 716 

— sum 25f. 

— —, alternating 26 

cubature 466f. 

cube (power) 51, 68 

— (solid) 187f. 

— root 52, 68 

cubic content 187 

— equation 97ff., 355 

— function 118 

— graph 690 

— term 97 

cubical parabola 118 

cuboid 187f. 

culmination 276f. 

curl of vector field 478 

cursor of slide rule 66 

curvature, integral 570 

—, line of 571 

— of plane curve 117, 426, 
434 ff. 

— — space curve 564f. 

— — surface 568f., 571 f. 

— vector 564 

curve, algebraic 676 

—, bell-shaped 592 

—, length of 469, 563 

—, rectangular 498 

—, shortest 567

—, simple closed 682f.


curve, triangular 498 

— in plane 109ff., 431 ff., 
437ff., 448 ff., 469 

— of second class 558 

— — — order 558 

— in space 561-565 

— theorem, Jordan’s 682 

curved scale 649 

curvilinear integral, complex 518 

Cusanus, construction of 194 

cusp 434 

cut of rational numbers 75 

cutting plane method (optimi- 
zation) 659 

cyclic group 345 

— quadrilateral 176, 250 

— subgroup 348 

cycloid 439f. 

cycloidal pendulum 511 

cyclotomic equation 163, 674 

cylinder 190 

— functions 513 

cylindrical coordinates 533 

— polar coordinates 466, 
532f. 


D 


damped oscillation 236 

Dandelin sphere 303 

data type 751 

decadic logarithms 57f., 62f. 

— system 19 

decagon, regular 163 

decidable theory 342, 713 

decimal fraction 33f., 74 

— geometric sequence 384 

— system 19 

decision problem 341 

— process, continuous 668 

declination (astronomy) 277f. 

— (surveying) 256 

decomposition into primes 46 

decreasing divided differences 
628, 634 

definability 339 

definite integral 444 ff. 

definitely divergent sequence 
385 

definition (logic) 719f. 

degenerate conic 179, 182,
302, 557 

— quadric 543 

degree (unit of angle) 149 

— of algebraic equation 83 

— — differential equation 501 

— — extension field 349 

— — freedom of distribution 
602f. 

— — meridian 273 

— — polynomial 120 

-decomposition method (opti- 
mization) 658 


Delic problem 154 

deltoid 162 

de Moivre-Laplace theorem 
(probability) 593 

de Moivre’s formula 79 

de Morgan’s rules 322 

denominator 30ff., 55, 76 

density of rational numbers 72 

departure in Gauss-Krüger
coordinates 256 

dependence, linear 90f., 367 

dependent variable 109 

depth line (descriptive geo- 
metry) 205 

derivability 338 

derivative 407—423 

— of integral 420 

Desargues, theorem of 553, 715 

Descartes, folium of 434 

—’ rule of signs 122 

descriptive geometry 15, 
203-220 

design of experiment 595, 632 

determinant 360ff. 

determination of roots 638 

deterministic problem (opti- 
mization) 654 

— process, discrete 664 

developable surface 546 

deviation, mean-square 597 

—, standard 585, 597, 616f. 

diagonal 160ff., 189 

diagonalizable transformation 
378 

diagram of function 108 ff. 

diameter of circle 171 

— — ellipse 179 

— — ellipsoid 544 

— — sphere 198 

—s, conjugate 180, 209 

dichotomic search 641 

die, ideal 579 

diet problem 654 

difference of integers 21 

— — power series 485f.

— quotient 407 

— sequence 117 

— of sets 322 

— table 635 

— of trigonometric functions 
235 

—s, calculus of 633f. 

—s, divided 633 ff. 

differentiable manifold 348 

differential 411f., 421 f. 

— calculus 406-443 

— —, chain rule 414f. 

— —, mean value theorem 411 

— equation, partial 693 

— —, perturbed 511f.

— — of Bernoulli 507f. 

— — — Bessel 513 

— — — Euler 700 

— — — Gauss 513 




differential equation of Hamil- 
ton-Jacobi 693 

— — — Laplace 523f. 

— —s — Cauchy-Riemann
523

— —s, ordinary 500-517

— —s, system of 516f. 

— geometry 15, 561 ff., 572 

— operator 636 

differentiation, complex 519f. 

—, component-wise 475 

—, implicit 417 

—, logarithmic 415 

—, numerical 636 

—, parametric form 416 

—, polar coordinates 417 

—, power series 481 

—, term-by-term 481, 485 

—, vector 475 

digit 19, 36, 609 

—, hidden 735 

digon, spherical 198 

dimension of point set 683 

— — prime ideal 677 

— — vector space 368 

dimetric skew axonometry 215 

Diophantine equation 671 

directed graph 688 

direction, feasible 

— angle 256ff., 288 

— cosine 533ff., 

— field 502f. 

— vector 537, 540f. 

direct measurement 618 

directly proportional 37 

directrix of parabola 183, 303f. 

Dirichlet’s condition (Fourier 
series) 497 

— theorem on primes 673 

— — — units 674 

disc 171 

— of convergence 521 

discontinuity, jump 224, 403 

—, removable 403

— of oscillatory function 404 

— — random events 582 

— — trigonometric function 
223 f. 

discount 142 

discrete deterministic process 
664 

— random variable 582 

discriminant of quadratic 
equation 93 

discussion of curve 431 f. 

displacement, parallel in a sur- 
face 567f. 

distance between points 286, 
685 

— circle in central projection 
204 

— from line 538 

— — plane 541 

distortion 205 




distortion, affine 544f. 
distribution, binomial 586f. 
—, χ² 602f.

—, frequency 596 

—, Gaussian 584, 589f., 598 
—, F- 602f. 

—, line diagram of 573 


—, normal 584, 589f., 598, 614 


—, Poisson 588f. 

—, t- 602f. 

—s, testing of 605 

— function 582f. 

— of test variables 602f. 

distributive law in C 78 

— — — N 23, 70, 78 

— — — Q 40

— — — R 78

— — for set operations 322 

— laws in ring 679 

— — — vector space 362f. 

divergence, theorem of Gauss 
478 

— of vector field 477f. 

divergent iteration 639 

— sequence 385 

— series 389f. 

divided differences 628, 633 ff. 

dividend 22 

divisibility 23, 25f., 669 

division 22 

—, abbreviated 36 

— by algebraic sum 44 

— of complex numbers 77 

— by fraction 33 

— of integers 22, 70, 670 

— — powers 49 

— by power series 486f. 

— of segment 170, 204, 286f., 
536 

divisor 22 

dodecahedron 196 

domain 80f., 108f., 111, 323, 
520f., 684 

—, algebraically closed 80 

—, fundamental 81 

—, multiply-connected 520 

—, simply-connected 520 

— of definition 81 

— — function 108 

— — relation 323 

— — variability 40, 81ff. 

dot product 365f., 368f. 

double compass method 208 

— entry of table 231 

— fraction 33, 46 

— integral 463f. 

— point 434 

— scale 649 

doubling of cube 154, 356 


doubly-periodic function 528f. 


dual geometrical object 553, 
555, 575 

— isomorphism 678 

— lattice 678 


dual parametric problem
(optimization) 660
— space of vector space 709 
duality, principle of 553f. 
— of polyhedra 197 
— in space 555 
— theorem (optimization) 
656f. 
duodecimal system 19 
dyadic system 19 
dynamic optimization 664 


E 


e 58, 132, 387, 492, 675 

earth, form of (geodesy) 272 

eccentric anomaly 317 

eccentricity, linear 179 ff., 
305 ff. 

—, numerical 303 ff. 

ecircle 159 

ecliptic 279 

edge of graph 688 

— -angle 186 

— of solid 186ff. 


effective constructability (logic) 


719 
— estimate 600 
eigenfunction 515, 704 
eigenspace 378 
eigenvalue 378, 515, 704 
— equation 647 
— problem 647, 731f. 
eigenvector 378, 647 
element of arc 469, 567 
— — area 567 
— — determinant 360 


elementarily integrable differential
equation 505ff.
elementary event 581 
— geometry 15 
— language 334ff. 
— number theory 23ff., 46 
— theory 342 
Elements of Euclid 711 f.
elevation 205 
eleven test 26 
elimination, solution by 89, 
358f. 
— of Jordan 644 
ellipse 178 ff., 209f., 304 ff. 
—, arc length of 482 
—, area of 181, 452 
—, axes 179f., 305f. 
—, circle of curvature 180 
—, circumference of 482 
—, curvature 180 
—, degenerate 179 
—, diameter of 179 
—, equation of 305 ff. 
—, leading circle of 181 
—, linear eccentricity 179, 305 
—, normal to 311 


ellipse, numerical eccentricity 
304, 306 

—, parametric representation 
307, 482 

— perspective affine image 
209 

—, polar equation 316 

—, tangent to 181, 209, 309f. 

—, two circle construction of 
179, 210 

—, vertices 179f., 305 

ellipsoid 544, 569 

—, diameter of 544 

elliptic differential geometry 
572 

— geometry 717 

— integral 462, 528f., 529 

— paraboloid 544f. 

— point 568 

entry of determinant 360

— — matrix 373 

envelope 183, 437, 503f. 

Ephemeris time 280 

epicycloid 439 

equality of difference 73 

— sign 81 

equation 80ff. 

—, algebraic 83, 351f.

—, biquadratic 101 

—, characteristic 510, 647 

—, consistent 82, 90 

—, cubic 97ff., 355 

—, cyclotomic 163, 674 

—, Diophantine 671 

—, fractional 87 

—, integral 703f. 

—, Laplace’s 695 

—, linear 86ff., 356ff., 644 ff. 

—, Pell’s 672 

—, Poisson’s 695 

—, quadratic 92ff., 103, 318f. 

—, quartic 100 

—, root of 101f. 

—, solution by radicals 102, 
354 

—, solution set of 81 ff. 

—, Soreau’s 650 

—, telegraph 696 

—, transcendental 83 

—, trigonometric 237 ff. 

—, universally valid 82 

— of function 109 

— — heat conduction 696 

— — line 287ff., 537f., 552 

— — plane 539f. 

— — time (E. T.) 280 

—s, linear system of 89f., 
357f., 644 ff. 

equator, celestial 277f. 

— of earth 272 

equatorial system 277f. 

equidistant basic points 
(approximation) 629, 635 

equilateral hyperbola 182, 287 


equilateral triangle 155f., 249 

equilibrium point 730 

equinox 278 f., 279 

equipotent sets 13, 17, 327f. 

equipotential surface 695 

equivalence, logical 337 

— of ideals 674 

— relation 323f. 

equivalent equations 83 ff. 

— expressions 40, 81, 84 

— fractions 30ff. 

— inequalities 104f. 

— -perspective affinity 209 

— transformations 104 

Eratosthenes, sieve of 24 

Erlangen programme, Klein’s 
572 

error, calculus of 607-614 

—, propagation of, Gauss’s 
law 617 

—, rounding 630 

—, systematic (statistics) 595 

— of calculation 610 

— — first or second kind 
(testing) 602 

— integral, Gauss’s 491, 591 f., 
614, 625 

— of measurement 613 

— — observation 613 

— — procedure 631 

escribed circle 159 

essential singularity 521 

estimate of error 600, 711 

estimation, most probable 615 

—, Statistical 600f. 

— of accuracy 631 

— — constant 622 

— — mean error 617

E. T. (equation of time) 280 

et function 333 ff. 

Euclid’s algorithm 25, 46 

— Elements 711f. 

— parallel postulate 712 

— theorem on triangle 167 

Euclidean geometry 11, 146ff., 
342, 712f. 

— —, decidability of 342, 713 

— plane 548 

— vector space 369 

Euler angles 535 

— -Fourier formulae 497 

— method 637 

— multiplier 509 

— path 689 

— sum formula 625 

Euler’s constant 492 

— differential equation 699f. 

— formulae 133 

— gamma function 450 

— φ-function 670

— φ-theorem 670

— polyhedron theorem 196 

— rotation theorem 535 




Euler’s spherical triangle 262f., 
267f. 

even function 112, 226 

event 578 ff. 

— -oriented network 691 

evolute 436 

exact differential equation 508 

excess, spherical 263, 266 

excluded middle, principle of 
332 

exhaustion, method of 12, 444 

exp, exponential function 131 

expansion, asymptotic 625 

expectation, maximum 661 

— of error distribution 614

— — random variable 583f. 

— value 661 

experiment, design of 595, 632 

explicit definition of curve 431 f. 

— function 109ff. 

exponent 48 ff., 59 

exponential function 131f., 399, 
418, 451, 489, 519

exponentiation 47ff., 70 

expression 40, 81 ff., 104f., 333, 
334 

extended cycloid 439 

— epicycloid 439 

— hypocycloid 440 

extension, algebraic 351 f. 

— factor, e. f. 32, 45 

— field 349 ff. 

— of fraction 30f., 45f. 

extensionality, principle of 321, 
332 

exterior angle 151, 155 

external common tangents 173 

— division of segment 170 

extrapolation 633, 637 

extreme value, search for 640ff. 

— — of function 424 ff. 

extremum of function 424 ff., 
433 

eye distance 217 


F 


face-angle 186 

— -diagonal 189 

— of polyhedron 186f. 

factor 22 

— group 347, 353 

factorial 450, 576 

— function 450, 625f. 
factorization of algebraic sum 42 
— — natural number 24f., 46 
— — polynomial 101, 120f. 
factor of proportionality 37f. 
fall-line, graduated 213

false position, method of 638 
family of sets 321 

favourable event 579f. 
F-distribution 602f., 604 (table) 




feasibility of division 22 

— — subtraction 27 

feasible basic solution 655 

— directions, method of 663f. 

— region 655, 648 

Fermat’s conjecture 335, 672 

— theorem 670, 673 

Fibonacci numbers 341, 381, 641 

— search procedure 641 

fictitious activity 691 

field 342, 349-356, 475ff. 

—, cyclotomic 674 

— axioms 339, 349 

— of events, Borel 581 f. 

— — fractions 351 

15-gon 163 

figure (topology) 681 ff. 

—, isoperimetric 698

finite group 347f. 

— method 719 

— polyhedron 648 

finiteness, Dedekind’s definition 
326 

—, Russell’s definition 326 

Finsler geometry 573 

first boundary value problem 
697 

— fundamental form of surface 
566f. 

— variation 700 

fix (geodesy) 275 

fixed line in axial symmetry 152 

— point (geodesy) 258 

— — method 638 

— — theorem, Banach’s 711 

— — —, Brouwer’s 683, 685 

flag 714 

flat point 568 

flattened ellipsoid 544 

floating point representation 632 

focal distance 179f. 

focus 178 ff., 303 ff. 

folium of Descartes 434 

foot 147 

force, components of 246, 694 

form, bilinear 380 

—, definite 573 

—, homogeneous 675 

—, indefinite 573 

—, indeterminate 400f. 

—, multilinear 380 

—, prenex 337 

formalism in mathematics 718 

formalized theory 339f. 

forward section (surveying) 258 

foundations of geometry 711 ff. 

— — mathematics 12, 717ff. 

four colour problem 690 

Fourier series 497 ff. 

fourth harmonic point 171, 297,
555

— proportional 39 

fraction 30-37, 45f. 

—, continued 76f. 




fraction, partial 129f. 

— of algebraic sums 44f. 

— — integers 30ff., 71 

— with variables 44ff. 

fractions, field of 351 

—, addition of 31f. 

—, multiplication of 32f. 

—, subtraction of 31f. 

fractional equation 87 

Frank-Wolfe, method of 663 

Fredholm alternative 704 

— integral equation 704 

free buffer time 692 

Frenet formulae 564f. 

frequency, angular 236f. 

— distribution 596 

— —, graphical representation 
of 597 

frequencies, comparison of 604 

frontal axonometry 215 

frustum of cone 195 

— — pyramid 194f. 

full angle 148 

function 107-139, 325f., 402ff., 
517ff. 

—, arithmetical 325 

—, Bessel 513 

—, bilinear 369 

—, calculable 340 

—, circular 133, 230ff.

—, complex-valued 517 ff. 

—, composite 111 

—, conjugate 698 

—, continuity of 402f. 

—, convex 661 

—, cost 640 

—, cubic 118 

—, diagram of 108 

—, domain of definition 108f., 
520f. 

—, doubly-periodic 528 ff. 

—, equation of 109 

—, Euler 670 

—, even 112, 226 

—, explicit 109 ff., 431 

—, exponential 131f., 399, 418, 
451, 489f. 

—, extreme value 424 ff. 

—, factorial 450, 625f. 

—, graph of 108ff., 114f., 
131 ff., 225 ff., 432 ff. 

—, harmonic 696 

—, holomorphic 519ff. 

—, homogeneous 139 

—, hyperbolic 134 

—, implicit 110 

—, incidence 688 

—, infimum of 445 

—, influence 704 

—, initial recursive 340f. 

—, injective 326 

—, integral rational 350f. 

—, inverse 113f., 131 ff., 326 

—, invertible 113f., 131, 326 


function, irrational 115, 130ff. 

—, Lagrangian 702 

—, likelihood 600f., 615 

—, limit of 397 

—, linearization 615f. 

—, logarithmic 62, 133 

—, meromorphic 521 f. 

—, monotonic 111 

—, number-theoretical 325, 
340f., 672 

—, objective 640, 654

—, odd 112, 226 

—, one-to-one 326 

—, parametric representation 
110 

—, periodic 112, 225ff., 528 ff. 

—, ℘-function 528f.

—, polynomial 120ff., 350f.

—, potential 696, 698 

—, power 119, 125f. 

—, quadratic 116f. 

—, range of 107 

—, rational 115ff. 

—, recursive 340f. 

—, root 130f. 

—, separable 665 

—, supremum of 445 

—, symmetric 138f. 

—, trigonometric 133, 220ff. 

— curve 109ff., 431 ff., 437 ff., 

— for correlation 594 

— of several variables 136f., 
462 ff. 

functional 325, 709 

—, linear 709 

— analysis 705-711 

— determinant 423f., 464, 466 

— equations, method of 666f. 

—s, representation by 710 

—s as dual space 709 

functor 325, 333 ff. 

fundamental domain 81 

— sequence 75, 707

— theorem of algebra 101, 528 

— — — Galois theory 678 


G 


gain, maximum 654 
gallon 187 

Galois group 352 

— theory 352ff., 678 
Galton’s board 588 
game, cooperative 730 
—, fair 724 

—, n-person 730

—, saddle-point 724 

— theory 723 ff.

gamma function 450, 576 
Gauss, divergence theorem 478 


—, error integral 491, 591f., 614, 


625 
—, integral theorem 696 


Gauss, interpolation formula 629 

—, law of propagation of errors
617 

—, principle of least squares 615 

—, theorem (descriptive geo- 
metry) 216 

— -Bonnet theorem 570 

— -Krüger projection 255f.

Gaussian curvature 568, 572 

— differential equation 513 

— distribution 584, 589f., 598 

— integral 591f. 

— law of errors 614 

Gauss’s algorithm 358f. 

—, integral theorem (potential
function) 696 

— method to solve a system of
linear equations 646 

gcd (greatest common divisor) 
25, 46 

general exponential function 132 

— linear group 344, 372, 680 

— perspective affinity 209 

— structure 685 

— topology 686 

generalized Lagrange function 
661 

generating line of a cone 302 

generator of cone 193 

— — hyperbolic paraboloid 545 

— — prism 190f. 

— — one-sheet hyperboloid 546 

generic zero 677 

genus of surface 570 

geodesic curvature 568 

— line 261, 274, 567, 701 

— polygon 567f. 

geographical mile 273 

— North 256 

geoid 272 

geometric levelling (surveying) 
257 

— locus 177ff. 

— mean 106, 383 

— sequence 383 

— series 390 

geometry, absolute 713 

—, algebraic 675f. 

—, analytic 15, 282-319, 
530-547 

—, conformal 572 

—, descriptive 15, 203-220 

—, differential 15, 561-574 

—, elementary 15

—, elliptic 717 

—, Euclidean 11, 146ff., 712f. 

—, foundations of 711-717 

—, hyperbolic 716 

—, integral 574 


—, intrinsic 566


—, non-Euclidean 716
—, plane 146-183, 566 
—, projective 547-561 
—, Riemannian 573f. 


geometry, solid 184-203

—, Finsler 573 

— of numbers 574 

— — sets 574 

— — webs 572 

GL(n), general linear group 344, 
374 


G. M. T. (Greenwich Mean 
Time) 281 

Gödelization of language 721

Gödel’s incompleteness
theorem 721 

Goldbach number 719 

—, conjecture 335, 673 

golden section 163, 171, 642 

grade (new degree) 149 

gradient 476f., 638f. 

— of line 116, 221, 287 ff. 

— method (steepest descent) 
643, 663 

graduation points 212 

graph 108 ff., 688 ff. 

— of frequency distribution 597 

— — function 108 ff., 114f., 
131 ff., 225 ff., 432 ff.

— theory 688-692 

graphical integration 453 

— solution of cubic equation 
99f. 

— — — differential equation 
513f. 

— — — linear equation 91 

— — — quadratic equation 
95f.

great circle 198f., 261 ff. 

greatest common divisor 
(g. c. d.) 25, 46 

Greek alphabet 738 

Green’s theorem 696 

Greenwich Mean Time 
(G. M. T.) 281 

Gregory-Leibniz equation 492 

grid North 256f. 

group 13, 343ff. 

—, Abelian 342, 344f. 

—, abstract 346 

—, alternating 344 

—, axioms 339, 344 

—, commutative 344 

—, cyclic 345 

—, finite 347f. 

—, Galois 352ff. 

—, general linear 344, 372, 680

—, homology 683 

—, homotopy 683 

—, Klein 346 

—, Lie 348, 680 

—, orthogonal 373 

—, permutation 347f. 

—, special linear 344 

—, symmetric 344, 354 

—, topological 348, 680 

— of motions 713 

— — prime residue classes 670 


group representation by matrix 
680 

growth factor 141 f. 

— function 132 

guide curve 190, 193 


H 


ha, hectare (unit of area) 164 

half-angle formulae 245, 265 

— -life 60 

—-line 147 

—-plane 152, 290f., 714 

— -side formulae 265 

Hamilton circuit 690 

— -Jacobi differential equation 
693 


Hamiltonian of a partial differ- 


ential equation 693 
Hamilton's principle 702 
— quaternions 680
harmonic analysis 499 
— division 171, 297, 555 
— function 696 
— mean 106 
— points 171, 297, 550, 555 
— series 391 
— synthesis 499 


Hausdorff separation axiom 686 
heat conduction, equation of 696 


hectare, ha (unit of area) 164 

height, determination of 243, 
253f. 

—-line of a plane 213 

Helmholtz oscillation equation 
696 

Hermitian transformation 372 

Heron triangle 165 

Heron’s formula 165, 249 

Hessian normal form 290f., 
540f. 

hexagon, regular 162, 249 

—, Brianchon’s 561 

—, Pascal’s 561 

hexahedron (cube) 196f. 

hierarchy 733 

higher arithmetic 669 

— derivative 409f., 421 

Hilbert algebra 680

— space 707 

—’s axioms of plane geometry
712 

Hippocrates, lunulae of 175 

hole in figure 681 

holomorphic function 519 ff., 522 

homeomorphic figures 681 

homeomorphism 681 f. 

homogeneous coordinates 531 f., 
549 ff. 

— differential equation 506 

— function 139 

— ideal 676 

— polynomial 675 




homogeneous projective coordi- 
nates 549 ff. 

— system 358 

homology group 683 

homomorphic image 345 

homomorphism 345 ff. 

homotopy group 683 

horizon (descriptive geometry) 
217f. 

horizontal system 276f. 

L’Hospital, rule of 400f.

hour angle system 277 

hour circle 277 

L’Huilier’s formula 266

Hungarian method (opti- 
mization) 658 

hyperbola 181 f., 303 ff. 

—, asymptotic equation 308 

—, central equation 307 

—, degenerate 182 

—, equilateral 182, 287, 451

—, leading circle of 182 

—, linear eccentricity 181 f., 
307f. 

—, normal to 311 

—, numerical eccentricity 304,
309 

—, polar equation 316 

—, tangent to 182, 309f. 

—, vertex of 181 f., 307f. 

hyperbolic cosine 134 

— cotangent 134 

— differential geometry 572 

— function 134 

— —, derivative of 419

— —, power series 489

— geometry 716 

— paraboloid 545f., 569 

— point 568 

— sine 134 

— tangent 134 

hyperboloid, generator of 546 

—, one-sheet 202, 546 

—, two-sheet 547

— of revolution 185, 547 

— — rotation 185, 547 

hypergeometric series 513 

hyperplane 644, 710 

hypocycloid 440 

hypotenuse 155 

hypotheses, testing of (statistics) 
601 ff. 


hypothesis, continuum 722 


I 


i (imaginary unit) 12, 78 
icosahedron 196 

ideal 669, 676 

— class 674 

— die 579 

— theory 670, 674 
ideals, congruence of 670 




ideals, equivalence of 674 

—, intersection of 676 

idempotence of set operations 
322 

identity matrix 374 

— theorem for power series 
484 

image, descriptive geometry 
203 ff. 

—, homomorphic 345 

—, spherical 570 

— of function 107 

— — linear map 370f. 

— — mapping 325 

— plane 217 

imaginary part 78, 518f. 

— unit i 78

implicit definition of curve 562 

— differentiation 417 

— function 110 

— —, derivative of 417 

— representation of curve 562 

— — — surface 565 

impossible event 578 f. 

improper fraction 30 

— integral 449f. 

— line 548 

— plane 548 

— point 204, 287, 302, 531, 548 

— quadric 543 

inaccessible point, construction 
of 158, 248 

incentre of triangle 159 

inch 147 

incidence, axioms of 713 

— function 688 

— of point and line 292 

incircle of triangle 159, 248 

incompatible events 579 

incompleteness of axiomatic 
theories 720f. 

— theorem, Gödel’s 721

inconsistent equation 82, 90 

— inequality 104 

increasing divided differences 
628, 634 

indefinability of truth 720 

indefinite form 573 

— integral 454ff. 

indenting method (axono- 
metry) 215 

independence, linear 367 

independent buffer time 692 

— events 581 

— variable 109f. 

indeterminacy interval 640 

indeterminate form 400f. 

index of subgroup 348 


indirect conformal mapping 527 


— observation, adjustment of 
621 

indirectly proportional 37 

induction, mathematical 70 

—, transfinite 331 


inequalities, equivalent 104f. 

inequality, algebraic 103f. 

—, Bernoulli 106, 387 

—, Bunyakovskii 368 

—, Cauchy’s 708 

—, Cauchy-Schwarz 106, 368f. 

—, Chebyshev 585f. 

—, consistent 104f. 

—, inconsistent 104 

—, linear 648f. 

—, Minkowski 708 

—, Schwarz 708 

—, solution set of 103f. 

—, solving method 104f. 

—, universally valid 104 

inference, logical 333f. 

—, mathematical 337f. 

—,rules of 334, 337f. 

infimum of function 445 

infinite, actual 719 

— cardinal number 327 

— decimal fraction 34, 74 

— interval of integration 450 

— set 321 ff. 

infinitely distant point 287 

— many primes 24 

infinitistic method 719 

inflection, point of 425f., 431f. 

influence function 704 

inhomogeneous system 358 f. 

initial condition, differential 
equation 514 

— error 610 

— ordinal number 330 

— value 514 

injective function 326 

inner division of segment 170,
204, 286f., 536 

— product 365f., 368f. 

inscribed circle 159, 248 

— polygon 162f. 

instrumental error 613 

integer 27, 73f., 669, 670 

integrability condition 471, 508f. 

integral, binomial 462 

—, definite 444 ff. 

—, derivative of 420 

—, elliptic 462, 528f. 

—, improper 449f. 

—, indefinite 454 ff. 

—, logarithmic 492 

—, multiple 465f. 

—, particular 501 

—, principal value 522 

—, recurrence formula 455f. 

—, singular 501 f. 

—, triple 465 

—, two-dimensional 463f. 

— basis 674 

— calculus 443-479, 518, 687

— curvature 570

— of differential equation 501

— domain 349f.

— equation 703f., 731 


integral geometry 574 

— logarithm 673 

— optimization 659 

— rational function 350f. 

— theorems of Gauss, Green, 
and Stokes 478, 696 

—s, standard (table) 454 

integrand 445 ff. 

— with infinity 449 

integrating factor 509 

integration, graphical 453 

—, limit of 445 

—, numerical 635f. 

—, partial fraction 459f. 

—, several variables 462 ff. 

—, term-by-term 481

—, trapezoidal rule 636

— as inverse to differentiation 
447f. 

— by parts 455f. 

— by power series 512 

— of power series 481 

— by substitution 457f. 

intercept chart 651

— equation of line 289 

— — — plane 540 

— theorems 169f. 

interest 140ff. 

— factor 142 

interior angle 151, 155f., 159f. 

— point (topology) 684 

intermediate field 349 ff. 

internal common tangent 173 

— division of segment 170f., 
204, 286f., 536 

interpolated half-steps 514 

interpolation, linear 51, 60ff., 
232, 383, 626 

— formula, Adam’s 637 

— —, Aitken’s 635 

— —, Gauss’s 629 

— —, Taylor’s 633 

— polynomial 626 

— —, Lagrange’s 627, 633 

— —, Newton’s 627, 633 

— quadrature formulae 636 

— in sequence 383 

interpreter 750 

intersection of circles 302 

— — conics 313f. 

— — ideals 676 

— — line and cone (descriptive
geometry) 211 

— — — and conic 309ff. 

— — lines 206, 293f., 539 

— method (descriptive geo- 
metry) 217 

— of planes 541f.

— — sets 322f. 

interval calculus 633 

— estimation 600f. 

— of indeterminacy 640 

intrinsic geometry 566 

intuitionism 719 


invariance of cross-ratio 550f. 

invariant subgroup 347 

— theory 572

— under bending 568

—s of curve 564

inverse function 113f., 131 ff., 326 

— —, continuity of 405 

— —, derivative of 415

— —, graph of 114f., 131 ff.

— hyperbolic function 134f.,
419, 491

— image 325, 370f.

— map 681

— matrix 375f.

— transformation 284f., 372,
376f.

— trigonometric function 113f.,
229 ff., 419, 491

inversely congruent 156 

— proportional 37 

inversion of ordering 576 

— problem for ℘-function 529

— theorem for power series 487 

invertible function 113f., 131, 
326 

involute 437 

irrational function 130ff. 

— number 56, 59, 77 

irreducible polynomial 351, 673 

irreflexive relation 323 

irrotational field 478 

isocline 504 

isolated ordinal 331 

— point 434 

— singularity 521 

isometric axonometry 215 

— mapping 566, 569 

— surfaces 566 

isometry 215, 566 

isomorphic systems 72f. 

isomorphism 75, 346 

—, dual 678 

isoperimetric figure 698 

— problem 574, 698 

isosceles spherical triangle 272 

— trapezium 161 

— triangle 155f., 249, 272 

iteration 639, 643, 646 

— kernel 705 


J 


Jacobian 424, 464, 466 
Jordan-Brouwer theorem 685 
— curve theorem 682 

—’s elimination 644 

jump discontinuity 224, 403 


K 


Kepler’s rule 202 
kernel, integral equation 704f. 
— of homomorphism 346 


kernel of linear map 370 

kite 160f., 166 

Klein four group 346 

—’s Erlangen programme 572 
knot 273 

Kochansky, construction of 191 
Kolmogorov axioms system 582 
Kuhn-Tucker condition 662 
Kuratowski-Zorn lemma 324 


L 


l, litre 187 

Lagrange function 661

—’s form of remainder 488

—’s interpolation polynomial 
627, 633 

—’s polynomial 627 

—’s theorem on four squares
673 

—’s — — groups 348

Lagrangian equations of motion 
702 

— multiplier 430f., 620 

language, arithmetization of 721 

—, arity of symbol 334 

—, assembly 749

—, codification of 721

—, elementary 335, 342

—, Gödelization of 721

—, machine 749 

— of predicate logic 335 

Laplace differential equation 
523, 695 

— operator 479, 695 

large number research 605 

— —s, law of 586 

latitude (sphere) 262 

— (surveying) 256 

lattice 678 

— of periods 528f. 

— of point (descriptive 
geometry) 206 

Lasker-Noether theorem 676 

Laurent expansion 521 

layer, spherical 198f. 

law of large numbers 586 

lcd, least common denominator
32

lcm, least common multiple 25

leading circle of conic 18f. 

least common denominator 32, 
45 

— squares, method of 615, 620, 
626 

Lebesgue measure 687 

left-handed coordinate system 
223, 283, 530 

— unique relation 323 

Legendre symbol 671 

—’s form of elliptic integral 529 

—’s theorem (spherical triangle)
266 




Leibniz’ test for convergence 
393f. 

lemniscate 438 

length, units of 147 

— of curve 469, 563 

— — segment 147, 286, 536 

— — vector 363 

— — word 632 

level curve 475 

— surface 475 

— (surveying) 257 

levelling, tacheometrical 253f. 

lexicographical order 41, 576 

lg (decadic logarithm) 58 ff.

Lie group 348, 680 

life expectancy 145 

— insurance 145 

likelihood, maximum 600, 615 

limb (surveying) 253 

limit in abstract space 705 

— of function 397 

— — integration 445 

— number 331 

— of sequence 384 

— — series 389, 481 

— theorem, central 593 

limiting error 610, 612 

Lindemann’s theorem on π 675

line 146, 184f. 

—, equation of 287ff., 537f. 

—, equation in projective 
coordinates 553 

—, gradient of 116, 287ff. 

—, improper 548

—, oriented 290

—, Plücker coordinates of 553

—, principal (descriptive 
geometry) 206f. 

— of curvature 571 

— diagram (statistics) 597 

— integral 463, 470 

—s, bundle of 539 

—s, geodesic 701

—s, orthogonal (perpendicular)
148, 153f., 294

—s, parallel 148f., 154

—s, skew 185, 206, 213, 538f.

linear algebra 356ff. 

— combination 367f., 655 

— dependence 90f., 367

— differential equation 507 ff.

— eccentricity 179ff., 305 ff.

— equation 86ff., 356ff., 644 ff.

— function 115f.

— functional 709

— inequality 648f.

— integral equation 703f.

— interpolation 51, 60ff., 232,
383, 626

— map 370f., 375f.

— operator 709

— optimization 654-661

— space 705

— term 86, 92, 97




linear transformation 371 f.,
376f. 
linearly ordered set 329 
linearization 615f.
Lipschitz condition 514, 711 
litre, l 187
ln (natural logarithm) 58 ff.
local extremum 425, 430, 433 
— time 281 
locus, geometric 177 ff. 
logarithm 56ff., 71
—, power series 490
— of negative number 62, 231 
— table 60ff., 231, 728 ff. 
logarithmic differentiation 415 
— function 62, 133, 404, 418, 
490 
— integral 492 
— scale 66 
— spiral 417, 443 
— system 456f. 
logic, algebraic 733
—, arithmetic 733
—, propositional 332f.
logical equivalence 337
— inference 333f.
logicism 718 
loop of graph 688 
lower bound 610 
— sum 445 
loxodrome 274f. 
lunulae of Hippocrates 175 


M 


m, metre 147 
Maclaurin series 489 ff. 
magnetic North 256
main diagonal of matrix 374 
major axis of ellipse 179f., 705f. 
— theorem of Desargues 715 
— — — Pappus 715
majorant criterion, Weierstrass’ 
480f.
manifold, differentiable 348, 
573f. 
mantissa 58 ff.
many-valued function 79 
map, mapping 107f., 325f., 681 
—, continuous 684
—, homeomorphic 682
—, inverse 681
—, linear 370f., 375f. 
—, normal 690 
—, nullity of 370 
—, rank of 370 
mapping, conformal 524 ff. 
—, isometric 566, 569
—, perspective 209
—, projective 551 ff.
—, surjective 325


mapping theorem, Riemann's
525, 528
—s, composition of 326, 371
Markov chain 594 
— process 594 
— property 665 
Mascheroni's constant 492 
mathematical geography 272 ff. 
— induction 70 
— logic 13, 332-342 
— object 705 
— operation 13, 705 
— optimization 653-668 
— pendulum 510
— Platonism 718 
— statistics 595 ff. 
— theory 334
mathematics, applied 14 
—, foundations of 717-722
—, numerical 630
—, pure 14 
matrix 373 ff. 
—, characteristic polynomial 
379 
—, contragredient 374 
—, inverse 375f. 
—, payoff 723 
—, rank 377
—, regular 374 
—, similar 376 
—, singular 374
—, symmetric 377 
—, transposed 374 
— game 723 ff. 
— representation of group 680 
maximum, local 424f.
— expectation 661 
— gain 654 
— likelihood 600, 615
mean, arithmetic 106, 382 
—, geometric 106, 383
—, harmonic 106
—, weighted 619 
— curvature 572 
— error 616f. 
— solar time 280 
— -square deviation 597
— sun 280
— value 617
— — of distributions 587 ff.
— — — random variable 583f.
— — — sample 597
— — theorem (differential
calculus) 411
— — — (integral calculus) 447
measurable point set 687 
measure, Lebesgue 687 

— precision 616 
— theory 687
— zero 687
measured value 615, 631 f.
measurement, direct 618
—, error of 613
—, variance of 614 


measurement, weight of 616 
— of area 164f., 174f., 295f. 
— (descriptive geometry) 218 
measuring point method 218 
median 158 
— value of sample 597 
memory 738f. 
—, addressable 741 
—, random-access 746
—, read-only 746 
Menelaus, theorem of 298 
meridian 255, 273, 276ff. 
— quadrant 273
— of solid of rotation 201, 571
— — surface of revolution 
201, 571 
— strip 255, 262 
meromorphic function 521 f. 
meta-language 721
metra-potential method (MPM) 
692 
metre, m 147 
metric, axioms 706 
— space 685, 686, 706f. 
metrization theorem 686 
microcomputer 745 ff. 
middle crystal 197 
— proportional 38 
mid-line of trapezium 161, 165 
— -parallel 177 
—-point 153 
mile 147, 273 
military perspective 215 
milliard 19 
million 19 
minimal surface 572, 700 
— tree (graph theory) 689
minimax strategy 620 
minimum, local 424f. 
— matrix procedure 659 
Minkowski inequality 708 
minor axis of ellipse 179f., 705f.
— of determinant 361
— theorem of Desargues 715 
— — — Pappus 715 
minuend 21
minus 21, 27 
minute (unit of angle) 149 
— (unit of time) 281 
mirror image 152, Plate 54 
mixed derivative 421
mixing problem (optimization) 
660 
mm, millimetre 147 
Möbius strip 682
model 13
— of set of expressions 336 
modulus of complex number 
78, 707 
— — logarithmic system 58f.
— — vector 363 
moment of inertia 474 
monic polynomial 83, 351 
monomial 40 




monotonic decreasing 111
— function 111
— increasing 111
— law 20, 22, 28, 71
— sequence 382
Monte-Carlo-method 652
morphism 325
mortality table 145
most probable estimation 615
motion, axioms of 714
—, Lagrangian equations of 702
—s, group of 713
moving trihedron of curve
563f.
MPM (metra-potential method)
692
M-test 480
multi-dimensional search
process 642
— — — —, Newton's 643
multilinear algebra 380
— form 380
multiple integral 465f.
multiplicand 22
multiplication, abbreviated 36
— of algebraic sums 42f.
— — fractions 32f.
— — integers 22f.
— — matrices 373
— — order types 330
— — powers 49f.
— — series 396
— — vector by scalar 364
— law for probabilities 580
multiplicative semigroup 679
multiplicator 22
multiplicity of algebraic variety
— — zero 121f.
multiplier, Lagrangian 430f.,
620, 701
mutually exclusive events 579


N


N, natural number 17ff., 69f., 74
nabla operator 479
nadir 276
Napier's analogies 266ff.
— rules 270ff.
natural equation of curve 564
— logarithm 57f.
— number 17ff., 69f., 74, 719
— —, factorization of 24f.
— parameter of curve 563f.
nautical mile 273
— triangle 278, 282
n-dimensional convex polyhedron
648
— space 684f.
needle problem of Buffon 574
negative curvature 117, 434f.
— number 12, 27f.
— sign 28f.
neighbourhood of point 683
Neil's parabola 451
nest of intervals 75
net of solid 187f.
network techniques 690f.
Neumann series 705
neutral element of operation
443
never convergent 482
new degree (grade) 149
— point (surveying) 258f.
Newton's interpolation
polynomial 627f.
— method with gradient 638f.
— — of multi-dimensional
search 643
n-gon, regular convex 162, 249
node of graph 688
— plane 305
function 333f.
nomogram 53, 649ff.
non-Euclidean geometry 716
— -linear equation 102f., 642f.
— — relation 623f.
— -residue 671
norm of vector 363f.
— — space 707
normal axonometry 216
— to circle 300
— to conic 311
— distribution 584, 589ff., 599,
602, 614, 591 (table)
— form of algebraic equation
83, 92, 97, 100f.
— —, Cartesian 289
— map 690
— plane to curve 562 f.
— projection 205 ff.
— subgroup 346f.
— to surface 566
— vector 290, 541f.
normalized equation of line 291
— Gaussian distribution 590f.
North-West corner rule
(optimization) 659
notation, reverse Polish 733
n-tuple 136, 325, 684
null 18
— graph 689
— hypothesis (statistics) 601f.
— matrix 373
— sequence 385
— vector 364
nullity of linear map 370
number, absolute rational 71f.,
74
—, algebraic 673f.
number, cardinal 17f., 327
—, complex 13, 77f., 517ff.
—, divisibility 669
—, irrational 56, 59, 77
—, natural 17ff., 69f., 74, 719
—, negative 12, 27f.
—, opposite 27, 40
—, ordinal 18, 329f.
—, positive 27ff., 73
—, Pythagorean 672
—, rational 30ff., 71f.
—, real 12, 13, 74f.
—, signed 28f., 73
—, transcendental 675
— field 351, 673f.
— sphere, Riemann’s 526
— system 17ff., 69ff., 72f.
— -theoretical function 325,
340f., 672
— theory 23ff., 46, 669 ff.
— —, additive 673
— —, analytic 672f.
—s, geometry of 574
numeral 18
numerator 30ff., 44f., 76
numerical analysis 630ff.
— differentiation 635f.
— eccentricity 303 ff.
— integration 635f.
— solution of ordinary
differential equations 637
— variable 40f.


O


obelisk 203
object, mathematical 705
— language 721
objective function (optimization)
640, 654
oblique coordinates 283, 531
— cylinder 190
— prism 190
— projection 204f.
— pyramid 193
observation 613, 619f., 621
obtuse angle 148
— triangle 155
octahedron 196, 244
octant 530
odd function 112, 226
one-plane method (descriptive
geometry) 212 ff.
— -sheet hyperboloid 202, 546
— -sided limit 397
— — continuity 402 ff.
— -to-one function 326
open set 684
operation, arithmetical 21, 48,
50ff., 71
—, set-theoretical 322f.




operation defined on set 343 

— in vector space 362ff.

—, bilinear 365

operational symbol 28 

operator 708 ff. 

—, linear 370, 375 ff., 709 

—, Laplace 479, 695 

opposite angle 150 

— element in group 344 

— number 27, 40 

— points of sphere 262

— side in trigonometry 222 

— vector 364 

optimal strategy 664 

optimality, Bellman’s principle 
of 665 

optimization 653-668 

—, convex 661 

—, dynamic 664-668 

—, linear 654-661

—, integral 659 

—, parametric 659f. 

—, quadratic 662f.

—, stochastic 661 

order, Archimedean 72 

—, axioms of 713 

—, inversion of 576 

—, lexicographic 41, 576

—, total 72 

— of combination 577 

— — cyclic group 348 

— — differential equation 501,
693 

— — element of group 348 

— line (descriptive geometry) 
205 ff. 

— in N 20, 71

— — Q 31, 71 ff.

— — R 75

— type 329f.

— in Z 27

ordered pair 73, 108f., 323f. 

ordinal number 18, 329 ff. 

— —, initial 330 

— —, isolated 331 

ordinary cusp 434 

— differential equation 
500-517, 637 

ordinate 283 ff. 

orientation 530f.

oriented distance 290, 536, 541 

— line 290, 535f. 

origin of coordinates 283f., 
530 ff. 

original element 107f.

orthocentre 158 

orthogonal group 373 

— lines 148, 153, 369 

— perspective affinity 208 f. 

— projection 205 ff. 

— trajectories 437, 504 

— transformation 372 

orthogonality in hyperbolic 
geometry 716 


orthonormal basis 369, 624 

— system of functions 624 

oscillation, damped 236 

— equation of Helmholtz 696

osculating parabola 494 

— plane of curve 562f. 

Ostrogradskii differential
equation 700 

outer point of division 287, 536 

overall strategy 640f. 

ovoid 574 


P 


packing, theory of 574 

Padoa's principle 340 

paper-strip construction 179f. 

Pappus, theorem of 554, 715 

Pappus’ rules 201, 473f. 

parabola 95f., 116ff., 183, 
303 ff., 309 ff., 314 ff. 

—, cubical 118 

—, Neil's 451 

—, osculating 494 

—, semicubical, see — Neil's 451 

parabolic approximation 628 

— point 568 

— spherical ring 468 

paraboloid, elliptic 544f. 

—, hyperbolic 545f. 

— of revolution 202, 544 

— — rotation 202, 544 

paradox, Russell’s 321

—, Skolem’s 720

parallel axiom (postulate) 12, 
712 

— circle 201 

— coordinates 283f., 531 

— displacement 285, 533, 567f. 

— lines 148f., 154, 716 

— planes 185 

— projection 204f. 

— rotation 207 

parallelepiped 190 

parallelogram 160, 165, 168 

— of forces 251 

parameter 82, 87, 110, 314, 507,
539f., 561 ff. 

—, natural 563f. 

—, variation of 507 

— curve 565 

— of conic 314 

— transformation 562, 565 

parametric optimization 659 

— problem (optimization) 654 

— representation of circle 135, 
300 

— — — curve 561 f.

— — — ellipse 307, 482

— — — function 110

— — — hyperbola 135

— — — plane 539f.

— — — surface 565


partial derivative 420f. 

— differential equation 693 

— fraction 129f., 459f. 

— sum 389 

partially ordered set 678 

particular integral 501 

partition of number 673 

— set 324 

Pascal's hexagon 561 

— theorem 560f. 

— triangle 43 

path (graph theory) 689 ff. 

—-connected 683 

Peano-Jordan content 687 

Peano’s axioms 13, 70, 335 

℘-function 528f.

Pell’s equation 672 

pencil of lines 147, 185, 539 

— — planes 185, 542 

pentagon, regular 163 

percentage 139 ff. 

period parallelogram 528f. 

periodic fraction 34f., 77 

— function 112, 225ff., 528 

periodicity of trigonometric 
functions 225f. 

permutation 343f., 360, 576 

— group 347f. 

perpendicular bisector 153, 157, 
177 

— lines 148, 153f., 185, 294, 
366 

personal error 613 

perspective 169, 216ff. 

— affinity 208 ff. 

— collineation 218f. 

— image 203f. 

perspectivity 547f. 

PERT (programme evaluation 
and research task) 692 

perturbation function 511f.

— theory 731f. 

phase difference 236 

— of polar coordinates 284 

photogrammetry 216 

π 12, 492f., 675

Picard’s theorem 528 

plan (descriptive geometry) 
205 ff. 

plane 184 

—, equation of 539f. 

—, Euclidean 548

—, improper 548

—, node of (descriptive 
geometry) 206 

—, parametric representation 
539f. 

—, projective 548 

—, vanishing 204 

— curve 109ff., 431 ff., 437ff., 
448 ff., 469 

— geometry 146-183, 566 

— of projection 203 ff. 

— — support (optimization) 655 


plane of symmetry 206 

— trigonometry 241-261
—s, pencil of 185, 542 
planar body 186f. 

Platonic bodies 196 
Platonism, mathematical 718 
Plücker line coordinates 553
plus 20 

pocket calculator 732ff. 
Pohlke’s theorem 214 
Pohlke trihedron 214f. 
point 146, 684 

—, asymptotic 443 

—, elliptic 568 

—, hyperbolic 568 


—, improper 204, 287, 302, 531,


548 ff. 
—, parabolic 568 
—, regular 562, 566
—, singular 562, 566
—, vanishing 204 
— -direction form of line 288 
— of inflection 425f., 431 f. 
— set (topology) 680ff. 
— of sight (descriptive
geometry) 217 
Poisson distribution 588f. 
— equation 695 
— process 594 
polar circle 262 


— of conic 311 ff., 557f., 716 


— coordinates 284f., 466, 531 f. 


— equation of conic 316 

— moment of inertia 474

— solid angle 186, 263f.
— triangle 263f.

polarity 557, 716 

pole of conic 312f., 557 

— — function 126f., 521 ff. 

— — sphere 262, 277f. 

polygon 162f., 296, 567f. 

polygonal arc 260, 513f. 

polyhedron 186f., 196f., 574, 
644, 648, 655 

polynomial, polynomial func- 
tion 120ff., 350f., 404, 413 

—, characteristic 379 

—, irreducible 351, 673 

—, reducible 120, 673

— ideal 676 

—s, space of 710 

pontoon 203 

population (statistics) 595 

position plane (descriptive 
geometry) 217f. 

— system 19 

— vector 476 

positive curvature 117, 434f. 

— definite form 573 

— direction 283, 290 

— number 27ff., 73 

— sign 28f. 

possible event 579 


postulate see parallel axiom 12, 


712 


potential 476f., 523, 693 ff. 


— energy 702 
power 48 ff. 
— in C 79
— function 119ff., 
— —, derivative of 417f.
— of number 48 ff., 79
— residue 671 
— series 482 ff., 520f., 527 
— set 321, 327 
— of trigonometric function 
235 
precision 616, 618 
predecessor 20, 27 
predicate 82, 103 
— logic 334ff. 
pre-image 370 
premisse 334 
prenex form 337 
present value of annuity 143f. 
pretzel, genus of 570 


primal problem of optimization 


656 
primary ideal 676 
prime ideal 676f. 
— number 24f., 46 
— — theorem 672f. 


— residue classes, group of 670 


— twins 24 

primitive of function 448, 454, 
520 

— recursion 341

— root 670 

principal 140f. 

— axes of quadric 543 ff. 

— axis of conic 286, 303f., 
379f. 

— curvature 571 f. 

— ideal 669, 676 

— lines (descriptive geometry) 
206f. 

— normal 563f. 

— part of Laurent expansion 
521

— point (descriptive geometry) 
204, 217f. 

— projection (descriptive geo- 
metry) 212 

— radius of curvature of 
ellipse 180 

— value 79, 114, 230f., 522 

— vanishing point (descriptive
geometry) 204f. 

— vertex of ellipse 179f. 


principle of permanence 50, 71 f. 


prism 190 

prismoid 202f. 
probable error 616 
probability of event 687 
— paper 598 

— theory 578-594 


procedure, nomographical 649f. 


125 ff., 417f. 




procedure, numerical 631 

— of minimum matrix
(optimization) 659 

process (optimization) 664f. 

— (probability theory) 594 

—, stationary 594, 668

—, stochastic 594 

product, Cartesian 325, 328 

—, derivative of 413f. 

— of cardinal numbers 328 

— — events 579 

— formulae of trigonometric 
functions 235 

— of mappings 326, 371 

— — power series 485f.

— — vectors 365 ff. 

— representation 101, 121, 351

— of trigonometric functions 
235 

profile 201 

programme evaluation and 
research task (PERT) 692 

projecting plane (descriptive 
geometry) 207 

projection (descriptive geo- 
metry) 203-220 

— (surveying) 255 

— with heights 212f. 

projective coordinates 548f., 
556f. 

— differential geometry 572 

— geometry 547-561

— mapping 551 ff.

— plane 548
— space 548

proof 11, 334, 337ff.

propagation of errors 617 

proper fraction 30ff. 

— quadric 544f. 

proportion 37 ff. 

—, exchange theorem 38 

proposition 40, 82, 103, 335 

propositional calculus 333 

— logic 332f. 

protractor 149 

pseudo-random numbers 653 

pseudosphere 201, Plate 56 

punctured disc 521 

pure mathematics 14 

pyramid 192f. 

Pythagoras’ theorem 166 

Pythagorean number 672 


Q 


Q, rational number 30ff., 71 f. 

quadrangle 160f. 

—, complete 554 

quadrant relations for 
trigonometric function 228f. 

quadratic equation 92ff., 103, 
318f. 

— function 116f. 




quadratic irrationality 77 

— number field 674 

— optimization 662

— polynomial 120f.

— residue 671 

— supplement 93 

— term 92, 97 

quadrature (area) 444, 450ff. 

— (differential equation) 505

— (numerical integration) 635f.

quadric 543 ff. 

quadrilateral 159ff., 176, 250 

—, area of 250

—, complete 554 

—, cyclic 176, 250

— of chords 176 

— of tangents 176 

quadrillion 19 

quality control 605f.

quantifier 334f. 

quarter turn theorem for 
trigonometric functions 228 

quartic equation 100 

quaternion, Hamilton’s 680 

quotient 22 

—, derivative of 414 


R 


R, real number 12f., 74f. 

rad, radian (unit of angle) 149f. 

radial distance 178, 181 

— symmetry 153 

radian, rad (unit of angle) 149 

radical 102, 354 

radicand 52 

radius 171 ff., 198, 284, 299 ff., 
435 

—, spherical 262, 269

— of convergence 483, 527

— — curvature 435 

RAM 746 

ramphoid cusp 434 

random error 613f. 

— event 578, 582f.

— number 595, 653

— process 594

— sample 595

— variable 582f.

range of function 107, 225f. 

— — relation 323 

— — trigonometric function
225f.

— — variation 597

rank of linear map 370

— — matrix 370, 377, 648

rate of interest 141 

ratio of division 204f., 286f., 
536, 550 

— — numbers 38 

— — similarity 168 

— test for series 392 

rational function 115, 126ff., 404 


rational function, integration 
459 ff. 

— number 39ff., 71 f.

rationalizing denominator 55 


ray 147 
real number 12f., 74f. 
— part 78, 518f.


rearrangement of series 394f. 

reciprocal fraction 30, 32 

— radii, transformation by 527 

reciprocity law 671 

rectangle 160, 164 

rectangular coordinates 283, 530

— hyperbola 451

— impulse 498 

rectifiable curve 469 

rectifying plane 563f. 

recurrence formulae 455f. 

recursion 331, 341 

recursive function 340f. 

— —, generating method 341 f. 

recursively enumerable set 341 

reduced form 98, 100 

— fraction 30f. 

reducible ideal 676 

— polynomial 120, 673

reduction factor (descriptive 
geometry) 216 

reflection 152, 714 

— in a point 152

reflex angle 148 

— polygon 162 

— quadrilateral 160 

reflexive relation 323 

region, feasible 655 

register 733 ff. 

regression 599f., 623 

regular matrix 374 

— point 562, 566 

— polygon 162ff., 249f. 

— polyhedron 196 

— pyramid 193 

relation 323f. 

relations, adjustment of 621 ff. 

— between angles and sides of 
triangle 155, 267 

— between trigonometric 
functions 227 

relative curvature 571

— error 608 

relatively prime 25 

relativity, theory of 573 

reliable digit 36, 609 

remainder on division 23, 44f., 
670 

— of Taylor series 488 

removable discontinuity 403 

— singularity 521 

representation, asymptotic 625f. 

— of functional 709f.

— — number 19, 632

— theory (groups) 680

representative of vector 363f. 

— sample 595 


residue, quadratic 671 

— class ring 351, 670 

— of meromorphic function
521

— theorem 522f. 

resolvent 704 

resources, calculation of 
(network techniques) 692 

rhomboid 160 

rhombus 160 

Riemann content 687 

— mapping theorem 525, 528

— number sphere 526

— surface 527

— zeta-function 673

Riemannian geometry 573f. 

Riesz representation theorem 
710 

right angle 148, 242 

right angled triangle 155, 
166f., 241f., 270f. 

— ascension 278

— circular conoid 202 

— cylinder 190 

— -handed coordinate system
223, 283, 366, 530 ff. 

— prism 190 

— unique relation 323 

— value (surveying) 256 

ring 669f., 679 

Ritz method 702 

Rolle’s theorem 411 

ROM 746 

Roman number system 19 

root, primitive 670 

— (zero) of equation 101 f., 
673 f. 

— field 351 f.

— function 130f.

— of number 47, 51 ff., 68, 71, 
79 

— — polynomial 122f., 351

— — unity 79, 97, 674

— test 393 

— theorem of Vieta 102, 138 

—s, determination of (numerical
analysis) 638

rotation 152f., 285f., 372, 478, 
533f. 

— of coordinate system 285f.,
533f.

— — vector field 478

rounding 609 

— error 630 

row of matrix 360, 373 

rule, Bernoulli’s and L’Hospital’s
400f. 

— of inference 334, 337f. 

ruled surface 545, 547 

ruler and compass, construction 
151, 154, 163, 355f. 

Runge-Kutta method 637 

runner 66 

Russell’s paradox 321 


Russell’s definition of finiteness
326 
Rytz construction 180 


S 


saddle point 429, 545 

— — theorem 661 f.

— surface 569 

salinon 175 

sample 595f. 

sampling plan 606 

Sarrus’ rule 360 

sawtooth curve 498 

scalar 362f. 

— field 475 

— function 476f. 

— product in Hilbert space 707 

scale (nomography) 649f.

scatter of variation 597 

Schwarz’ inequality 708 

—, theorem 421 

score 19 

sea level 257 

search procedures 640ff. 

secant 171f., 198, 222 

— method 638 

— theorem 176 

second (unit of angle) 149 

— (unit of time) 281 

— boundary value problem 697 

— equatorial system 278 

— fundamental form of surface 
568 

secondary vertex of ellipse 305 

sector 171, 174 

—, spherical 199f.

segment (part of disc) 172, 174 

— (— — line) 147, 170, 286f.,
535f.

— (— — sphere) 198f.

— of well-ordered set 330 

self-adjoint transformation 372 

semantics of elementary 
language 335 

semicubical parabola see 
Neil’s parabola 451 

semigroup 349, 679 

semilogarithmic paper 649 

semiparameter of parabola 
183f., 304f., 314

semiregular solid 197 

separable function (optimiza- 
tion) 665 

separation axiom, Hausdorff’s
686 

— of real roots 124 

— rule (logic) 337 

— of variables (differential 
equations) 506, 693, 697 

seq function 333 ff. 

sequence 381-388 

sequence, fundamental (Cauchy) 
75, 707 


— of differences 117 

sequential strategy 641 

series 388-396, 479-500 

—, hypergeometric 513 

—, Neumann 705 

set 320ff. 

—, partially ordered 678 

— of expressions, model of 336 

— — points (topology) 680ff. 

— -theoretical foundation of 
mathematics 719f. 

— — topology 686

— theory 13, 320-332 

17-gon, regular 163 

sexagesimal system 19 

shadow prices 657, 660 

shear (descriptive geometry) 209 

shift 152, 363 ff., 533 

shortest curve 567 

side conditions 430f., 620, 701 

— elevation (descriptive 
geometry) 210f. 

— of polygon 162f. 

— — quadrilateral 159f. 

— relations in triangle 155 

— of triangle 154f., 262 ff. 

sidereal time 280 

sideways section (surveying) 258 

sieve of Eratosthenes 24 

Σ-model, structure (logic) 336

sign change 123 

signature (logic) 334f. 

signed number 28f., 73 

significant digit 609

similar matrix 376 

similarity 168 ff. 

—, theorems on 170 

simple closed curve 682f. 

— extension of field 350 

— interest 140f. 

simplex method 655f., 727f. 

simply-connected point set 683 

Simpson's rule 202, 636 

sin, sine 220ff. 

sine, hyperbolic 134 

— function 220ff., 236 


— integral 491 

— rule 244, 265 

single-value control chart 606

— -valued correspondence 107, 
110 

singular conic 302, 557 

— integral 501 f.

— matrix 374 

— point 433f., 562, 566 

singularity 521 

sinh, hyperbolic sine 134 

sinh⁻¹, inverse hyperbolic sine
135 

sink of field 477 

sinusoidal oscillation 237 

skew lines 185, 206, 213, 538f.




skew perspective affinity (descriptive
geometry) 209

Skolem’s paradox 720 

slack variable 657 

slide rule 65-69

SL(n), special linear group 344

small circle on sphere 198, 261 f. 

Snell's law of refraction 428 

solid 186ff. 

—, semiregular 197 

— angle 186, 263f. 

— geometry 184-203 

— of revolution (rotation) 201, 
467f. 

solidus 30 

solubility by radicals 102, 354f. 

soluble field 355 

solution, compromise 728

—, feasible (optimization) 655

— of algebraic equation 81 ff., 
352 ff. 

— — equations in several
variables 423f. 

— — inequality 103 ff. 

— — system of linear equa- 
tions 89f., 356ff. 

— — — — quadratic equations
103 

solving kernel 704 

Soreau equation 650 

source of field 477 

space 184ff., 530ff., 710 

—, abstract 705

—, metric 685f., 706 

—, n-dimensional 684 

—, normed 707 

—, projective 548 

—, topological 686

— -diagonal 189 

spatial polar coordinates 532 

special linear group SL(n) 344

specification of variable 40 

sphere 198 ff., 261 f., 572 

spherical astronomy 276ff. 

— cap 198 

— circle 262, 269 

— coordinates 531 f. 

— digon 198 

— excess 263, 266

— image 570 

— layer 198f. 

— lune 262 

— polar circle 262 

— — coordinates 466, 532f. 

— radius 262, 269 

— sector 199f. 

— segment 198f. 

— surface 198f., 200 

— triangle, surface area 
262-272 

— trigonometry 261-282, 566 

— wedge 198 

— zone 198f. 

spheroid 272 




spiral 417, 443 
splitting field 352 ff. 
square (figure) 160, 164 
— foot 164
— matrix 373
— metre (unit of area) 164
— mile 164
— of number 51, 68, 116
— root 52f., 62f., 68
squaring of circle 154, 356
standard deviation 585, 597,
616f.
— integrals 454
— parabola 95f., 116f.
star curve (astroid) 441
static moment 472f.
stationary process 594, 668
statistical definition of
probability 581
— estimation 600f.
— quality control 605f.
— testing procedures 601 ff.
statistics, mathematical 595-605
—, technological 605 ff.
steepest descent, method of 
643f. 
Steiner’s theorem 475 
step polygon 444f. 
stepped system of equations 
627f. 
stepping-stone method 658 
stereoscopy 220 
Stirling’s formula 626 
stochastic linear optimization 
661 
— process 594f.
— search procedure 643
Stoicheia 711
Stokes’ theorem 478
straight angle 148f.
strategy, minimax 640
—, mixed 723
—, optimal 664, 724f.
—, overall 640f.
—, pure 723
—, sequential 641
stream flow, problems of 525
streamlines 525f.
stretched ellipsoid 544
— parabola 118
strophoid 441 f.
structure 13, 680, 685f., 689f.
— constant of algebra 679
Sturm’s theorem 123f. 
subfield 349 
subgroup 344 ff. 
subsequence 385 
subset 320, 328 
subsidiary vertex 179 
subspace 368 
substitution, integration by 
457f. 
—, linear equations 89
—, power series 486


substitution of recursive function 
341 

subtraction, abbreviated 36 

— of algebraic sums 41 f. 

— — fractions 31f. 

— — integers 21f., 70 

— — rational numbers 73 

subtrahend 21 

successive approximations, 
method of 667 

successor 20, 27, 70, 341 

sum 20, 28, 31f., 40f., 75, 77 

—, derivative of 413 

—, lower 445 

—, partial 389 

—, upper 445 

— of cardinal numbers 328 

— — events 579

— formula, Euler's 625

— formulae of trigonometry 
235 

— of matrices 373 

— — probabilities 579

— — series 389, 394, 485 

— — vectors 363 

summand 20 

summation sign 388f. 

superposition coefficient 
(interpolation) 633 

supplement, quadratic 93

supplementary angles 149f. 

— parallelograms 168 

supply capacity 658 

support plane (optimization) 
655 

— of relation 323 

supremum of class of ordinals
331

— — function 445

surface, algebraic 676 

—, developable 546 

—, first fundamental form 566f. 

—, genus of 570

—, ruled 545 

—, second fundamental form 
568 

—, topological properties of 570 

— area 186, 469f. 

— in Euclidean space 565-572

— integral 463, 471 f.

— of revolution (rotation)
201, 570f.

— — second order (quadric)
543 ff. 

— — sphere 198, 200 

surjective mapping 325 

surveying 255 ff.

symbol, computational 28

—, Legendre’s 671

— in language 334

symmetric function 138f. 

— group 344, 354 

symmetric matrix 377
— relation 323


— transformation 372

symmetry 152f. 

syntax of elementary language 
334 

system of differential equations 
516f. 

— — linear equations 89f., 
357 ff., 644 ff. 
— — quadratic equations 103

— — Roman numbers 19
— — sets 321f.

systematic error 595, 613 


T 


table of signs of trigonometric 
functions 223 

tables, mathematical 51 f., 60f., 
230f., 630 

table of χ²-distribution 605

— — F-distribution 604

— — normal distribution 591

— — t-distribution 603
tacheometric levelling 253 ff. 
tacnode 434 
tally stick 18 
tan, tangent function 222f. 
tangent, tan, trigonometric function 222f.
— to circle 171 ff., 2701
— — conic 309f.
— — curve 562f.
— — ellipse 181, 209, 309f.
— formula 245
— to hyperbola 182, 309f.
— — parabola 183, 309f.
tangent plane to sphere 198, 211
— — — surface 565
— to plane curve 407
— problem 406, 444
— to space curve 562f.
tanh, hyperbolic tangent 134
tanh⁻¹, inverse hyperbolic tangent 135
Tarski's theorem 721 
Taylor interpolation 633 
— series 488 ff., 495 
t-distribution 602, 603 (table)
technological statistics 605f.
telegraph equation 696
10-gon, regular 163
tensor 380
term, absolute 86, 92, 97
— (logic) 334
testing procedures 601 ff.
tests for divisibility 25f. 
tetragon, regular 163 
tetrahedron 193, 196, 202, 244 
Thales’ theorem 158, 166f. 
theodolite 253 
theorema egregium 569 
— fundamentale 671 
theory, formalized (logic) 339 ff. 


third boundary value problem 
697 

thread construction 179, 182, 
183 

three primes theorem 673 

Thue-Siegel-Roth theorem 675 

time 280 ff. 

time zones 281

topological circle 689 

— group 348, 680 

— properties of surface 570

— structure 685f. 

topology 680-686 

torsion of curve 564f. 

torus 201, 569, 570 

total buffer time 692 

— differential 421 f. 

— order 72, 324, 329f. 

— probability 581 

tower of fields 355 

TP, trigonometric point 256f., 
Plate 50 

trace (descriptive geometry) 
206, 212 

tractrix 436, 441 

trajectories, orthogonal 437, 504 

transcendental equation 83 

— number 675 

transfinite cardinal numbers 327 

— induction 331 

transformation, affine 534f., 572 

—, inverse 284f., 372, 376ff. 

—, linear 370ff., 376ff.

— of areas 167f.

— — coordinates 284f., 377,
533 f.

— — multiple integral 465f.

— to principal axes 286, 203f., 
379f. 

— by reciprocal radii 527 

transitive relation 323 

translation 152, 285, 363, 533 

— surface 545 

transport problem 658 

transpose of matrix 374 

transversal of triangle 157f., 
298 f. 

transverse Mercator projection 
(surveying) 255 

trapezium 160f., 165f., 452 

trapezoid 160 

trapezoidal rule for integration 
636 




tree 689 
triangle, nautical 278, 282 
—, Pascal’s 43 


—, plane 154-159, 165ff., 248f.,
295 ff.

—, spherical 262-272

— inequality 106, 368f., 685, 
706 

triangular curve 498 

— framework (surveying) 256f. 


— impulse 498

— pyramid 194 

triangulation 256 

trigonometric equation 237 ff. 

— function 133, 220-237, 405,
418, 489

— —, continuity of 405

— —, derivative of 418

— —, power series of 489

— point (TP) 256f., Plate 50

— series 496 ff.

— solution of cubic equation 
98 f. 

trigonometry 220ff., 241 ff., 
261 ff., 566 

trihedron, moving 214f. 

—, Pohlke’s 214f. 

trilateration 257 

trillion 19 

trimetry (descriptive geometry) 
215 

trinomial 40 

triple integral 465 

— point 434 

trisection of angle 154, 356 

tropical year 280

true measure from normal pro- 
jection 207f. 

— shape of a plane figure 208 

truncation 609 

truth, concept 720f. 

— function 333 ff. 

— of statement 721 

tubular surface 201 

two circle construction of 
ellipse 179, 210 

— -dimensional integral 463f. 

— -plane method 205 ff. 

— -point equation of line 288, 
537 

— -sheet hyperboloid 547 


U 


umbilic 201, 572 

unbiassed estimate 600 

unconditional probability 580 

undecidable 342 

undetermined coefficients, 
method of 130, 460, 486 

— multipliers 701

undirected graph 688 

unequal precision 618 

uniform continuity 403 

— convergence 480f. 

uniformization 527 

union of sets 322f. 

unit circle 150, 223 

— element of group 344 

— of length 147

— matrix 374f. 

— segment 147, 283f. 

— theorem, Dirichlet’s 674 




— vector 363f. 

unity, roots of 674 

universal covering surface 527 
Universal Time 281 
universally valid 82, 104, 336 
universe, Cantor's 718 

upper bound 610 

— sum 445 


V


valency 334 

valid digit 36, 609 

— expression 336 

validity of statement 721 

value, approximate 608 

— of function 108 

vanishing plane 204 

— point 204, 217 

variable 40ff., 81 ff., 109, 333 

variables, separation of (differ- 
ential equation) 506, 693, 
697 

variance 585, 614 

— of binomial distribution 
587f. 

— — normal distribution 590 

— — Poisson distribution 589 

— — sample 597 

variation of parameter 507

— problems with side condi- 
tions 701 

— (statistics) 597 

—s, calculus of 698-702 

variety, algebraic 675f. 

vector, feasible (optimization)
655

— algebra 362 ff.

— analysis 475-479 

— equation of curve 561 f. 

— field 475 ff. 

— space 362-369

vel function 333 ff. 

velocity field 477 

vernal equinox 278f. 

vertex of ellipse 179f., 305f. 

— equation of conic 314f.

vertex of feasible region 655

— — hyperbola 181 f., 307f.

— — parabola 183, 304f.

— — projection 203f.

— — quadric 545f.

— — solid 186

— — triangle 154 

vertical (astronomy) 277 

vertically opposite angles 150 

Vieta’s root theorem 102, 138 

volume integral 463 

— of solid 186ff., 467f. 

— — sphere 199

vulgar fraction 31 




W


Wallis’ product formula 456 

Waring’s problem 673 

wave equation 696 

Weber-Fechner law 59 

webs, geometry of 572 

wedge 203, Plate 22 

—, spherical 198 

Weierstrass, theorem on con- 
tinuous functions 406 

— majorant criterion 480f. 

— normal form of elliptic 
integral 529 

— ℘-function 528f.


weight of measurement 616 
weighted mean 619 
well-ordered set 329f. 
well-ordering theorem 331 
winding number 521 

Wolfe, method of 663 

word length 632 

work integral 472 

Wronskian determinant 509, 511 


Y 


yard (unit of length) 147 
year, tropical 280 


Z 


Z, integer 27, 73f. 

zenith 276f. 

— distance 277f. 

zero 18, 23, 27 

—, generic 677 

— direction in polar system 284 

— matrix 373 

— point of coordinate system 
283f. 

— polynomial 350 

— of polynomial 121f. 

— — rational function 126f. 

— vector 364 

zeta-function, Riemann’s 673 

zone, spherical 198f. 

Zoutendijk’s method 663 

Zhukovskii profile 526


CELEBRAZIONI ARCHIMEDEE 


DEL SECOLO XX


11-16 APRILE 1961 


1 Archimedes Poster of the town of Syracuse (Italy) on the occasion of the commemorating 
conference in honour of the ancient Greek mathematician Archimedes, which was held from 
11 to 16 April 1961 and was attended by mathematicians, physicists, and engineers from all 
over the world 



2 Mathematics in school I Introduction of the number seven


4 Mathematics in industrial arts Surfaces of revolution in the design of a pottery set 


5 Drawing instruments I Geometry sets




6 Drawing instruments II Slide rules 




7 Drawing instruments III Rulers, protractors, and french curves


8 Graph papers 1 Millimetre paper 2 Doubly logarithmic paper
3 Simply logarithmic paper 4 Polar coordinate paper
5 Triangular net paper 6 Probability paper


above and right 
Clay vessels of the new stone age with 
mathematical ornamentations 


9 From the earliest period of mathematics 


below 
Early Egyptian surveying (about 1415 B. C.; 
painting on plaster) 




Original text of the Hau problem in Demotic 
writing 

centre 

Transcription of the same text into hieroglyphics

Calculation of a frustum of a pyramid 




10 Ancient Egyptian mathematics 
(The Moscow papyrus) 




Cuneiform tablet with calculations of areas 


11 Babylonian mathematics 


Section of the tablet above 



The Elements of Euclid, 
first printed edition 1482 


12 Graeco-Roman mathematics 


Roman hand abacus 


13 Ancient Chinese 
mathematics 



From a manuscript dated 1303, the triangle of numbers later named after Pascal


Bamboo digits: Units, Tens, Hundreds like Units etc.

Chinese slide rule (about 1600)


Mathematical-astronomical buildings of the 17th century for the determination of 
time, declination, and hour angle (near Delhi) 


14 Ancient Hindu mathematics 


Mathematical manuscript of the 16th century (copy of a manuscript of Bhaskara) 


15 Arabic mathematics 


Theorem of Pythagoras in an Arabic mathematical manuscript of the 14th century


Arabic astrolabe for the measurement of heights and for
astronomical calculations, beginning of the 15th century


16 Mathematics in Europe, 15th to 17th century
Triumph of the modern algorithm (digital calculation) over the ancient counter reckoning (abacus). Contemporary illustration dated 1504. Pythagoras, reputed to be the inventor of the abacus, sits sulkily on the right over his calculations, while on the left Boethius, regarded as the inventor of written calculations with Arabic numerals, is already finished. The goddess Arithmetica in the background supervises the contest


The use of Jacob’s staff. 
Contemporary illustration of 
the 16th century 


Ancient Egyptian mural: catching fish and hunting birds in a papyrus thicket 


Projections in plan of characteristic cross-sections, see the position of 
head and shoulders of the principal persons 


17 Mathematics and the 
visual arts I 


Examples of the 
representation of spatial objects 
in a plane 


Painting by Melozzo da Forlì (1438-1494): Pope Sixtus IV appoints Platina as Prefect of the Vatican (Rome, Museum of the Vatican). The discovery of the vanishing point at the beginning of the 15th century marks a turning point in the history of painting




Drawing by Leonardo da Vinci 


18 Mathematics and the visual arts II
Proportions of the human body 


Scheme of proportions. Sketch by
Albrecht Dürer




19 Mathematics and the visual arts III 


Melancholia, copper engraving by Albrecht Diirer, with magic square; 
made in 1514 (see last row of square) 


Egyptian pyramids near Giza (right square
pyramid)


20 Geometric forms in architecture and techno- 
logy I 


Tower of city walls (right circular cylinder and
cone)


The old town hall of Leipzig, 16th century. The 
tower divides the facade in the ratio of the 
golden section 




21 Geometric forms in architecture 
and technology II 




Modern water tower; its conical shape presents an unusual sight


Cooling towers of a generating plant (hyperboloids of revolution)




Obelisk in the great temple of Amun at Karnak (Ancient Egypt)

Wedge as a cleaving tool


22 Geometric forms in architecture and technology III 


Hyperbolic paraboloid shells as roofs of an exhibition hall 


23 Famous mathematicians of the 
15th/16th century 


1 Johannes Regiomontanus (1436 to 
1476) 

2 Simon Stevin (1548-1620) 

3 Albrecht Dürer (1471-1528)
(detail of his self-portrait) 

4 Niccolo Tartaglia (about 1500-1557) 

5 Geronimo Cardano (1501-1576) 

6 Jost Bürgi (1552-1632)

7 Luca Pacioli (1445-1514) 
(after a painting by Jacopo 
de’Barbari) 




24 




right 
Title page of Robert Recorde’s ‘Algebra’ of 1557 
(by courtesy of the Trustees of the British Mu- 
seum) 


top left 

Title page of Adam Ries’s ‘Rechnung auff der 
Linihen und Feder ...’ of 1550. The words refer 
to calculations by means of the abacus and on 
paper 


Famous mathematicians of the 16th century 


below 
A problem out of this book concerning the 
purchase of livestock 




Conclusion of a business deal at a calculating desk on which lines and a distribution of coins are marked (old woodcut)


25 From old arithmetic books 


Calculation of the capacity of a cask; title page of a booklet by Johann Frey printed in Nuremberg in 1531


The mathematics room of the National and 
University library in Prague 


26 Two libraries 


Entrance to the Science library of Erfurt 
(Boyneburg portal) 


27 Old mathematical aids I


Pedometer, 1741 


Slit bamboo as counting
stick (from Sumatra)


below 

Tally stick. On conclusion of a business deal both parts of the tally stick were marked at the appropriate notches. Each partner received one half as legal evidence




28 Old mathematical aids II 
Counters or markers for arithmetic (not money) and an elaborate box, 16th century 


Surveyor’s compass, for the measurement and graphical determination of distances and
angles in the field, about 1600


29 Old mathematical aids III 


Illustration of a rod, by juxtaposition of 16 feet. The 19th century English rod
(also perch or pole) is 16½ feet long. From Jacob Köbel’s ‘Geometrie’, Frankfurt 1616


30 Old measures I 


16th century measuring rods with various graduations 


ivory 




Hinged sun dial 


31 Old measures II 


Set of weights, Nuremberg 1588




René Descartes (1596-1650) 


32 Famous mathematicians of the 
17th century I 


Title page of Descartes’ ‘Discours de la méthode’ whose third part ‘La géométrie’ contains the foundations of analytic geometry


33 Famous mathematicians of the 17th century II 


1 François Vieta (Viète; 1540-1603)
2 John Napier (Neper; 1550-1617) 

3 Galileo Galilei (1564-1642) 

4 Johannes Kepler (1571-1630) 

5 Bonaventura Cavalieri (1598-1647) 
6 Pierre de Fermat (1601-1665) 

7 James Gregory (1638-1675) 


top left 
Gottfried Wilhelm Leibniz (1646-1716) 


above 
Blaise Pascal (1623-1662) 


34 Famous mathematicians of the 17th/18th century I 


The infinitesimal calculus was invented
independently by Leibniz and Newton


Isaac Newton (1643-1727) 




Extract from a manuscript of Leibniz dated 29 October 1675, in which the integral sign
appears for the first time


35 Famous mathematicians of the 17th/18th century II 


The mechanical calculator constructed by Pascal in 1642 


top left 
Jakob Bernoulli (1654-1705) 


above 
Johann Bernoulli (1667-1748) 


36 Famous mathematicians of the 17th/18th
century III


The Bernoulli family of Swiss scholars
produced eight eminent mathematicians
within three generations. The three most
important are Jakob, his brother Johann,
and Daniel, the son of the latter


Daniel Bernoulli (1700-1782) 




Page from a manuscript by Euler 


37 Famous mathematicians of the 18th century I


Leonhard Euler (1707-1783) 


38 Famous mathematicians of the 18th century II 


1 Brook Taylor (1685-1731) 

2 Moreau Maupertuis (1698-1759) 

3 Johann Heinrich Lambert (1728-1777) 

4 Joseph Louis Lagrange (1736-1813) 

5 Gaspard Monge (1746-1818) 

6 Adrien Marie Legendre (1752-1833) 

7 Jean Baptiste Joseph de Fourier (1768-1830) 




Drawing by
János Bolyai
on non-Euclidean
geometry (1820)


39 Famous mathematicians of the 19th 
century I 


Nikolai Ivanovich Lobachevskii 
(1792-1856) 

He developed independently and
almost simultaneously a
non-Euclidean (hyperbolic) geometry


Gauss’s signature 






left Portrait of the young Gauss 
right Gauss in his old age 


40 Famous mathematicians of the 19th century II 
One of the greatest mathematicians of all times was
Carl Friedrich Gauss, born 30 April 1777 in Braunschweig,
died 23 February 1855 in Göttingen


The University in Göttingen, where Gauss worked
for nearly fifty years




41 Famous mathematicians of the 19th century III 
A page from Gauss’s scientific diary (30 March to 24 May 1796) 


42 Famous mathematicians of the 19th century IV 


1 Friedrich Wilhelm Bessel (1784-1846) 

2 Augustin Louis Cauchy (1789-1857) 

3 Jakob Steiner (1796-1863) 

4 Niels Henrik Abel (1802-1829) 

5 Peter Gustav Lejeune Dirichlet (1805-1859) 
6 Evariste Galois (1811-1832) 

7 Pafnuti Lvovich Chebyshev (1821-1894) 




43 Famous mathematicians of the 19th century V 


1 Carl Gustav Jacob Jacobi (1804-1851) 
2 Bernhard Riemann (1826-1866) 

3 Leopold Kronecker (1823-1891) 

4 Karl Weierstraß (1815-1897)

5 Arthur Cayley (1821-1895) 

6 Sophus Lie (1842-1899) 

7 Sonya Kovalevski (1850-1891) 


Instrument for drawing an integral curve of a given function or differential equation 
44 Mathematical instruments I 


Instrument to evaluate the integral of a function whose graph is given


moving arm 


Compensating polar planimeter with polar arm, for the measurement of areas in maps and plans;
the moving point is equipped with a magnifying glass


45 Mathematical instruments II 


Compensating polar planimeter with polar carriage for the evaluation of strip diagrams 


46 Mathematical instruments III


Instrument for measurement of rectangular coordinates or the drawing of points with given 
coordinates 




47 Mathematical instruments IV 


Harmonic analyser, to determine 
the Fourier expansion of a periodic 
function 


Instrument to determine the tangent 
or normal to a curve whose graph 
is given 


48 Famous mathematicians of the 19th/20th
century I


1 George Stokes (1819-1903)

2 Richard Dedekind (1831-1916)
3 Georg Frobenius (1849-1917)
4 Georg Cantor (1845-1918)

5 Henri Poincaré (1854-1912)

6 Felix Klein (1849-1925)

7 Emmy Noether (1882-1935)


49 Famous mathematicians of the 19th/20th century II


1 David Hilbert (1862-1943)

2 Élie Cartan (1869-1951)

3 Henri Léon Lebesgue (1875-1941)

4 John von Neumann (1903-1957)

5 Hermann Weyl (1885-1955)

6 Jacques Salomon Hadamard (1865-1963)
7 Stefan Banach (1892-1945)




top and top right 
Signal for the observation of trigonometric nets 


50 Surveying 


Trigonometric point (TP) 


51 Mathematical education I 
Practical work of school children 


Work on a wall board 


Determination of an angle with a 
hand-made apparatus 


Giant slide rule for instructional 
purposes 


Geometrical constructions on the blackboard 


52 Mathematical education II


Computations on part of an exhaust system 


Application of Pythagoras’ theorem 




53 Mathematical education III 
Models for pupils 


1 Cube with surface and space diagonals 

2 Prism decomposable into three pyramids 
of equal volume 

3 Cylinder with sections 

4 Sphere with plane sections 

5 Sections of a right circular cone 


(All models are made of plastic) 


Negative and positive of a photograph 


54 Mirror images 


Reflection in water (gas holder)

Ship’s Diesel engine in a left- and right-hand version


55 Variational problems 


Formation of a minimal surface in 
a lobster pot 




Formation of a minimal surface by 
a soap film 


The path of the light ray from A to
B is the solution of a minimal
problem


56 Mathematical models 


Moebius strip, a one-sided surface 


A closed surface of genus 1, like the
torus


bottom left 
Pseudosphere, the simplest surface 
of constant negative curvature 


bottom 

Surface representing the modulus
of the function w = exp (1/z)
near the point z₀ = 0


Mathematical symbols 


Symbol                    Explanation

~                         similar
≅                         congruent
△                         triangle; e.g. △ABC
∥                         parallel
∦                         not parallel
⊥                         perpendicular, at right angles
∢                         angle; e.g. ∢ABC
°                         degree; e.g. 90° is a right angle
′                         minute, 60′ = 1°
″                         second, 60″ = 1′
ᵍ                         gon; e.g. 100 gon is a right angle
$\overset{\frown}{AB}$    arc AB
arc α                     arc α
$\overline{AB}$           segment AB
$\overrightarrow{AB}$     directed segment, from A to B
sin                       sine
cos                       cosine
tan                       tangent
cot                       cotangent
arcsin = sin⁻¹ etc.       inverse sine etc.
sinh etc.                 hyperbolic sine etc.


Arithmetic, algebra 


=                         equal
≡                         identical
≙                         corresponds to; e.g. 100ᵍ ≙ 90°
≠                         unequal
<                         smaller than; e.g. a < b
>                         greater than; e.g. b > a
≤                         smaller than or equal to; e.g. a ≤ 0, not positive
≥                         greater than or equal to; e.g. a ≥ 0, not negative
+                         plus (sign, operational symbol)
−                         minus (sign, operational symbol)
·, ×                      times; e.g. 3 · 4, 3 × 4
:, /, fraction bar        divided by; e.g. 3 : 4, 3/4, $\frac{3}{4}$
|                         a divides b, e.g. 3 | 12
$\sum$                    sum; e.g. $\sum_{i=1}^{3} a_i = a_1 + a_2 + a_3$
$\prod$                   product; e.g. $\prod_{i=1}^{3} a_i = a_1 \cdot a_2 \cdot a_3$
$a^n$                     a to the nth; e.g. $a^3 = a \cdot a \cdot a$
$\sqrt{a}$                square root of a
$\sqrt[n]{a}$             nth root of a
n!                        n factorial; e.g. 3! = 1 · 2 · 3
$\binom{n}{k}$            n over k; e.g. $\binom{6}{3} = \frac{6 \cdot 5 \cdot 4}{1 \cdot 2 \cdot 3}$
|a|                       modulus or absolute value of a; e.g. |−7| = 7
$\log_b$                  logarithm to the base b
lg                        decadic or common logarithm, base 10
ln                        natural logarithm, base e = 2.718...
i                         imaginary unit, i² = −1
a · b, (a, b)             scalar product of two vectors
a × b, [a, b]             vector product of two vectors
$(a_{ik}) = A$            matrix with the elements $a_{ik}$
$|a_{ik}| = \det A$       determinant of a square matrix A
≡ (mod m)                 congruent mod m, e.g. 12 ≡ 7 (mod 5)


Analysis 


(a, b)                    open interval a < x < b
[a, b]                    closed interval a ≤ x ≤ b
∞                         infinity
π                         pi (= 3.14159...)
→                         tends to, converges to
lim                       limit
≈                         approximately equal
d                         symbol of differentiation
$\frac{dy}{dx}$           dy by dx; differential quotient, derivative
$\frac{d^n y}{dx^n}$      nth derivative
∂                         symbol of partial differentiation
∇                         nabla operator
Δ                         Laplace operator
δf                        delta f, variation of f
$\int f(x)\,dx$           indefinite integral
$\int_a^b f(x)\,dx$       definite integral


Set theory 


∈                         element of, member of; e.g. a ∈ {a, b}
∉                         not element of; e.g. c ∉ {a, b}
⊆                         subset of, contained in
⊂                         proper subset, e.g. {a} ⊂ {a, b}
∪                         union, e.g. {a} ∪ {b} = {a, b}
∩                         intersection, e.g. {a, b} ∩ {b, c} = {b}
∅                         empty set

¬                         not (negation)
∧                         and (conjunction)
∨                         or (disjunction)
→                         if, then (implication)
↔                         if and only if (equivalence)
∃                         there exists (existential quantifier)
∀                         for all (universal quantifier)
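
Several of the worked examples in the table above can be checked mechanically. The following is a minimal sketch in Python, not part of the original table; the helper names `divides` and `congruent` and the sample values chosen for a_1, a_2, a_3 are assumptions made purely for illustration.

```python
import math


def divides(a: int, b: int) -> bool:
    """Return True if a | b, i.e. b is an integer multiple of a (a must be nonzero)."""
    return b % a == 0


def congruent(a: int, b: int, m: int) -> bool:
    """Return True if a ≡ b (mod m), i.e. m divides the difference a - b."""
    return (a - b) % m == 0


# Hypothetical sample values standing in for a_1, a_2, a_3.
a = [2, 3, 5]

results = {
    "3!": math.factorial(3),                # 1 * 2 * 3
    "6 over 3": math.comb(6, 3),            # (6 * 5 * 4) / (1 * 2 * 3)
    "3 | 12": divides(3, 12),
    "12 ≡ 7 (mod 5)": congruent(12, 7, 5),
    "sum a_i": sum(a),                      # a_1 + a_2 + a_3
    "product a_i": math.prod(a),            # a_1 * a_2 * a_3
    "|-7|": abs(-7),
}

for name, value in results.items():
    print(f"{name}: {value}")
```

`math.comb` and `math.prod` require Python 3.8 or later; on older versions the binomial coefficient can be computed from factorials instead.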


